Mounting AWS EFS volumes inside Docker Containers

Amazon announced the development of the Amazon Elastic File System (AWS EFS) in 2015. EFS was designed to provide multiple EC2 instances with shared, low-latency access to a fully-managed file system. On June 28, 2016 Amazon announced that EFS is now available for production use in the US East (Northern Virginia), US West (Oregon), and Europe (Ireland) Regions.

Apcera‘s NFS Service Gateway can be used to access AWS EFS storage volumes within containers. You can use EFS to provide persistent storage to your containers running on AWS-hosted clouds in regions where EFS is available.

Gathering information

Before you begin you will need to know:

  • The name of the AWS Region where your Apcera Platform is running
  • The name/ID of the AWS VPC where your Apcera Platform is running
  • The name/ID of the AWS security group for your Apcera Platform

Setting up an EFS volume

  1. Log into your AWS console.
  2. Select the name of the AWS Region where your Apcera Platform is running on the upper right side of the screen.
  3. Select Elastic File System.
  4. Click Create File System.
  5. Configure the file system access:
    • Select the name of the VPC.
    • The availability zone and subnet should be selected for you automatically.
    • If your VPC has more than one subnet (unusual) then select the subnet containing the Instance Managers that will be connecting to the EFS volume.
    • Leave IP address set to Automatic.
    • The first EFS volume you create will create a new security group. Use that security group for this and all future EFS volumes. Write down the name of the new EFS security group – we’ll configure it in the next few steps.
    • Click Next Step
  6. Configure optional settings:
    • Set the name of the EFS volume.
    • Choose the performance mode.
    • Click Next Step
      mounting-aws-with-esf-inside-docker
  7. Review and create:
  8. If everything looks OK, click Create File System.
    mounting-aws-with-esf-inside-docker
  9. You should see a “Success!” message and a new EFS volume with “Life Cycle State” = “Creating”.
  10. Write down the IP address of the EFS volume.

Update the EFS security group

  • Go back to the main console menu and select EC2.
  • Click Security Groups in the left hand nav menu.
  • Type the name of the new EFS security group into the search filter list.
  • On the bottom half of the screen delete the default inbound and outbound rules.
  • Add one inbound rule to allow all TCP traffic on port 2049 from the source “name/ID of the AWS security group for your Apcera Platform”
  • Add one outbound rule to allow all TCP traffic on port 2049 to the destination “name/ID of the AWS security group for your Apcera Platform”
  • This allows all VMs within your Apcera Platform security group to connect to your EFS volume on port 2049 (NFS).
  • No other traffic from any other source or to any other destination is allowed.|

Create an NFS Provider for the EFS volume

We’re going to create a single provider for the EFS volume. Each time you have a container or set of containers that need a persistent file system, just create a new service from the same provider. Each new service will carve out a new namespace on the EFS volume, keeping the files associated with that service separate from the files in all other services that use the same provider.

According to the EFS FAQ, When you create a file system, you create endpoints in your VPC called “mount targets.” Each mount target provides an IP address and a DNS name, and you use this IP address or DNS name in your mount command. Only resources that can access a mount target can access your file system. Since the Apcera Platform isn’t using Amazon DNS services internally, we’ll use the IP address to connect to the EFS volume.

To create the provider, you need to construct a URL describing the volume. In this case, we’ll use the internal IP address of the EFS volume as the hostname and / as the exported volume name. All EFS volumes use the NFS v4.1 protocol. If the IP address of the EFS volume is 10.0.0.112 we’d construct a provider using:

apc provider register awsefs --type nfs \
    --url "nfs://10.0.0.112/" \
    --description 'Amazon EFS' \
    --batch \
    -- --version 4.1

Create a service from the provider:

apc service create efs-service-1 \
    --provider awsefs \
    --description 'Amazon EFS Service' \
    --batch

Create a capsule, bind the service to the capsule, and connect to the capsule:

apc capsule create efs-capsule1 --image linux -ae --batch
apc service bind efs-service-1 --job efs-capsule1 \
    --batch -- --mountpath /an/unlimited/supply
apc capsule connect efs-capsule1

Once connected, type df -k to see the mounted file system.

You can bind this service to any container that needs a shared, persistent file system. Each time you need a new shared, persistent file system for a container or group of containers just create a new service using the same provider and bind the service to your job or jobs.

Persistence for Docker

Now that we have a provider that can carve out EFS storage for containers, let’s try spinning up some Docker images.

On the Apcera Platform, if the specification for a Docker image (Dockerfile) specifies that the app requires persisted volumes, you must do one of the following when creating the job:

  • Include the –provider flag when you create or run the Docker job. You must include this flag if you include the –volume flag when creating or running the Docker job.
  • Include the –ignore-volumes flag when you create or run the Docker job.

Here is an example of running NGINX inside a Docker container on the Apcera platform, where the content for the site is stored on an EFS volume:

I’m using the Apcera “apc” command-line tool to build the container, pulling the nginx image directly off hub.docker.com, telling it to use the awsefs EFS volume provider I created earlier for persistence, and to mount the EFS volume at the mount point “/usr/share/nginx/html”.

Now connect to the container:

/proc/mounts contains a list of all of the container’s mount points. I can verify that the container does indeed have an EFS volume by grepping /proc/mounts for the mount point:

Grepping for “/usr/share/nginx/html” shows the IP address 10.0.0.112, which is the IP of the EFS volume, the log directory name after is the unique namespace for the service, the mountpoint is “/usr/share/nginx/html”, and the mount type “nfs4”.

There is no content in the directory, so I add some by echoing some HTML code to an index.html file. My container will proclaim to the world “NGINX in a Docker container on Apcera with content stored on EFS” in an H3 typeface!

Now that I have some content I need to add a route to the content. Right now the NGINX container is running, and listening on ports 80 and 443, but it’s completely isolated from the outside world — no one can connect to those ports unless there’s a route (a URL) set up.

My cluster is running on the domain earlruby.apcera-platform.io, so I add a route like so:

I have successfully added the http route http://nginx.earlruby.apcera-platform.io/ to my NGINX container. This is a real public DNS entry. To verify that it works I point my browser at the route I just added:

Success!

Such an amazing app is bound to go viral, and a single NGINX container may not be able to keep up with the load. I want to ensure that my app can keep up and remain highly-available, and that it keeps running even if one or more VMs in my cluster get killed off, so I add more NGINX containers:

Now I’ve got 20 containers running my NGINX app, all serving up the same content, running on multiple VMs across my cluster, all load-balanced under the single URL http://nginx.earlruby.apcera-platform.io/. If any container gets killed off, the Apcera platform will spin up a new one. If any VM in the cluster dies, any containers running on it will automatically be migrated to new hosts. If I want to scale up the app to 100 or 1000 containers, or back down to 1, it’s a one-line command to make the change.

In terms of resources, I’m using slightly less than 45 MiB to run those 20 containers. That’s not a typo — 45MiB! Containers are much more efficient users of RAM than VMs.

I hope you find this useful.

This article originally appeared as an Apcera blog post on July 21, 2016.

Policy-based Cloud Storage

This is a talk I gave last week at the SF Microservices Meetup titled Policy-based Cloud Storage, Persisting Data in a Multi-Site, Multi-Cloud World. In it I cover Apcera‘s approach to storage for containers and how to use policy to manage very large scale application deployments.

Adding a LUKS-encrypted iSCSI volume to Synology DS414 NAS and Ubuntu 15.04

I have an Ubuntu 15.04 “Vivid” workstation already set up with LUKS full disk encryption, and I have a Synology DS414 NAS with 12TB raw storage on my home network. I wanted to add a disk volume on the Synology DS414 that I could mount on the Ubuntu server, but NFS doesn’t support “at rest” encrypted file systems, and using EncFS over NFS seemed like the wrong way to go about it, so I decided to try setting up an iSCSI volume and encrypting it with LUKS. Using this type of setup, all data is encrypted both “on the wire” and “at rest”.

Log into the Synology Admin Panel and select Main Menu > Storage Manager:

  • Add an iSCSI LUN
    • Set Thin Provisioning = No
    • Advanced LUN Features = No
    • Make the volume as big as you need
  • Add an iSCSI Target
    • Use CHAP authentication
    • Write down the login name and password you choose

On your Ubuntu box switch over to a root prompt:

sudo /bin/bash

Install the open-iscsi drivers. (Since I’m already running LUKS on my Ubuntu box I don’t need to install LUKS.)

apt-get install open-iscsi

Edit the conf file

vi /etc/iscsi/iscsid.conf

Edit these lines:

node.startup = automatic
node.session.auth.username = [CHAP user name on Synology box]
node.session.auth.password = [CHAP password on Synology box]

Restart the open-iscsi service:

service open-iscsi restart
service open-iscsi status

Start open-iscsi at boot time:

systemctl enable open-iscsi

Now find the name of the iSCSI target on the Synology box:

iscsiadm -m discovery -t st -p $SYNOLOGY_IP
iscsiadm -m node

The target name should look something like “iqn.2000-01.com.synology:boxname.target-1.62332311”

Still on the Ubuntu workstation, log into the iSCSI target:

iscsiadm -m node --targetname "$TARGET_NAME" --portal "$SYNOLOGY_IP:3260" --login

Look for new devices:

fdisk -l

At this point fdisk should show you a new block device which is the iSCSI disk volume on the Synology box. In my case it was /dev/sdd.

Partition the device. I made one big /dev/sdd1 partition, type 8e (Linux LVM):

fdisk /dev/sdd

Set up the device as a LUKS-encrypted device:

cryptsetup --verbose --verify-passphrase luksFormat /dev/sdd1

Open the LUKS volume:

cryptsetup luksOpen /dev/sdd1 backupiscsi

Create a physical volume from the LUKS volume:

pvcreate /dev/mapper/backupiscsi

Add that to a new volume group:

vgcreate ibackup /dev/mapper/backupiscsi

Create a logical volume within the volume group:

lvcreate -L 1800GB -n backupvol /dev/ibackup

Put a file system on the logical volume:

mkfs.ext4 /dev/ibackup/backupvol

Add the logical volume to /etc/fstab to mount it on startup:

# Synology iSCSI target LUN-1
/dev/ibackup/backupvol /mnt/backup ext4 defaults,nofail,nobootwait 0 6

Get the UUID of the iSCSI drive:

ls -l /dev/disk/by-uuid | grep sdd1

Add the UUID to /etc/crypttab to be automatically prompted for the decrypt passphrase when you boot up Ubuntu:

backupiscsi UUID=693568ca-9334-4c19-8b01-881f2247ae0d none luks

If you found this interesting, you might want to check out my article Adding an external encrypted drive with LVM to Ubuntu Linux.

Hope you found this useful.

2014 HPCwire Awards

The StratoStor project I’ve been working on for the past 10 months just got a “Top 5 New Products or Technologies to Watch” award from HPCwire announced at this week’s SuperComputing 2014 (SC14) conference in New Orleans.

HPC = High Performance Computing, HPCwire is a news bureau for all things regarding High Performance Computing, and SC14 is where every major vendor of HPC equipment and products shows off their wares, so getting this bit of recognition from the readers of HPCwire is really nice.

So THANK YOU HPCwire readers, for this award.

https://www.hpcwire.com/2014-hpcwire-readers-choice-awards/23/

2014 HPCwire Awards

Validating Distributed Application Workloads

This is the talk I gave at RICON this year on Validating Distributed Application Workloads. It’s about how we set up test environments at Seagate for validating storage system performance at the petabyte scale. This talk centers around the testing done to validate performance of a 2PB rack running Riak CS.