post

Allow ping from specific subnets to AWS EC2 instances using Terraform

If you’re using Terraform to set up EC2 instances on AWS you may be a little confused about how to allow ping through the AWS VPC firewall, especially if you want to limit ping so that it only works from specific IPs or subnets.

To do this just add a Terraform ingress security group rule to the aws_security_group:

ingress {
  cidr_blocks = ["1.2.3.4/32"]
  from_port   = 8
  to_port     = 0
  protocol    = "icmp"
  description = "Allow ping from 1.2.3.4"
}

The above rule will only allow ping from the single IPv4 address “1.2.3.4”. You can use the cidr_blocks setting to allow ping from any set of IPv4 IP address and subnets that you wish. If you want to allow IPv6 addresses use the ipv6_cidr_blocks setting:

ingress {
  cidr_blocks       = ["1.2.3.4/32"]
  ipv6_cidr_blocks  = [aws_vpc.example.ipv6_cidr_block]
  from_port         = 8
  to_port           = 0
  protocol          = "icmp"
  description       = "Allow ping from 1.2.3.4 and the example.ipv6_cidr_block"
}

Right about now you should be scratching your head and asking why a port range is specified from port 8 to port 0? Isn’t that backwards? Also, this is ICMP, so why are we specifying port ranges at all?

Well, for ICMP security group rules Terraform uses the from_port field to define the ICMP message type, and “ping” is an ICMP “echo request” type 8 message.

So why is to_port = 0? Since ICMP is a network-layer protocol there is no TCP or UDP port number associated with ICMP packets as these numbers are associated with the transport layer, which is above the network layer. So you might think it’s set to 0 because it’s a “don’t care” setting, but that is not the case.

It’s actually set to 0 because Terraform (and AWS) use the to_port field to define the ICMP code of the ICMP packet being allowed through the firewall, and “ping” is defined as a type 8, code 0 ICMP message.

I have no idea why Terraform chose to obscure the usage this way, but I suspect it’s because the AWS API reuses the from_port field for storing the ICMP message type, and reuses the to_port for storing the ICMP code, and Terraform just copied their bad design. A more user-friendly implementation of Terraform would have created an icmp_message_type and icmp_message_code fields (or aliases) that are mapped to the AWS from_port and to_port fields to make it obvious what you’re setting and why it works.

Hope you find this useful.

post

Too many authentication failures

I was working with a new Linux distro and after creating a brand-new VM with a single login I attempted to ssh into the VM only to be greeted with:

Received disconnect from 10.0.0.180 port 22:2: Too many authentication failures
Disconnected from 10.0.0.180 port 22

It was a new VM, and I hadn’t loaded an ssh key (there was no option to do so in the install). I’d set up a user and password, so I expected to get a password prompt. I didn’t get to a password prompt, just an immediate disconnect.

I used ssh -vvv to connect and found that my ssh client was attempting to use my ssh keys, as ssh is supposed to, and on the third key the VM spat back the error:

Received disconnect from 10.0.0.180 port 22:2: Too many authentication failures
Disconnected from 10.0.0.180 port 22

Well, I wanted to connect with a password anyhow, so I tried:

ssh -o PubkeyAuthentication=no username@10.0.0.180

I was greeted with a password: prompt.

I checked the /etc/ssh/sshd_config and found that someone who’d built the install image had changed the default setting for MaxAuthTries from 6 to 2.

The MaxAuthTries setting tells the ssh daemon how many different authentication attempts a user can try before it disconnects. Each ssh key loaded into ssh-agent counts as one authentication attempt. The default is 6 because many users (like me) have multiple ssh keys loaded into ssh-agent so that we can automatically log into different hosts that use different ssh keys. Trying more than one ssh key isn’t the same as thumb-fingering a password — ssh is designed to allow for multiple key attempts. After the ssh connection attempts all of your ssh keys and you haven’t run out of attempts and passwords are enabled you’ll eventually get a password prompt.

Setting MaxAuthTries back to the more reasonable default of 6 and reloading the sshd daemon fixed the issue. Apparently whoever tested the setup only has one ssh key and wasn’t aware of what changing the MaxAuthTries setting does when people with more than one key attempt to log in.

Alternatively, if it’s someone else’s server and you can’t change the /etc/ssh/sshd_config file, you can also add these lines to your local ~/.ssh/config file:

Host 10.0.0.180
    PubkeyAuthentication no

If you’re concerned about ssh security sshd_config allows you to control what versions of the ssh protocol are supported, which ciphers you trust (or don’t trust), and to tune other settings that lock down what you will or won’t allow ssh to do in your environment. It may be that for some applications in some environments setting MaxAuthTries 2 makes sense, but using it for an out of the box installation just breaks ssh for no good reason.

Hope you find this useful.

post

Creating AWS Elastic Filesystems (EFS) with Terraform

The AWS Elastic Filesystem (EFS) gives you an NFSv4-mountable file system with almost unlimited storage capacity. The filesystem I just created to write this article reports 9,007,199,254,739,968 bytes free. In human-readable format df -kh reports 8.0E (Exabytes) of available disk space. In the year 2019, that’s a lot of storage space.

In past articles I’ve shown how to create EFS resources manually, but this week I wanted to programmatically create EFS resources with Terraform so that I could easily create, test, and tear-down EFS and VM resources on AWS.

I also wanted to make sure that my EFS resources are secure, that only VMs within my Virtual Private Cloud (VPC) could access the EFS data, so that no one outside of my VPC could mount or otherwise access the data.

Creating an EFS resource is easy. The Terraform code looks like this:

// efs.tf
resource "aws_efs_file_system" "efs-example" {
creation_token = "efs-example"
performance_mode = "generalPurpose"
throughput_mode = "bursting"
encrypted = "true"
tags = {
Name = "EfsExample"
}
}

This creates the EFS filesystem on AWS. EFS also requires a mount target, which gives your VMs a way to mount the EFS volume using NFS. The Terraform code to create a mount target looks like this:

// efs.tf (continued)
resource "aws_efs_mount_target" "efs-mt-example" {
file_system_id = "${aws_efs_file_system.efs-example.id}"
subnet_id = "${aws_subnet.subnet-efs.id}"
security_groups = ["${aws_security_group.ingress-efs.id}"]
}

The file_system_id is automatically set to the efs-example resource’s ID, which ties the mount target to the EFS file system.

The subnet_id for subnet-efs is a separate /24 subnet I created from my VPC just for EFS. The ingress-efs security group is a separate security group I created for EFS. Let’s cover each one of these separately.

A separate EFS subnet

First off I’ve allocated a /16 subnet for my VPC and I carve out individual /24 subnets from that VPC for each cluster of VMs and/or EFS resources that I add to an AWS availability zone. Here’s how I’ve defined my test environment VPC and EFS subnet:

//network.tf
resource "aws_vpc" "test-env" {
cidr_block = "10.0.0.0/16"
enable_dns_hostnames = true
enable_dns_support = true
tags {
Name = "test-env"
}
}

resource "aws_subnet" "subnet-efs" {
cidr_block = "${cidrsubnet(aws_vpc.test-env.cidr_block, 8, 8)}"
vpc_id = "${aws_vpc.test-env.id}"
availability_zone = "us-east-1a"
}

That will give me the subnet 10.0.8.0/24 for my EFS subnet.

If you want to understand how to use Terraform’s cidrsubnet command to carve out separate subnets, see the article Terraform `cidrsubnet` Deconstructed by Lisa Hagemann. Her article gives excellent examples on how to do just that.

The EFS security group

Finally, I need a security group that only allows traffic between my test environment VMs and my test environment EFS volume. I already have a security group called ingress-test-env that is used to control security for my VMs. For EFS I create another security group that allows inbound traffic on port 2049 (the NFSv4 port), allows egress traffic on any port.

By setting the ingress-efs-test resource’s security_groups attribute to ingress-test-env this only allows network traffic to and from VMs in the ingress-test-env security group to talk to the EFS volume. If you use security_groups like this, you really lock down the EFS volume and you don’t need to set the cidr_blocks attribute at all.

// security.tf
resource "aws_security_group" "ingress-efs-test" {
name = "ingress-efs-test-sg"
vpc_id = "${aws_vpc.test-env.id}"

// NFS
ingress {
security_groups = ["${aws_security_group.ingress-test-env.id}"]
from_port = 2049
to_port = 2049
protocol = "tcp"
}

// Terraform removes the default rule
egress {
security_groups = ["${aws_security_group.ingress-test-env.id}"]
from_port = 0
to_port = 0
protocol = "-1"
}
}

After adding these Terraform files to my cluster configuration and running terraform apply, I end up with a new EFS filesystem that I can mount from any VM running in my VPC.

# mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport fs-31337er3.efs.us-east-1.amazonaws.com:/ /mnt/efs
# df -kh
Filesystem Size Used Avail Use% Mounted on
udev 481M 0 481M 0% /dev
tmpfs 99M 744K 98M 1% /run
/dev/xvda1 7.7G 3.0G 4.7G 40% /
tmpfs 492M 0 492M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 492M 0 492M 0% /sys/fs/cgroup
/dev/loop0 13M 13M 0 100% /snap/amazon-ssm-agent/150
/dev/loop1 87M 87M 0 100% /snap/core/4650
/dev/loop2 90M 90M 0 100% /snap/core/6130
/dev/loop3 18M 18M 0 100% /snap/amazon-ssm-agent/930
tmpfs 99M 0 99M 0% /run/user/1000
fs-31337er3.efs.us-east-1.amazonaws.com:/ 8.0E 0 8.0E 0% /mnt/efs

Hope you found this useful.

post

How to get the IP address of a KVM/virsh VM

Since virsh domifaddr doesn’t work to get the IP addresses of VMs on a bridged network, I wrote a get-vm-ip script (which you can download from Github) which uses this to get the IP of a running VM:

HOSTNAME=[your vm name]
MAC=$(virsh domiflist $HOSTNAME | awk '{ print $5 }' | tail -2 | head -1)
arp -a | grep $MAC | awk '{ print $2 }' | sed 's/[()]//g'

The virsh command gets the MAC address, the last line finds the IP address using arp.

Hope you find this useful.

post

Use .iso and Kickstart files to automatically create Ubuntu VMs

I wrote an updated version of this article in 2023: Quickly create guest VMs using virsh, cloud image files, and cloud-init. The newer code works better on more recent distros. Unless you really need to use Kickstart, try the new code.

I was looking for a way to automate the creation of VMs for testing various distributed system / cluster software packages. I’ve used Vagrant in the past but I wanted something that would:

  • Allow me to use raw ISO files as the basis for guest VMs.
  • Guest VMs should be set up with bridged IPs that are routable from the host.
  • Guest VMs should be able to reach the Internet.
  • Other hosts on the local network should be able to reach guest VMs. (Setting up additional routes is OK).
  • VM creation should work with any distro that supports Kickstart files.
  • Scripts should be able to create and delete VMs in a scripted, fully-automatic manner.
  • Guest VMs should be set up to allow passwordless ssh access from the “ansible” user.

I’ve previously used virsh’s virt-install tool to create VMs and I like how easy it is to set up things like extra network interfaces and attach existing disk images. The scripts in this repo fully automate the virsh VM creation process.

Scripts

I put all of my code into a Github repo containing these scripts:

  • create-vm – Use .iso and kickstart files to auto-generate a VM.
  • delete-vm – Delete a virtual machine created with create-vm.
  • get-vm-ip – Get the IP address of a VM managed by virsh.
  • encrypt-pw – Returns a SHA512 encrypted password suitable for pasting into Kickstart files.

I’ve also included a sample ubuntu.ks Kickstart file for creating an Ubuntu host.

Host setup

I’m running the scripts from a host with Ubuntu Linux 18.10 installed. I added the following to the host’s Ansible playbook to install the necessary virtualization packages:

  - name: Install virtualization packages
apt:
name: "{{item}}"
state: latest
with_items:
- qemu-kvm
- libvirt-bin
- libvirt-clients
- libvirt-daemon
- libvirt-daemon-driver-storage-zfs
- python-libvirt
- python3-libvirt
- system-config-kickstart
- vagrant-libvirt
- vagrant-sshfs
- virt-manager
- virtinst

If you’re not using Ansible just apt-get install the above packages.

Sample Kickstart file

There are plenty of documents on the Internet on how to set up Kickstart files.

A couple of things that are special about the included Kickstart file

The Ansible user: Although I’d prefer to create the “ansible” user as a locked account,with no password just an ssh public key, Kickstart on Ubuntu does not allow this, so I do set up an encrypted password.

To set up your own password, use the encrypt-pw script to create a SHA512-hashed password that you can copy and paste into the Kickstart file. After a VM is created you can use this password if you need to log into the VM via the console.

To use your own ssh key, replace the ssh key in the %post section with your own public key.

The %post section at the bottom of the Kickstart file does a couple of things:

  • It updates all packages with the latest versions.
  • To configure a VM with Ansible, you just need ssh access to a VM and Python installed. on the VM. So I use %post to install an ssh-server and Python.
  • I start the serial console, so that virsh console $vmname works.
  • I add a public key for Ansible, so I can configure the servers with Ansible without entering a password.

Despite the name, the commands in the %post section are not the last commands executed by Kickstart on an Ubuntu 18.10 server. The “ansible” user is added after the %post commands are executed. This means that the Ansible ssh public key gets added before the ansible user is created.

To make key-based logins work I set the UID:GID of authorized_keys to 1000:1000. The user is later created with UID=1000, GID=1000, which means that the authorized_keys file ends up being owned by the ansible user by the time the VM creation is complete.

Create an Ubuntu 18.10 server

This creates a VM using Ubuntu’s text-based installer. Since the `-d` parameter is used,progress of the install is shown on screen.

create-vm -n node1 \
-i ~/isos/ubuntu-18.10-server-amd64.iso \
-k ~/conf/ubuntu.ks \
-d

Create 8 Ubuntu 18.10 servers

This starts the VM creation process and exits. Creation of the VMs continues in the background.

for n in `seq 1 8`; do
create-vm -n node$n \
-i ~/isos/ubuntu-18.10-server-amd64.iso \
-k ~/conf/ubuntu.ks
done

Delete 8 virtual machines

for n in `seq 1 8`; do
delete-vm node$n
done

Connect to a VM via the console

virsh console node1

Connect to a VM via ssh

ssh ansible@`get-vm-ip node1`

Generate an Ansible hosts file

(
echo '[hosts]'
for n in `seq 1 8`; do
ip=`get-vm-ip node$n`
echo "node$n ansible_ip=$ip ansible_user=ansible"
done
) > hosts

Handy virsh commands

  • virsh list – List all running VMs.
  • virsh domifaddr node1 – Get a node’s IP address. Does not work with all network setups,which is why I wrote the get-vm-ip script.
  • virsh net-list – Show what networks were created by virsh.
  • virsh net-dhcp-leases $network – Shows current DHCP leases when virsh is acting as the DHCP server. Leases may be shown for machines that no longer exist.

Known Issues

  • VMs created without the -d (debug mode) parameter may be created in “stopped” mode. To start them up, run the command virsh start $vmname
  • Depending on how your host is set up, you may need to run these scripts as root.
  • Ubuntu text mode install messes up terminal screens. Run reset from the command line to restore a terminal’s functionality.
  • I use Ansible to set a guest’s hostname, not Kickstart, so all Ubuntu guests created have the host name “ubuntu”.

Hope you find this useful.

create-vm script

Download create-vm from Github

#!/bin/bash

# create-vm - Use .iso and kickstart files to auto-generate a VM.

# Copyright 2018 Earl C. Ruby III
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

HOSTNAME=
ISO_FQN=
KS_FQN=
RAM=1024
VCPUS=2
STORAGE=20
BRIDGE=virbr0
MAC="RANDOM"
VERBOSE=
DEBUG=
VM_IMAGE_DIR=/var/lib/libvirt

usage()
{
cat << EOF
usage: $0 options

This script will take an .iso file created by revisor and generate a VM from it.

OPTIONS:
-h Show this message
-n Host name (required)
-i Full path and name of the .iso file to use (required)
-k Full path and name of the Kickstart file to use (required)
-r RAM in MB (defaults to ${RAM})
-c Number of VCPUs (defaults to ${VCPUS})
-s Amount of storage to allocate in GB (defaults to ${STORAGE})
-b Bridge interface to use (defaults to ${BRIDGE})
-m MAC address to use (default is to use a randomly-generated MAC)
-v Verbose
-d Debug mode
EOF
}

while getopts "h:n:i:k:r:c:s:b:m:v:d" option; do
case "${option}"
in
h)
usage
exit 0
;;
n) HOSTNAME=${OPTARG};;
i) ISO_FQN=${OPTARG};;
k) KS_FQN=${OPTARG};;
r) RAM=${OPTARG};;
c) VCPUS=${OPTARG};;
s) STORAGE=${OPTARG};;
b) BRIDGE=${OPTARG};;
m) MAC=${OPTARG};;
v) VERBOSE=1;;
d) DEBUG=1;;
esac
done

if [[ -z $HOSTNAME ]]; then
echo "ERROR: Host name is required"
usage
exit 1
fi

if [[ -z $ISO_FQN ]]; then
echo "ERROR: ISO file name or http url is required"
usage
exit 1
fi

if [[ -z $KS_FQN ]]; then
echo "ERROR: Kickstart file name or http url is required"
usage
exit 1
fi

if ! [[ -f $ISO_FQN ]]; then
echo "ERROR: $ISO_FQN file not found"
usage
exit 1
fi

if ! [[ -f $KS_FQN ]]; then
echo "ERROR: $KS_FQN file not found"
usage
exit 1
fi
KS_FILE=$(basename "$KS_FQN")

if [[ ! -z $VERBOSE ]]; then
echo "Building ${HOSTNAME} using MAC ${MAC} on ${BRIDGE}"
echo "======================= $KS_FQN ======================="
cat "$KS_FQN"
echo "=============================================="
set -xv
fi

mkdir -p $VM_IMAGE_DIR/{images,xml}

virt-install \
--connect=qemu:///system \
--name="${HOSTNAME}" \
--bridge="${BRIDGE}" \
--mac="${MAC}" \
--disk="${VM_IMAGE_DIR}/images/${HOSTNAME}.img,bus=virtio,size=${STORAGE}" \
--ram="${RAM}" \
--vcpus="${VCPUS}" \
--autostart \
--hvm \
--arch x86_64 \
--accelerate \
--check-cpu \
--os-type=linux \
--force \
--watchdog=default \
--extra-args="ks=file:/${KS_FILE} console=tty0 console=ttyS0,115200n8 serial" \
--initrd-inject="${KS_FQN}" \
--graphics=none \
--noautoconsole \
--debug \
--location="${ISO_FQN}"

if [[ ! -z $DEBUG ]]; then
# Connect to the console and watch the install
virsh console "${HOSTNAME}"
virsh start "${HOSTNAME}"
fi

# Make a backup of the VM's XML definition file
virsh dumpxml "${HOSTNAME}" > "${VM_IMAGE_DIR}/xml/${HOSTNAME}.xml"

if [ ! -z $VERBOSE ]; then
set +xv
fi

ubuntu.ks Kickstart file

Download ubuntu.ks on Github.

# System language
lang en_US
# Language modules to install
langsupport en_US
# System keyboard
keyboard us
# System mouse
mouse
# System timezone
timezone --utc Etc/UTC
# Root password
rootpw --disabled
# Initial user
user ansible --fullname "ansible" --iscrypted --password $6$CfjrLvwGbzSPGq49$t./6zxk9D16P6J/nq2eBVWQ74aGgzKDrQ9LdbTfVA0IrHTQ7rQ8iq61JTE66cUjdIPWY3fN7lGyR4LzrGwnNP.
# Reboot after installation
reboot
# Use text mode install
text
# Install OS instead of upgrade
install
# Use CDROM installation media
cdrom
# System bootloader configuration
bootloader --location=mbr 
# Clear the Master Boot Record
zerombr yes
# Partition clearing information
clearpart --all 
# Disk partitioning information
part / --fstype ext4 --size 3700 --grow
part swap --size 200 
# System authorization infomation
auth  --useshadow  --enablemd5 
# Firewall configuration
firewall --enabled --ssh 
# Do not configure the X Window System
skipx
%post --interpreter=/bin/bash
echo ### Redirect output to console
exec < /dev/tty6 > /dev/tty6
chvt 6
echo ### Update all packages
apt-get update
apt-get -y upgrade
# Install packages
apt-get install -y openssh-server vim python
echo ### Enable serial console so virsh can connect to the console
systemctl enable serial-getty@ttyS0.service
systemctl start serial-getty@ttyS0.service
echo ### Add public ssh key for Ansible
mkdir -m0700 -p /home/ansible/.ssh
cat <<EOF >/home/ansible/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQC14kgOfzuOA4+hD16JFTAtXWc0UkzvXw3TrPxivh+/t86FH1N1qlXLLZn60voysLHnyiwTXrC8Zy0H8rckRopepozMBRegeklnkLHaiFDBdbAkHm8DMOd5QuGedBZ+s80+05btzOVxZcN5M6m4A03vLGGoOqnJvewv1u/yP6et0hP2Bs6D9ycczOpXeOgKnUXt1rciVYTk9xwOXFWcZ5phnXSCGA1w0BACK2CZaCKmsAT5YR7PA4N+7xVMvhOgo3gt8dNxNtmEtkZSlYYqkJehuldt1IPpfQ5/QYngYKX1ZCKS2LHc9Ys3F8QX3djhOhqFL+kcDMrdTaT5GlAFcgqebIao5mTYpiqc72YbvbCMWRVomhE0TWgliIftR/65NzOf8N4b2fE/hakLkIsGyR7TQiNmAHgagqBX/qdBJ7QJdNN7GH2aGP/Ni7b9xX2jsWXRj8QcSee+IDgfm2k/uKGvI6+RotRVx/EjwOGVUeGp8txP8l4T0AmYsgiL1Phe5swAPaMj4R+m38dwzr2WF/PViI1upF/Weczoiu0dDODLsijdHBAIju9BEDBzcFbDPoLLKHOSMusy86CVGNSEaDZUYwZ2GHY6anfEaRzTJtqRKNvsGiSRJAeKIQrZ16e9QPihcQQQNM+Z9QW7Ppaum8f2QlFMit03UYMplw0EpNPAWQ== ansible@host
EOF
echo ### Set permissions for Ansible directory and key. Since the "Initial user"
echo ### is added *after* %post commands are executed, I use the UID:GID
echo ### as a hack since I know that the first user added will be 1000:1000.
chown -R 1000:1000 /home/ansible
chmod 0600 /home/ansible/.ssh/authorized_keys
echo ### Change back to terminal 1
chvt 1