Why adding a .conf or .cfg file to /etc/sudoers.d doesn’t work

I needed to add some sudo access rights for support personnel on about a hundred Centos 6.6 servers. Currently no one one these hosts had sudo rights, so the /etc/sudoers file was the default file. I’m using Ansible to maintain these hosts, but rather than modify the default /etc/sudoers file using Ansible’s lineinfile: command, I decided to create a support.conf file and use Ansible’s copy: command to copy that file into /etc/sudoers.d/. That way if a future version of Centos changes the /etc/sudoers file I’m leaving that file untouched, so my changes should always work.

  - name: Add custom sudoers
    copy: src=files/support.conf dest=/etc/sudoers.d/support.conf owner=root group=root mode=0440 validate='visudo -cf %s'

The support.conf file I created copied over just fine, and the validation step of running “visudo -cf” on the file before moving it into place claimed that the file was error-free and should work just fine as a sudoers file.

I logged in as the support user and it didn’t work:

[support@c1n1 ~]$ sudo /bin/ls /var/log/*
support is not in the sudoers file.  This incident will be reported.

Not only did it not work, it was telling me that the support user wasn’t even in the file, which they clearly were.

After Googling around a bit and not finding much I saw this in the Sudoers Manual:

sudo will read each file in /etc/sudoers.d, skipping file names that end in ‘~’ or contain a ‘.’ character to avoid causing problems with package manager or editor temporary/backup files.

sudo was skipping the file because the file name contained a period!

I changed the name of the file from support.conf to support and it worked.

  - name: Add custom sudoers
    copy: src=files/support dest=/etc/sudoers.d/support owner=root group=root mode=0440 validate='visudo -cf %s'

Hope you find this useful.

Here’s a snippet from /etc/sudoers.d/support if you’re interested. The “support” user has already been created by a separate Ansible command.

# Networking
Cmnd_Alias NETWORKING = /sbin/route, /sbin/ifconfig, /bin/ping, /sbin/dhclient, /usr/bin/net, /sbin/iptables, /usr/bin/rfcomm, /usr/bin/wvdial, /sbin/iwconfig, /sbin/mii-tool

# Installation and management of software
Cmnd_Alias SOFTWARE = /bin/rpm, /usr/bin/up2date, /usr/bin/yum

# Services
Cmnd_Alias SERVICES = /sbin/service, /sbin/chkconfig

# Reading logs
Cmnd_Alias READ_LOGS = /usr/bin/less /var/log/*, /bin/more /var/log/*, /bin/ls /var/log/*, /bin/ls /var/log

Share Button

Use Web of Trust (WOT) to thwart scammy web sites

My friend Shannon Phillips recently updated her Facebook status with:

Word to travelers: do not book hotel rooms through TripAdvisor. They will funnel you through sketchy third-party sites (“Amoma” is the one who burned me) who advertise made-up rates, take your money, and then get back in touch two weeks later to tell you oopsie, they can’t make a reservation at that hotel after all.

I guess it’s a nice scam while it lasts, but in this age of networked, instant word-of-mouth reviews, that kind of business model won’t hold up long.

I suggested Shannon try installing the Web of Trust (WOT) plug-in for her browser. I use it in all of mine, and it’s stopped scam sites from being loaded into my browser.

WOT works for the web like Waze works for driving. Here’s the explanation from the Web of Trust home page:

WOT displays a colored traffic light next to website links to show you which sites people trust for safe searching, surfing and shopping online: green for good, red for bad, and yellow as a warning to be cautious. The icons are shown in popular search engine results, social media, online email, shortened URL’s, and lots of other sites.

The cool part is, the rating is based on the aggregate ratings of all of the people who use a plug-in. Get burned by a site? Click the WOT icon and rate the site as untrustworthy. Have an excellent experience? Click the WOT icon and rate the site as trustworthy. The more that people use it, the more accurate and reliable the ratings become.

If a site is really untrustworthy, WOT will stop your browser from loading the site unless you tell it that you really want to go to that site. You can still go anywhere you want, but you’ll be warned about sites that others have had problems with.

Share Button

Restarting network interfaces in Ansible

I’m using Ansible to set up the network interface cards of multiple racks of storage servers running Centos 6.6. Each server has four network interfaces to configure, a public 1GbE interface, a private 1GbE interface, and two 10GbE interfaces that are set up as a bonded 20GbE interface with two VLANs assigned to the bond.

If Ansible changes an interface on a server it calls a handler to restart the network interfaces so the changes go into effect. However, I don’t want the network interfaces of every single server in a cluster to restart at the same time, so at the beginning of my network.yml playbook I set:

  serial: 1

That way Ansible just updates the network config of one server at a time.

Also, if there are any failures I want Ansible to stop immediately, so if I screwed something up I don’t take out the networking to every computer in the cluster. For this reason I also set:

max_fail_percentage: 1

If a change is made to an interface I’ve been using the following handler to restart the interface:

- name: Restart Network
  service: name=network state=restarted

That works, but about half the time Ansible detects a failure and drops out with an error, even though the network restarted just fine. Checking the server immediately after Ansible says that there’s an error shows that the server is running and it’s network interfaces were configured correctly.

This behavior is annoying since you have to restart the entire playbook after one server fails. If you’re configuring many racks of servers and the network setup is just updating one server at a time I’d end up having to restart the playbook a half dozen times to get through it, even though nothing was actually wrong.

At first I thought that maybe the ssh connection was dropping (I was restarting the network after all) but you can log in via ssh and restart the network and never lose the connection, so that wasn’t the problem.

The connection does pause as the interface that you’re ssh-ing in over resets, but the connection comes right back.

I wrote a short script to repeatedly restart the network interfaces and check the exit code returned, but the exit code was always 0, “no errors”, so network restart wasn’t reporting an error, but for some reason Ansible thought there was a failure.

There’s obviously some sort of timing issue causing a problem, where Ansible is checking to see if all is well, but since the network is being reset the check times out.

I initially came up with this workaround:

- name: Restart Network
  shell: service network restart; sleep 3

That fixes the problem, however, since “sleep 3″ will always exit with a 0 exit code (success), Ansible will always think this worked even when the network restart failed. (Ansible takes the last exit code returned as the success/failure of the entire shell operation.) If “service network restart” actually does fail, I want Ansible to stop processing.

In order to preserve the exit code, I wrote a one-line Perl script that restarts the network, sleeps 3 seconds, then exits with the same exit code returned by “service network restart”.

- name: Restart Network
  # Restart the network, sleep 3 seconds, return the
  # exit code returned by "service network restart".
  # This is to work-around a glitch in Ansible where
  # it detects a successful network restart as a failure.
  command: perl -e 'my $exit_code = system("service network restart"); sleep 3; $exit_code = $exit_code >> 8; exit($exit_code);'

Now Ansible grinds through the network configurations of all of the hosts in my racks without stopping.

Hope you find this useful.

Share Button

Peerio promises privacy for everyone

A new company called Peerio is promising secure, easy messaging and file sharing for everyone. They’re building apps that encrypt everything you send or share, making the code for these apps open source, and paying for security audits to peer-review the source code, looking for security weaknesses.

They’ve put together a short video to explain the basics of what they offer. I thought I’d give it a try and see how it works.

I went to Peerio.com using the Chrome browser, so the home page automatically offered to install Peerio on Chrome.

I clicked the install button and Peerio popped up as a new Chrome app.


Clicking the app brought up the new account screen, with the word “beta” displayed in small type just under the company logo, so they’re letting me know up front that this is going to be a little rough.


I clicked Sign Up, added a user name and email address, and was prompted for a pass phrase.

I have a couple of pass phrases I use. I typed one in, but apparently it wasn’t long enough. I tried another and another. Not long enough. The words “ALMOST THERE. JUST A FEW MORE LETTERS…” appeared on screen. One phrase I typed in had 40+ letters in it, but still the words “ALMOST THERE. JUST A FEW MORE LETTERS…” persisted. Tried again, this time putting spaces between the words. Phrase accepted! Maybe the check is trying to verify the number of space-separated words, not the total number of characters? Anyhow, got past that hurdle.

Next it sends you an email with a confirmation code and gives you 10 minutes (with a second by second countdown) to enter the confirmation code. I guess if you don’t enter it within 10 minutes your account is toast?

Once past that step I was prompted to create a shorter PIN code that can be used to login to the site. The long pass phrase is only needed to log in the first time you use a new device, after that your PIN can be used. I tried entering a few short number sequences. All were rejected as “too weak” so I used a strong, unique password with a mix of upper and lowercase letters, numbers, and special characters. The screen hid what I was typing and only asked for the PIN once, so if I thumb-fingered it, my account was going to be rendered useless pretty quickly. Hopefully I typed what I thought I typed.


Of course to use the service to send messages to people you have to load your contacts in. I added a friend’s email and Peerio sent him an invite. Tried adding another email address and the “Add Contact” form cut me off at the “.c” in “.com” — looks like the folks at Peerio only let you have friends with email addresses that are less than 16 characters long. My friends at monkeybots.com, you’re out of luck.


The Contacts tab has sub-tabs for “All Contacts”, “Confirmed Contacts”, and “Pending Contacts”, but the one email address I entered that was less than 16 characters long didn’t show up anywhere (I expected to see it under “Pending Contacts”). With my entries disappearing or truncated, I stopped trying to use the system.

It’s an interesting idea for a service, the source code for the clients is supposed to be available on Github, but the Peerio.com site directed me to https://github.com/TeamPeerio for the source, and that link is 404. Searching Github for “Peerio” shows https://github.com/PeerioTechnologies/peerio-client and https://github.com/PeerioTechnologies/peerio-website, so it looks like this is just a case of a BETA web site with a broken link.

Before the developers pay for another security audit, they really ought to try doing some basic usability testing — set up a new user in front of a laptop, and make two videos — one of the keyboard and screen and one of the user’s face, and then watch them try to log in and set up an account. I think they’d find the experience invaluable.

Anyhow, if you’re interested and feel like trying out their very BETA (feels like ALPHA) release, head over to Peerio.com and sign up. If you want to send me a message, you can reach me on Peerio as “earl”.

Share Button

Stop mounting ISO files in Linux with “-t iso9660″

Google “How do I mount an ISO image in Linux” and most of the links still say to use “-t iso9660″. For example:

mount -t iso9660 -o loop,ro diskimage.iso /mnt/iso

That worked fine 10 years ago, but these days not all ISOs use ISO9660 file systems. Many use the UDF (Universal Disk Format) file system, and if you specify ISO9660 when mounting a UDF ISO file, subtle problems can occur. For instance, file names that contain upper case letters on a UDF file system will appear in lower case when that ISO is mounted using ISO9660.

On any modern Linux distro mount is smart enough to figure out what type of file system to use when mounting an ISO file, so it’s perfectly fine to let mount infer the type, e.g.:

mount -o loop,ro diskimage.iso /mnt/iso

Here’s an example of what happens when you try to mount a type UDF ISO as type ISO9660. Note that the case of the file names changes to all lower case when mounting as iso9660, which in this case causes subtle errors to occur within the software.

[~]$ blkid /srv/isos/specsfs/SPECsfs2014-1.0.iso
/srv/isos/specsfs/SPECsfs2014-1.0.iso: UUID="2014-10-22-15-52-41-00" LABEL="SPEC_SFS2014" TYPE="udf"

[~]$ mount -t iso9660 -o loop,ro /srv/isos/specsfs/SPECsfs2014-1.0.iso /mnt/iso
[~]$ cd /mnt/iso
[/mnt/iso]$ ls
benchmarks.xml    netmist_modify     redistributable_sources
binaries          netmist_modify.c   sfs2014result.css
copyright.txt     netmist_monitor    sfs_ext_mon
docs              netmist_monitor.c  sfsmanager
import.c          netmist_pro.in     sfs_rc
license.txt       netmist_proj       spec_license.txt
makefile          netmist.sln        specreport
map_share_script  notice             submission_template.xml
mempool.c         pdsm               token_config_file
mix_table.c       pdsmlib.c          win32lib
netmist.c         rcschangelog.txt   workload.c
netmist.h         readme.txt

[/mnt/iso]$ cd
[~]$ umount /mnt/iso
[~]$ mount -o loop,ro /srv/isos/specsfs/SPECsfs2014-1.0.iso /mnt/iso
[~]$ cd /mnt/iso
[/mnt/iso]$ ls
benchmarks.xml    netmist_modify     redistributable_sources
binaries          netmist_modify.c   sfs2014result.css
copyright.txt     netmist_monitor    sfs_ext_mon
docs              netmist_monitor.c  SfsManager
import.c          netmist_pro.in     sfs_rc
license.txt       netmist_proj       SPEC_LICENSE.txt
makefile          netmist.sln        SpecReport
Map_share_script  NOTICE             submission_template.xml
mempool.c         pdsm               token_config_file
mix_table.c       pdsmlib.c          win32lib
netmist.c         rcschangelog.txt   workload.c
netmist.h         README.txt
Share Button

Click to stream .m3u files in Ubuntu

I just recently heard about CCMixter.org on FLOSS Weekly. CCMixter.org is a resource and collaborative space for musicians and remixers. They have thousands of music tracks which can be downloaded, remixed, sampled, or streamed.

I recently did a fresh install of Ubuntu on the computer I was using, and clicking on any of CCMixter’s streaming links caused a window to pop up asking me if I wanted to play the stream using Rhythmbox or “Other”. Selecting Rhythmbox popped up Rhythmbox, but it wouldn’t play the stream. Googling around a bit led me to discussions of Rhythmbox brokenness going back to 2008, so I took a different tack.

I fired up Synaptic Package Manager and installed the VLC Media Player.

Then I clicked the gear icon on Unity’s upper right menu bar, selected “About this Computer”, clicked Default Applications, and changed the default application for Music to “VLC Media Player.”

Now when I click on a link to an .m3u stream, Ubuntu sends the link to VLC, and the music starts to play.

Hope you find this useful.

Share Button

Get Ansible’s “pip” method to install the right version of Django

I was using Ansible to set up a bunch of Scientific Linux 6.6 servers running Django and I wanted to use a specific version of Django, version 1.6.5, on all servers.

Ansible makes this easy with the “pip” module:

  - name: Install pip package from yum
    yum: name={{ item }} state=present
    - python-pip
    - python-setuptools

  - name: Install Django 1.6.5
    pip: name=django version=1.6.5 state=present

This works great if you’re installing on a clean, empty server, but if you’re upgrading a server that had an older version of Django on it (1.6.4 in my case) Ansible will act as if it’s installing 1.6.5, but when it’s done I still had version 1.6.4.

If I try using straight PIP commands I get this:

$ pip install django==1.6.5
Downloading/unpacking django==1.6.5
  Running setup.py egg_info for package django
    warning: no previously-included files matching '__pycache__' found under directory '*'
    warning: no previously-included files matching '*.py[co]' found under directory '*'
  Requested django==1.6.5, but installing version 1.6.4
Installing collected packages: django
  Found existing installation: Django 1.6.4
    Uninstalling Django:
      Successfully uninstalled Django
  Running setup.py install for django
    warning: no previously-included files matching '__pycache__' found under directory '*'
    warning: no previously-included files matching '*.py[co]' found under directory '*'
    changing mode of /usr/bin/django-admin.py to 755
Successfully installed django
Cleaning up...

Note the line “Requested django==1.6.5, but installing version 1.6.4″. Thanks PIP!

It turned out to be a bug in PIP versions earlier than PIP 1.4, not Ansible. A little Googling turned up a page on Stackoverflow that pointed the finger at an old cached copy of 1.6.4 in the build directory, which I found in /tmp/pip-build-root.

I updated my Ansible YAML file to get rid of the temporary directory and now it works fine:

  - name: Install pip package from yum
    yum: name={{ item }} state=present
    - python-pip
    - python-setuptools

  - name: Remove PIP temp directory
    file: path=/tmp/pip-build-root state=absent

  - name: Install Django 1.6.5
    pip: name=django version=1.6.5 state=present

Hope you find this useful.

Share Button

2014 HPCwire Awards

The StratoStor project I’ve been working on for the past 10 months just got a “Top 5 New Products or Technologies to Watch” award from HPCwire announced at this week’s SuperComputing 2014 (SC14) conference in New Orleans.

HPC = High Performance Computing, HPCwire is a news bureau for all things regarding High Performance Computing, and SC14 is where every major vendor of HPC equipment and products shows off their wares, so getting this bit of recognition from the readers of HPCwire is really nice.

So THANK YOU HPCwire readers, for this award.


2014 HPCwire Awards

Share Button

Validating Distributed Application Workloads

This is the talk I gave at RICON this year on Validating Distributed Application Workloads. It’s about how we set up test environments at Seagate for validating storage system performance at the petabyte scale. This talk centers around the testing done to validate performance of a 2PB rack running Riak CS.

Share Button

Increase a VM’s available memory with virsh

If you try to increase the amount of available memory using the obvious command it fails with an error message:

# virsh setmem <vm name> 16G --live
error: invalid argument: cannot set memory higher than max memory

The physical host in this case has 128G RAM and 32 CPUs. Plenty of capacity. To increase the maximum amount of memory that can be allocated to the VM:

# virsh setmaxmem <vm name> 16G --config

There are also –live and –current options which claim to affect the running/current domain. These options do not actually work. You have to use the –config option (changes take effect after next boot) and then power off the machine by logging in and running “poweroff”.

Once the machine is off set the actual memory with:

# virsh setmem <vm name> 16G --config

Then start the vm:

# virsh start <vm name>

Once the VM starts up it will have more memory.

Hope you find this useful.

Share Button