Adding a task bar to Gnome 3 on Ubuntu 11.10

To install Gnome 3 on Ubuntu 11.10, start up a terminal and type:

sudo apt-get install gnome-shell

To use Gnome 3 instead of Unity: when you log in, click the “gear” above your password. Select “Gnome”, log in.

After you get tired of “click Activities, find the window you want, click the window” every time you want to switch from one window to another, and you decide you really need a taskbar again to maintain your sanity, start up a terminal and type:

sudo apt-get install tint2
tint2 &

You now have a taskbar again. To get it to appear every time you start Gnome 3 go to Activities > Applications > Other > Startup Applications, then click “Add”, Name: “tint2 task bar”, Command: “tint2”, click “Save”.
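
If you’d rather skip the dialog, the same thing can probably be done from a terminal. Startup entries follow the XDG autostart convention of .desktop files under ~/.config/autostart, so a rough sketch looks like this. The function name and the tint2.desktop file name are my own choices, not something tint2 or Gnome requires:

```shell
#!/bin/sh
# Sketch: write an XDG autostart entry for tint2.
# write_tint2_autostart is a hypothetical helper name; the target directory
# defaults to ~/.config/autostart but can be overridden for testing.
write_tint2_autostart() {
    dir=${1:-"$HOME/.config/autostart"}
    mkdir -p "$dir" || return 1
    cat > "$dir/tint2.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=tint2 task bar
Exec=tint2
EOF
}

# Usage:
#   write_tint2_autostart
```

Log out and back in to check that the task bar comes up on its own.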

Done.

Hope you find this useful.

Recovering from a lost connection when upgrading Ubuntu via ssh

I wanted to upgrade my desktop machine at work to the latest version of Ubuntu, but since it takes several hours to upgrade an Ubuntu host, and I have work to do during the day, I figured I could log into my workstation from home using ssh and start the upgrade remotely.

So I logged into my workstation from home and ran:

> sudo apt-get install update-manager-core
> sudo do-release-upgrade

The upgrade script warned me that I was using ssh and asked if I was sure I wanted to continue. I said “Y”, and a little while later the upgrade manager was busy downloading upgrade packages.

I planned to check it a couple of times that night, answer any package upgrade questions that popped up, and then in the morning when I got to work the upgrade would be complete.

Of course what actually happened was that I got side-tracked onto some other problem that night, forgot about the upgrade in progress, and when I got to work the next day my workstation was in a state of limbo, with the upgrade halfway complete, waiting for me to answer some question on the screen — at my house.

Luckily the Ubuntu developers who created the ssh upgrade process run that upgrade inside of a screen session. As the screen man page states, “Screen is a full-screen window manager that multiplexes a physical terminal between several processes (typically interactive shells).”

So at work all I had to do was get the list of current screen sessions:

> sudo screen -list
There are screens on:
        9129.ubuntu-release-upgrade-screen-window       (05/17/2011 08:50:08 PM)        (Attached)
2 Sockets in /var/run/screen/S-root.

Invoke screen using the “-d -r sessionowner/[pid.tty.host]” flags:

> sudo screen -d -r root/9129.ubuntu-release-upgrade-screen-window

… and I could pull up the screen at work that had been displaying at my home. Once I answered the remaining questions about whether to keep my custom configuration files or use the new, packaged configuration files, my workstation rebooted and the latest version of Ubuntu booted right up.
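
If you don’t want to pick the session name out of the listing by eye, something like the following could find it for you. This is only a sketch: upgrade_session is a helper name I made up, and it assumes the session name contains “release-upgrade” as it does in the listing above.

```shell
#!/bin/sh
# Sketch: print the "pid.sessionname" of the release-upgrade screen session.
# Reads `screen -list` output on stdin. upgrade_session is a hypothetical
# helper name, and the "release-upgrade" pattern is taken from the listing
# shown above; other releases may name the session differently.
upgrade_session() {
    awk '$1 ~ /^[0-9]+\..*release-upgrade/ { print $1; exit }'
}

# Usage (with sudo, because the session belongs to root):
#   session=$(sudo screen -list | upgrade_session)
#   [ -n "$session" ] && sudo screen -d -r "root/$session"
```

If the function prints nothing, there is no matching session and you should look at the full screen -list output instead.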

Increasing the size of an LVM Physical Volume (PV) while running multipathd — without rebooting

If you’re using the Linux Logical Volume Manager (LVM) to manage your disk space it’s easy to enlarge a logical volume while a server is up and running. It’s also easy to add new drives to an existing volume group.

But if you’re using a SAN the underlying physical drives can have different performance characteristics because they’re assigned to different QOS bands on the SAN. If you want to keep performance optimized it’s important to know what physical volume a logical volume is assigned to — otherwise you can split a single logical volume across multiple physical volumes and end up degrading system performance. If you run out of space on a physical volume and then enlarge a logical volume you will split the LV across two or more PVs. To prevent this from happening you need to enlarge the LUN, tell multipathd about the change, then enlarge the PV, then enlarge the LV, and finally enlarge the file system.

I have three SANs at the company where I work (two Pillar Axioms and a Xyratex) which are attached to two Fibre Channel switches and several racks of blade servers. Each blade is running an Oracle database with multiple physical volumes (PVs) grouped together into a single LVM volume group. The PVs are tagged and as logical volumes (LVs) are added they’re assigned to the base physical volume with the same tag name as the logical volume. That way we can assign the PV to a higher or lower performance band on the SAN and optimize the database’s performance. Oracle tablespaces that contain frequently-accessed data get assigned to a PV with a higher QOS band on the SAN. Archival data gets put on a PV with a lower QOS band.

We run OpenSUSE 11.x using multipathd to deal with the multiple fiber paths available between each blade and a SAN. Each blade has 2 fiber ports for redundancy, which are attached to two fiber switches, each of which is cross-connected to 2 ports on 2 different controllers on the SAN, so there are 4 different fiber paths that data can take between the blade and the SAN. If any path fails, or one port on a fiber card fails, or one fiber switch fails, multipathd re-routes the data using the remaining data paths and everything keeps working. If a blade fails we switch to another blade.

If we run out of space on a PV I can log into the SAN’s administrative interface and enlarge the underlying LUN, but getting the operating system on the blade to recognize that more physical disk space is available is tricky. LVM’s pvresize command claims that it is enlarging the PV, but nothing actually happens unless the server is rebooted and pvresize is run again. I wanted to be able to enlarge physical volumes without taking a database off-line and rebooting its server. Here’s how I did it:

  • First log into the SAN’s administrative interface and enlarge the LUN in question.
  • Open two xterm windows on the host as root
  • Gather information – you will need the physical device name, the multipath block device names, and the multipath map name. (Since our setup gives us 4 data paths for each LUN there are 4 multipath block device names.)
  • List the physical volumes and their associated tags with pvs -o +tags:
    # pvs -o +tags
      PV         VG     Fmt  Attr PSize   PFree   PV Tags                
      /dev/dm-1  switch lvm2 a-   500.38G 280.38G db024-lindx,lindx      
      /dev/dm-10 switch lvm2 a-     1.95T 801.00G db024-ldata,ldata      
      /dev/dm-11 switch lvm2 a-    81.50G      0  db024-mindx,mindx      
      /dev/dm-12 switch lvm2 a-   650.00G 100.00G db024-reports,reports  
      /dev/dm-13 switch lvm2 a-    51.25G  31.25G db024-log,log          
      /dev/dm-14 switch lvm2 a-   450.12G  50.12G db024-home,home        
      /dev/dm-15 switch lvm2 a-     1.76T 342.00G db024-q_backup,q_backup
      /dev/dm-16 switch lvm2 a-     1.00G 640.00M db024-control,control  
      /dev/dm-2  switch lvm2 a-   301.38G 120.38G db024-dbs,dbs          
      /dev/dm-3  switch lvm2 a-   401.88G 101.88G db024-cdr_data,cdr_data
      /dev/dm-5  switch lvm2 a-   450.62G 290.62G db024-archlogs,archlogs
      /dev/dm-6  switch lvm2 a-    40.88G  22.50G db024-boot,boot        
      /dev/dm-7  switch lvm2 a-    51.25G   1.25G db024-rbs,rbs          
      /dev/dm-8  switch lvm2 a-    51.25G  27.25G db024-temp,temp        
      /dev/dm-9  switch lvm2 a-   201.38G 161.38G db024-summary,summary
  • Find the device that corresponds to the LUN you just enlarged, e.g. /dev/dm-11
  • Run multipath -ll, find the device name in the listing. The large hex number at the start of the line is the multipath map name and the sdX block devices after the device name are the multipath block devices. So in this example the map name is 2000b080112002142 and the block devices are sdy, sdan, sdj, and sdbc:
    2000b080112002142 dm-11 Pillar,Axiom 500                 
    [size=82G][features=1 queue_if_no_path][hwhandler=0][rw] 
    \_ round-robin 0 [prio=100][active]                      
     \_ 0:0:5:9  sdy        65:128 [active][ready]           
     \_ 1:0:4:9  sdan       66:112 [active][ready]           
    \_ round-robin 0 [prio=20][enabled]                      
     \_ 0:0:4:9  sdj        8:144  [active][ready]           
     \_ 1:0:5:9  sdbc       67:96  [active][ready]
  • Next get multipath to recognize that the device is larger:
    • For each block device do echo 1 > /sys/block/sdX/device/rescan:
      # echo 1 > /sys/block/sdy/device/rescan
      # echo 1 > /sys/block/sdan/device/rescan
      # echo 1 > /sys/block/sdj/device/rescan
      # echo 1 > /sys/block/sdbc/device/rescan
    • In the second root window, pull up a multipath command line with multipathd -k
    • Delete and re-add the first block device from each group. Since multipathd provides multiple paths to the underlying SAN, the device will remain up and on-line during this process. Make sure that you get an ‘ok’ after each command. If you see ‘fail’ or anything else besides ‘ok’, STOP WHAT YOU’RE DOING and go to the next step.
      multipathd> del path sdy                                             
      ok                                                                   
      multipathd> add path sdy                                             
      ok                                                                   
      multipathd> del path sdj                                             
      ok                                                                   
      multipathd> add path sdj                                             
      ok
    • If you got a ‘fail’ response:
      • Type exit to get back to a command line.
      • Type multipath -r on the command line. This should recover/rebuild all block device paths.
      • Type multipath -ll | less again and verify that the block devices were re-added.
      • At this point multipath may actually recognize the new device size (you can see the size in the multipath -ll output). If everything looks good, skip ahead to the pvresize step.
    • In the first root window run multipath -ll again and verify that the block devices were re-added:
      2000b080112002142 dm-11 Pillar,Axiom 500                 
      [size=82G][features=1 queue_if_no_path][hwhandler=0][rw] 
      \_ round-robin 0 [prio=100][active]                      
       \_ 1:0:4:9  sdan       66:112 [active][ready]           
       \_ 0:0:5:9  sdy        65:128 [active][ready]           
      \_ round-robin 0 [prio=20][enabled]                      
       \_ 1:0:5:9  sdbc       67:96  [active][ready]           
       \_ 0:0:4:9  sdj        8:144  [active][ready]
    • Delete and re-add the remaining two block devices in the second root window:
      multipathd> del path sdan
      ok                       
      multipathd> add path sdan
      ok                       
      multipathd> del path sdbc
      ok                       
      multipathd> add path sdbc
      ok
    • In the first root window run multipath -ll again and verify that the block devices were re-added.
    • Tell multipathd to resize the block device map using the map name:
      multipathd> resize map 2000b080112002142
      ok
    • Press Ctrl-D to exit multipathd command line.
  • In the first root window run multipath -ll again to verify that multipath sees the new physical device size. The device below went from 82G to 142G:
    2000b080112002142 dm-11 Pillar,Axiom 500
    [size=142G][features=1 queue_if_no_path][hwhandler=0][rw]
    \_ round-robin 0 [prio=100][active]
     \_ 0:0:5:9  sdy        65:128 [active][ready]
     \_ 1:0:4:9  sdan       66:112 [active][ready]
    \_ round-robin 0 [prio=20][enabled]
     \_ 0:0:4:9  sdj        8:144  [active][ready]
     \_ 1:0:5:9  sdbc       67:96  [active][ready]
  • Finally, get the LVM volume group to recognize that the physical volume is larger using pvresize:
    # pvresize /dev/dm-11
      Physical volume "/dev/dm-11" changed
      1 physical volume(s) resized / 0 physical volume(s) not resized
    # pvs -o +tags
      PV         VG     Fmt  Attr PSize   PFree   PV Tags
      /dev/dm-1  switch lvm2 a-   500.38G 280.38G db024-lindx,lindx
      /dev/dm-10 switch lvm2 a-     1.95T 801.00G db024-ldata,ldata
      /dev/dm-11 switch lvm2 a-   141.50G  60.00G db024-mindx,mindx
      /dev/dm-12 switch lvm2 a-   650.00G 100.00G db024-reports,reports
      /dev/dm-13 switch lvm2 a-    51.25G  31.25G db024-log,log
      /dev/dm-14 switch lvm2 a-   450.12G  50.12G db024-home,home
      /dev/dm-15 switch lvm2 a-     1.76T 342.00G db024-q_backup,q_backup
      /dev/dm-16 switch lvm2 a-     1.00G 640.00M db024-control,control
      /dev/dm-2  switch lvm2 a-   301.38G 120.38G db024-dbs,dbs
      /dev/dm-3  switch lvm2 a-   401.88G 101.88G db024-cdr_data,cdr_data
      /dev/dm-5  switch lvm2 a-   450.62G 290.62G db024-archlogs,archlogs
      /dev/dm-6  switch lvm2 a-    40.88G  22.50G db024-boot,boot
      /dev/dm-7  switch lvm2 a-    51.25G   1.25G db024-rbs,rbs
      /dev/dm-8  switch lvm2 a-    51.25G  27.25G db024-temp,temp
      /dev/dm-9  switch lvm2 a-   201.38G 161.38G db024-summary,summary

    pvs shows that /dev/dm-11 is now 141.5G.
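
The information-gathering and rescan steps above are the easiest to get wrong by hand, so here is a rough sketch of how they might be scripted. It assumes the older multipath -ll layout shown in this post (newer multipath-tools releases may print a different layout, so the patterns would need adjusting), and map_info is a name I made up. It deliberately leaves the del path / add path / resize map steps to be done by hand so you can watch for ‘ok’ after each one.

```shell
#!/bin/sh
# Sketch: extract the map name and path devices for one dm device from
# `multipath -ll` output in the layout shown above.
# map_info is a hypothetical helper name, not part of multipath-tools.
map_info() {
    awk -v dm="$1" '
        # Map header: "<map name> dm-N <vendor>,<product>"
        $2 ~ /^dm-[0-9]+$/ {
            in_map = ($2 == dm)
            if (in_map) print "map " $1
            next
        }
        # Path line: "\_ H:C:T:L sdX major:minor [state]"
        in_map && $2 ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/ { print "path " $3 }
    '
}

# Usage (as root):
#   multipath -ll | map_info dm-11
#   for dev in $(multipath -ll | map_info dm-11 | awk '$1 == "path" { print $2 }'); do
#       echo 1 > "/sys/block/$dev/device/rescan"
#   done
```

For the dm-11 listing above this prints one “map” line with 2000b080112002142 followed by “path” lines for sdy, sdan, sdj and sdbc, in the order multipath lists them.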

At this point you can use lvresize to enlarge any logical volume residing on the underlying physical volume without splitting it across multiple (non-contiguous) physical volumes, and then enlarge the file system using the file system tools, e.g. resize2fs.
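
As a sketch of that last step: lvresize accepts a physical volume path after the logical volume, which restricts the new extents to that PV. The helper below only prints the commands so you can review them before running anything. The function name is made up, and the LV name “mindx” is an assumption based on the tag naming described earlier, not something shown in the listings.

```shell
#!/bin/sh
# Sketch: print (not run) the commands to grow an LV on one specific PV
# and then grow its ext file system.
# grow_commands is a hypothetical helper; arguments are VG, LV, PV, size.
grow_commands() {
    vg=$1
    lv=$2
    pv=$3
    size=$4
    # Naming the PV as the last argument keeps the new extents on that PV.
    echo "lvresize -L +$size /dev/$vg/$lv $pv"
    # resize2fs with no size argument grows the file system to fill the LV.
    echo "resize2fs /dev/$vg/$lv"
}

# Example with an assumed LV named "mindx" in the "switch" volume group:
grow_commands switch mindx /dev/dm-11 60G
```

Run the two printed commands as root once you’re happy with them. resize2fs only applies to ext file systems; other file systems have their own grow tools.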

If you ran out of space, your LVs were split across multiple PVs, and you need to coalesce an LV back onto a single PV, use pvmove to move the LV’s extents onto a single device.
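
To see whether an LV has been split in the first place, you can ask lvs for the devices behind it. The sketch below boils that output down to a list of PVs. pvs_in_use is a made-up helper name, and the LV and device names in the usage comments are placeholders for illustration only.

```shell
#!/bin/sh
# Sketch: reduce `lvs --noheadings -o devices` output to a unique PV list.
# Each input entry looks like "/dev/dm-11(0)", where the number in
# parentheses is the starting extent. pvs_in_use is a hypothetical helper.
pvs_in_use() {
    tr ',' '\n' |
        sed -e 's/^[[:space:]]*//' -e 's/([0-9]*)[[:space:]]*$//' -e '/^$/d' |
        sort -u
}

# Usage (as root), with assumed names:
#   lvs --noheadings -o devices switch/mindx | pvs_in_use
# If more than one PV is listed, move the stray extents back, e.g.:
#   pvmove -n mindx /dev/dm-12 /dev/dm-11
```

The -n option limits pvmove to the extents belonging to the named LV, so other LVs on the source PV stay where they are.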

Hope you find this useful.

Eliminating PXE Boot “Missing parameter in config file” Error Messages

I’d recently made a bunch of changes to my company’s server build scripts so that they automatically generate system builds for a variety of operating systems and CPU architectures. We use PXE Boot to boot servers, then install images from a central install server which is running OpenSUSE 11.1.

After I made the changes I started seeing the error message “Missing parameter in config file” at boot:

PXELINUX 3.07 0x41e470ae  Copyright (C) 1994-2005 H. Peter Anvin
Missing parameter in config file.                                               
Missing parameter in config file.                                               
Missing parameter in config file.                                               
Missing parameter in config file.                                               
Missing parameter in config file.                                               
Missing parameter in config file.                                               
Missing parameter in config file.                                               
Missing parameter in config file.                                               
boot:

Everything worked just fine, we had no problems booting or building servers, but the error messages seemed to indicate that there was a problem. I searched the net to see if anyone else was having this problem, and many people were, but no one seemed to have a good answer as to what was causing the error message.

Our pxelinux.cfg/default config file lists about a hundred different boot options, so I made a backup copy and started deleting portions of pxelinux.cfg/default to see if I could reduce the number of errors.

I managed to reduce the number of errors, but couldn’t find anything in the lines I’d deleted that would cause a problem — I was deleting working configurations from the list, and I couldn’t figure out why a working configuration would cause an error message.

I finally figured it out — it was the comment blocks that I’d added. Something is buggy about the way that PXE parses comment lines. I’d commented out some old labels and even though every line started with “#”, those were the lines that were causing the “Missing parameter in config file” errors.

To solve the problem, I edited pxelinux.cfg/default, deleted blank lines in-between comments and labels, deleted blank lines between comments, deleted spaces between “#” and the start of any comment text, deleted old labels that had previously been commented out, and restarted atftpd.

By simplifying the comment lines and deleting old blocks of commented-out code I eliminated the error messages. I couldn’t believe that comment lines could be causing errors, so I tried various combinations of edits on the pxelinux.cfg/default file, restarting atftpd and rebooting a spare server about 30 times, but the only thing that eliminated the error messages was removing comment lines and blank lines from pxelinux.cfg/default.
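
Part of that cleanup can be done mechanically. The sketch below is more aggressive than my hand edits in one respect: it drops every blank line, not just the ones next to comments. It also closes up the space after “#”. It does not try to delete commented-out labels, since that still needs a human to decide what is dead. clean_pxe_config is a name I made up.

```shell
#!/bin/sh
# Sketch: tidy a pxelinux config read from stdin, written to stdout.
# clean_pxe_config is a hypothetical helper name.
clean_pxe_config() {
    # 1st expression: drop empty or whitespace-only lines.
    # 2nd expression: remove whitespace between a leading "#" and the text.
    sed -e '/^[[:space:]]*$/d' -e 's/^#[[:space:]]*/#/'
}

# Usage (keep a backup, then restart the TFTP server as described above):
#   cp pxelinux.cfg/default pxelinux.cfg/default.bak
#   clean_pxe_config < pxelinux.cfg/default.bak > pxelinux.cfg/default
```

Diff the result against the backup before you reboot anything with it.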

Hope you find this useful.

Adding an external encrypted drive with LVM to Ubuntu Linux

I recently added an external eSATA drive to my home computer so I could back up critical data from my home network to one drive. I bought a Western Digital 1TB “green” drive and a Thermaltake external hard drive enclosure with eSATA and USB connectors.

Since my internal hard drives are encrypted it didn’t make sense to back up all of that data to an unencrypted external drive. I’d read Uwe Hermann’s excellent how-to article on disk encryption, but he didn’t cover setting up an LVM partition, which I always use so I can change drive volume sizes on the fly.

This is what I did to set up an external encrypted drive with LVM on an Ubuntu system:

  1. Open a terminal
  2. Get a root prompt:
    sudo /bin/bash
  3. Watch the system log:
    tail -f /var/log/messages
  4. Attach the external drive. The system log tells me that it was detected as /dev/sdc.
  5. Check the drive for bad blocks (takes a couple of hours):
    badblocks -c 10240 -s -w -t random -v /dev/sdc
  6. Write random data to the entire drive. This step takes all night, but it ensures that never-written drive space can’t be differentiated from encrypted data if someone ever tries to crack the drive. (If you’re going to do this, you might as well do it right.)
    shred -v -n 1 /dev/sdc
  7. Create one big LVM partition on the drive using fdisk. Set up one big primary partition /dev/sdc1, set its partition type (system id) to “8e” (Linux LVM), and write the changes to disk:
    > fdisk /dev/sdc                                                                                                                                              
    Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel                                                                                                  
    Building a new DOS disklabel with disk identifier 0xa6846916.                                                                                                                       
    Changes will remain in memory only, until you decide to write them.                                                                                                                 
    After that, of course, the previous content won't be recoverable.                                                                                                                   
    
    
    The number of cylinders for this disk is set to 121575.
    There is nothing wrong with that, but this is larger than 1024,
    and could in certain setups cause problems with:               
    1) software that runs at boot time (e.g., old versions of LILO)
    2) booting and partitioning software from other OSs                                                                                                                                 
       (e.g., DOS FDISK, OS/2 FDISK)                                                                                                                                                    
    Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)                                                                                                      
                                                                                                                                                                                        
    Command (m for help): p                                                                                                                                            
                                                                                                                                                                                        
    Disk /dev/sdc: 999.9 GB, 999989182464 bytes                                                                                                                                         
    255 heads, 63 sectors/track, 121575 cylinders                                                                                                                                       
    Units = cylinders of 16065 * 512 = 8225280 bytes                                                                                                                                    
    Disk identifier: 0xa6846916                                                                                                                                                         
                                                                                                                                                                                        
       Device Boot      Start         End      Blocks   Id  System                                                                                                                      
                                                                                                                                                                                        
    Command (m for help): n                                                                                                                                            
    Command action                                                                                                                                                                      
       e   extended                                                                                                                                                                     
       p   primary partition (1-4)                                                                                                                                                      
    p                                                                                                                                                                  
    Partition number (1-4): 1                                                                                                                                          
    First cylinder (1-121575, default 1): [ENTER]                                                                                                                      
    Using default value 1
    Last cylinder, +cylinders or +size{K,M,G} (1-121575, default 121575): [ENTER]
    Using default value 121575
    
    Command (m for help): t
    Selected partition 1
    Hex code (type L to list codes): 8e
    Changed system type of partition 1 to 8e (Linux LVM)
    
    Command (m for help): p
    
    Disk /dev/sdc: 999.9 GB, 999989182464 bytes
    255 heads, 63 sectors/track, 121575 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
    Disk identifier: 0xa6846916
    
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdc1               1      121575   976551156   8e  Linux LVM
    
    Command (m for help): w
    The partition table has been altered!
    
    Calling ioctl() to re-read partition table.
    Syncing disks.
  8. Use cryptsetup to encrypt the drive:
    cryptsetup --verbose --verify-passphrase luksFormat /dev/sdc1
  9. Unlock the drive:
    cryptsetup luksOpen /dev/sdc1 backupexternal
  10. Create the LVM physical volume:
    pvcreate /dev/mapper/backupexternal
  11. Create the LVM volume group:
    vgcreate xbackup /dev/mapper/backupexternal
  12. Create a logical volume within the volume group:
    lvcreate -L 500G -n backupvol /dev/xbackup
  13. At this point you have a device named /dev/xbackup/backupvol, so create a filesystem on the logical volume:
    mkfs.ext4 /dev/xbackup/backupvol
  14. Create the mount point if it doesn’t already exist, then mount the volume:
    mkdir -p /mnt/backup
    mount /dev/xbackup/backupvol /mnt/backup
  15. To get the volume to mount automatically at boot time add this line to your /etc/fstab file:
    /dev/xbackup/backupvol      /mnt/backup     ext4    defaults        0 5
  16. To be prompted for the decryption key / passphrase at boot time first get the drive’s UUID:
    ls -l /dev/disk/by-uuid

    (In my example I use the UUID for /dev/sdc1)

  17. Then add this line to the /etc/crypttab file:
    backupexternal UUID=[the UUID of the drive] none luks

That’s it. You now have an external, encrypted hard drive with LVM installed. You’ve created one 500GB volume that uses half the disk, leaving 500GB free for other volumes, or for expanding the first volume.
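
If you’d rather not copy the UUID out of the ls listing by hand, blkid can print it directly. Here’s a small sketch that builds the crypttab line from it. crypttab_line is a made-up helper name, and the usage lines assume the same /dev/sdc1 and backupexternal names used above.

```shell
#!/bin/sh
# Sketch: build the /etc/crypttab line for a LUKS partition.
# crypttab_line is a hypothetical helper; arguments are the mapper name
# and the UUID of the encrypted partition.
crypttab_line() {
    printf '%s UUID=%s none luks\n' "$1" "$2"
}

# Usage (as root):
#   uuid=$(blkid -s UUID -o value /dev/sdc1)
#   crypttab_line backupexternal "$uuid" >> /etc/crypttab
```

Check /etc/crypttab afterwards to make sure the UUID isn’t empty, which is what you’d get if blkid was pointed at the wrong device.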

Hope you find this useful.