Corosync / Pacemaker

While building high-availability Linux computing clusters with Corosync, Pacemaker and OpenSUSE I’ve assembled a collection of useful notes. I’ll share them here, and add to them as time permits.

Terms and abbreviations

Corosync allows any number of servers to be part of a cluster, in any of several fault-tolerant configurations (active/passive, active/active, N+1, etc.).

Corosync provides messaging between servers within the same cluster.

Pacemaker manages the resources and applications on a node within the cluster.

OpenAIS can be thought of as the API between Corosync and Pacemaker, as well as between Corosync and other plug-in components.

The ClusterLabs FAQ explains it this way: “Originally Corosync and OpenAIS were the same thing. Then they split into two parts… the core messaging and membership capabilities are now called Corosync, and OpenAIS retained the layer containing the implementation of the AIS standard. Pacemaker itself only needs the Corosync piece in order to function, however some of the applications it can manage (such as OCFS2 and GFS2) require the OpenAIS layer as well.”

“DC” stands for “Designated Co-ordinator”, the node with final say on the cluster’s current state.

“dlm” is the “distributed lock manager”, which coordinates lock/update/unlock sequences for data shared across the cluster.

Pacemaker Daemons (managed by Corosync):

  • lrmd – local resource manager daemon
  • crmd – cluster resource manager daemon
  • cib – cluster information base (database of cluster information)
  • pengine – policy engine
  • stonithd – “shoot the other node in the head” daemon
  • attrd – attribute daemon; maintains node attributes and writes them into the CIB
  • mgmtd – management daemon; appears to provide the interface used by GUI management clients such as hb_gui (I haven’t found a formal definition of either of these in the docs yet)
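
These daemons normally run as children of the corosync process. A quick way to check which of them are up on a node (a sketch; exact process names vary by version and stack):

# ps axf | egrep 'corosync|lrmd|crmd|cib|pengine|stonithd|attrd|mgmtd'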

Operations

To start Corosync on an OpenSUSE server:

# rcopenais start

To stop Corosync on an OpenSUSE server:

# rcopenais stop

To get the current cluster status:

# crm status

Text-based cluster monitoring tool:

# crm_mon
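
For a one-shot snapshot instead of the continuously updating display:

# crm_mon -1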

To list all of the resources in a cluster:

# crm resource status

To view the current pacemaker config:

# crm configure show
# crm configure show xml

To verify the current pacemaker config:

# crm_verify -L -V

Force the ClusterIP resource (and all related resources based on your policy settings) to move from server0 to server1:

# crm resource move ClusterIP server1

Note that move doesn’t transfer the resource directly; it sets the preferred node to server1 by adding a location constraint. To remove the preference and return control to the cluster:

# crm resource unmove ClusterIP
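
Behind the scenes the preference is just a location constraint (typically named cli-prefer-ClusterIP); until it is removed you can see it with:

# crm configure show | grep cli-prefer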

Making a new shadow (backup) copy of the live CIB, named “working”:

# crm configure
crm(live)configure# cib new working
INFO: working shadow CIB created

Updating the “working” backup copy of the live CIB:

# crm configure
crm(live)configure# cib reset working
INFO: copied live CIB to working

Setting the “working” CIB to act as the “live” configuration:

# crm configure
crm(live)configure# cib commit working
INFO: committed 'working' shadow CIB to the cluster
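
A typical round trip with a shadow CIB: point the crm shell at the copy with “cib use”, make your changes there, then commit. The property change below is just the no-quorum-policy example from later in these notes:

# crm configure
crm(live)configure# cib use working
crm(working)configure# property no-quorum-policy=ignore
crm(working)configure# cib commit working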

Saving the configuration to a file, modifying the file, and loading just the updates:

# crm configure show > /tmp/crm.xml
# vi /tmp/crm.xml
# crm configure load update /tmp/crm.xml

To get a list of all of the OCF resource agents provided by Pacemaker and Heartbeat:

# crm ra list ocf heartbeat
# crm ra list ocf pacemaker
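
To see the parameters, defaults, and suggested operation timeouts for a specific agent:

# crm ra meta ocf:heartbeat:IPaddr2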

To see which host a resource is running on:

# crm resource status ClusterIP
resource ClusterIP is running on: server0

To clear the fail count on a WebSite resource:

# crm resource cleanup WebSite
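
To look at the fail count itself before clearing it (assuming a node named server0, as in the examples above):

# crm resource failcount WebSite show server0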

Cluster configuration

On OpenSUSE servers OCF resource definitions can be found in /usr/lib/ocf/resource.d/.
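
For example, to list the agents shipped by the heartbeat provider:

# ls /usr/lib/ocf/resource.d/heartbeat/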

Pacemaker resources are configured using the crm tool.

To disable STONITH on a two-node cluster:

# crm configure property stonith-enabled=false
# crm_verify -L -V

Adding a second IP address resource to the cluster:

# crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.97" cidr_netmask="24" \
        op monitor interval="30" timeout="25" start-delay="0"

To turn the quorum policy OFF for two-node clusters:

# crm configure property no-quorum-policy=ignore
# crm configure show

Adding an Apache resource with https enabled:

# crm configure primitive WebSite ocf:heartbeat:apache \
        params configfile="/etc/apache2/httpd.conf" options="-DSSL" \
        statusurl="http://127.0.0.1/server-status/" \
        operations $id="WebSite-operations" \
        op start interval="0" timeout="40" \
        op stop interval="0" timeout="60" \
        op monitor interval="60" timeout="120" start-delay="0" \
        meta target-role="Started"

Note that the Apache mod_status module must be loaded and configured so that http://127.0.0.1/server-status/ returns the server status page. Without it Pacemaker cannot tell that Apache is actually running, so it will try to fail over to the other node, where the same check will fail again.
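
A minimal mod_status stanza for this (a sketch for the Apache 2.2-era configuration used above; on OpenSUSE the module itself is typically enabled by adding “status” to APACHE_MODULES in /etc/sysconfig/apache2):

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>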

Adding a DRBD resource (based on http://www.drbd.org/users-guide-emb/s-pacemaker-crm-drbd-backed-service.html):

# crm configure primitive VolumeDRBD ocf:linbit:drbd \
        params drbd_resource="drbd_resource_name" \
        operations $id="VolumeDRBD-operations" \
        op start interval="0" timeout="240" \
        op promote interval="0" timeout="90" \
        op demote interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        op monitor interval="40" timeout="60" start-delay="0" \
        op notify interval="0" timeout="90" \
        meta target-role="started"
# crm configure primitive FileSystemDRBD ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/srv/src" fstype="ext3" \
        fast_stop="no" \
        operations $id="FileSystemDRBD-operations" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60" \
        op monitor interval="40" timeout="60" start-delay="0" \
        op notify interval="0" timeout="60"
# crm configure ms MasterDRBD VolumeDRBD \
        meta clone-max="2" notify="true" target-role="started"

Note that the resource name drbd_resource="drbd_resource_name" must match the name defined in /etc/drbd.conf, and that DRBD should NOT be started by init; you want to let Pacemaker control starting and stopping DRBD. To make sure DRBD isn’t started by init, use:

# chkconfig drbd off
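
For reference, a minimal sketch of the matching stanza in /etc/drbd.conf; the backing disk and IP addresses below are placeholders, not values taken from the configuration above:

resource drbd_resource_name {         # must match drbd_resource in the VolumeDRBD primitive
        protocol C;
        device    /dev/drbd0;         # must match the FileSystemDRBD device
        disk      /dev/sda7;          # placeholder backing device
        meta-disk internal;
        on server0 {
                address 192.168.1.10:7788;   # placeholder address
        }
        on server1 {
                address 192.168.1.11:7788;   # placeholder address
        }
}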

Telling Pacemaker that the floating IP address, DRBD primary, and Apache all have to be on the same node:

# crm configure group Cluster ClusterIP FileSystemDRBD WebSite \
        meta target-role="Started"
# crm configure colocation WebServerWithIP inf: Cluster MasterDRBD:Master
# crm configure order StartFileSystemFirst inf: MasterDRBD:promote Cluster:start
