Sanlock Disk Leases on a DRBD/GFS2 Volume

Hi all,

I have a two-node cluster that consists of the following:

* One DRBD/GFS2 partition that holds the VM images and XML definitions
* Sanlock, configured with its disk lease directory on the same DRBD/GFS2 partition

Everything is working well, aside from one issue I ran into while testing fencing. On one particular test, GFS2 began replaying the journal on the remaining node, and in the middle of the replay rgmanager attempted to recover the VM. Normally this wouldn’t be an issue, as libvirt would just pause until GFS2 was ready. However, since libvirt talks to sanlock first, sanlock attempted to acquire the lock while GFS2 was not ready and failed, which caused the recovery itself to fail.

I’m attempting to keep the lease directory on the shared storage so that I do not have to introduce another single point of failure in the cluster by relying on an outside NFS mount. It seems like I could get around this particular scenario by changing the recovery policy to “restart” (it’s on “relocate” right now) and having it try restarts several times before giving up, but I wanted to see first whether you had any advice on this issue. Perhaps I’m missing a setting that would correct this?
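
For reference, the policy change I have in mind would look roughly like the fragment below in cluster.conf. This is an untested sketch, not my actual config: the VM name `guest1` and the numeric values are placeholders, the other attributes of the resource are left out, and I’m assuming the `max_restarts` and `restart_expire_time` attributes of the rgmanager `<vm>` resource behave as I understand them (a limit on local restart attempts, and the number of seconds after which an attempt stops counting against that limit).

```xml
<rm>
  <!-- Hypothetical example: "guest1" and the values below are placeholders. -->
  <!-- recovery="restart" asks rgmanager to retry the VM on the same node
       instead of relocating it straight away. -->
  <!-- max_restarts / restart_expire_time (assumed semantics): tolerate up to
       3 restarts, each one forgotten after 300 seconds. -->
  <vm name="guest1"
      recovery="restart"
      max_restarts="3"
      restart_expire_time="300"/>
</rm>
```

The idea is only to give GFS2 time to finish the journal replay before sanlock is asked for the lease again. I don’t know whether the gap between restart attempts is long enough for that, which is part of why I’m asking.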