C6, Drbd And File Systems


I have a pair of CentOS 6 systems with a rather large RAID that’s DRBD-replicated from box 1 to box 2; box 1 mounts it as /data.

When box 1 reboots, /data doesn’t get mounted, but the DRBD replication starts up just fine. The entry in fstab is:

    /dev/drbd0 /data xfs inode64 1 0

Once the system is booted up, if I `mount /data`, it mounts just fine.

I’m not using any sort of heartbeat or other HA management software. If box 1 fails, I’ll manually configure box 2 to take over after physically disabling box 1 so it can’t wake up as a zombie.
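
For reference, a rough sketch of what that manual failover would look like on box 2, assuming the DRBD resource is called r0 (a placeholder name, not taken from the actual configuration):

    # on box 2, after box 1 has been fenced / powered off
    drbdadm primary r0            # promote the formerly-secondary node
    mount /dev/drbd0 /data        # or `mount /data` if the same fstab entry exists there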

9 thoughts on - C6, Drbd And File Systems

  • Hi,

    Are you using SELinux? If so, does the context for /dev/drbd0 match on both systems?
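
    A quick way to check that, if SELinux is in the picture (a diagnostic sketch, not something confirmed in the thread):

        getenforce            # Enforcing / Permissive / Disabled
        ls -Z /dev/drbd0      # show the SELinux context of the device node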

  • AFAIK, this has nothing to do with the DRBD slave. I reboot the master, replication resumes just fine, but the /data filesystem doesn’t get automounted until I manually mount it.

  • Indeed, it appears the system tries to mount local file systems BEFORE DRBD is started. Because /data isn’t mounted, BackupPC can’t start either.

    So do I have to edit the chkconfig priorities in /etc/init.d and recreate the rc3.d S** and K** symlinks so DRBD starts earlier? Ugh. (A way to check the current ordering is sketched after the boot log below.)

    # cat /var/log/boot.log
    Welcome to CentOS
    Starting udev: [ OK ]
    Setting hostname hostname.mydomain.com: [ OK ]
    Setting up Logical Volume Management: 5 logical volume(s) in volume group “vg_sysdata” now active
    3 logical volume(s) in volume group “vg_sys” now active
    [ OK ]
    Checking filesystems
    /dev/mapper/vg_sys-lv_root: clean, 60004/3276800 files, 4432672/13107200 blocks
    /dev/sdb1: clean, 66/128016 files, 198283/512000 blocks
    /dev/mapper/vg_sys-lv_home: clean, 593694/129236992 files, 235674982/516925440 blocks
    [ OK ]
    Remounting root filesystem in read-write mode: [ OK ]
    Mounting local filesystems: mount: special device /dev/drbd0 does not exist
    [FAILED]
    Enabling local filesystem quotas: [ OK ]
    Enabling /etc/fstab swaps: [ OK ]
    Entering non-interactive startup
    Calling the system activity data collector (sadc)…
    Starting monitoring for VG vg_sys: 3 logical volume(s) in volume group “vg_sys” monitored
    [ OK ]
    Starting monitoring for VG vg_sysdata: 5 logical volume(s) in volume group “vg_sysdata” monitored
    [ OK ]
    Bringing up loopback interface: [ OK ]
    Bringing up interface eth0:
    Determining IP information for eth0… done.
    [ OK ]
    Starting auditd: [ OK ]
    Starting system logger: [ OK ]
    Enabling ondemand cpu frequency scaling: [ OK ]
    Starting irqbalance: [ OK ]
    Starting rpcbind: [ OK ]
    Starting NFS statd: [ OK ]
    Starting system message bus: [ OK ]
    Starting Avahi daemon… [ OK ]
    NFS filesystems queued to be mounted
    Mounting filesystems: mount: special device /dev/drbd0 does not exist
    [FAILED]
    Starting acpi daemon: [ OK ]
    Starting HAL daemon: [ OK ]
    Starting lm_sensors: loading module ipmi-si adm1021 coretemp [ OK ]
    Retrigger failed udev events [ OK ]

    Rebuilding /boot/initrd-2.6.32-573.22.1.el6.x86_64kdump.img
    Warning: There might not be enough space to save a vmcore.
    The size of UUID=88bad6-ceb0-4984-bf3d-c57666603495 should be greater than 49416020 kilo bytes.

    Starting DRBD resources: [ d(main) s(main) n(main) ].

    Starting BackupPC: 2016-05-03 09:57:14 Can’t create a test hardlink between a file in /var/lib/BackupPC//pc and /var/lib/BackupPC//cpool.
    Either these are different file systems, or this file system doesn’t support hardlinks, or these directories don’t exist, or there is a permissions problem, or the file system is out of inodes or full. Use df, df -i, and ls -ld to check each of these possibilities. Quitting…
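
    A way to see where drbd sits relative to netfs in the EL6 boot ordering (a diagnostic sketch; the actual priorities on this box aren’t shown in the thread):

        chkconfig --list drbd                   # runlevels drbd is enabled in
        ls /etc/rc3.d/ | egrep 'drbd|netfs'     # the S##/K## symlinks that determine start order
        grep '^# chkconfig:' /etc/init.d/drbd   # the start/stop priorities shipped in the init script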

  • You might want to try adding _netdev to the options for /data … although you’re still likely to hit the race condition on EL6 … This is much simpler on EL7.

    A custom init script to check the environment and only mount when ready (or rc.local for initial testing) … Set the options to noauto so you can still have it listed in fstab to make life easier (a rough sketch follows below).

    The alternative is to add the pacemaker framework and let the cluster stuff handle it for you.
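
    A minimal sketch of that noauto-plus-rc.local approach (the xfs options simply mirror the poster’s original inode64 entry):

        # /etc/fstab -- keep the entry, but stop the boot-time mount attempt
        /dev/drbd0  /data  xfs  noauto,inode64  1 0

        # /etc/rc.d/rc.local -- wait for the DRBD device to appear, then mount it
        for i in $(seq 1 30); do
            if [ -b /dev/drbd0 ]; then
                mount /data       # picks up the options from the fstab entry
                break
            fi
            sleep 2
        done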

  • You could also use pacemaker to manage promoting the DRBD device and mounting it, with the DRBD master role as a dependency. Are you using anything to automatically promote the slave already?
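
    A rough crm-shell sketch of that pattern, assuming a DRBD resource named r0 (all resource names here are made up for illustration):

        # DRBD resource plus its master/slave wrapper
        primitive p_drbd_r0 ocf:linbit:drbd \
            params drbd_resource=r0 \
            op monitor interval=29s role=Master \
            op monitor interval=31s role=Slave
        ms ms_drbd_r0 p_drbd_r0 \
            meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
        # the filesystem, tied to whichever node holds the Master role
        primitive p_fs_data ocf:heartbeat:Filesystem \
            params device=/dev/drbd0 directory=/data fstype=xfs
        colocation c_fs_on_drbd inf: p_fs_data ms_drbd_r0:Master
        order o_drbd_before_fs inf: ms_drbd_r0:promote p_fs_data:start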

  • I’m not using anything; this is a disaster-recovery sort of scenario. The master is a backup server (BackupPC), and the slave is an offsite backup of the backups. If the master fails, the plan was to manually bring the slave up after manually fencing the master.

  • If you manage your server with something like Puppet, you could configure it to ensure the file system is mounted.

    Or use cron to check periodically.
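
    For the cron idea, something along these lines would work (a sketch; the five-minute interval and the use of /etc/cron.d are just assumptions):

        # /etc/cron.d/mount-data -- mount /data once the DRBD device exists and it isn't already mounted
        */5 * * * * root  [ -b /dev/drbd0 ] && ! mountpoint -q /data && mount /data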