EL9/udev Generates Wrong Device Nodes/symlinks With HPE Smart Array Controller

Hi,

I see some strange and dangerous things happening on an HPE server with an HPE
Smart Array controller, where EL9 ends up with wrong device nodes/symlinks to the attached disks/RAID volumes:

(I didn’t touch anything here but at 08:09 some symlinks were changed)
/dev/disk/by-id/:
lrwxrwxrwx 1 root root 9 Mar 1 07:57 scsi-0HP_LOGICAL_VOLUME_00000000 -> ../../sdc
lrwxrwxrwx 1 root root 10 Mar 1 07:57 scsi-0HP_LOGICAL_VOLUME_00000000-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Mar 1 07:57 scsi-0HP_LOGICAL_VOLUME_00000000-part2 -> ../../sdc2
lrwxrwxrwx 1 root root 9 Mar 1 07:57 scsi-0HP_LOGICAL_VOLUME_01000000 -> ../../sdb
lrwxrwxrwx 1 root root 9 Mar 1 08:09 scsi-0HP_LOGICAL_VOLUME_02000000 -> ../../sda
lrwxrwxrwx 1 root root 9 Mar 1 07:57 scsi-0HP_LOGICAL_VOLUME_03000000 -> ../../sdd
lrwxrwxrwx 1 root root 9 Mar 1 08:09 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0 -> ../../sda
lrwxrwxrwx 1 root root 10 Mar 1 07:57 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Mar 1 07:57 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2 -> ../../sdc2

/dev/disk/by-path/:
lrwxrwxrwx 1 root root 9 Mar 1 07:57 pci-0000:03:00.0-scsi-0:1:0:0 -> ../../sdc
lrwxrwxrwx 1 root root 10 Mar 1 07:57 pci-0000:03:00.0-scsi-0:1:0:0-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Mar 1 07:57 pci-0000:03:00.0-scsi-0:1:0:0-part2 -> ../../sdc2
lrwxrwxrwx 1 root root 9 Mar 1 07:57 pci-0000:03:00.0-scsi-0:1:0:1 -> ../../sdb
lrwxrwxrwx 1 root root 9 Mar 1 08:09 pci-0000:03:00.0-scsi-0:1:0:2 -> ../../sda
lrwxrwxrwx 1 root root 9 Mar 1 07:57 pci-0000:03:00.0-scsi-0:1:0:3 -> ../../sdd

After rebooting, things are different but still wrong:

(here nothing has changed after boot but symlinks are already wrong)
/dev/disk/by-id/:
lrwxrwxrwx 1 root root 9 Mar 1 10:56 scsi-0HP_LOGICAL_VOLUME_00000000 -> ../../sdb
lrwxrwxrwx 1 root root 10 Mar 1 10:56 scsi-0HP_LOGICAL_VOLUME_00000000-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Mar 1 10:56 scsi-0HP_LOGICAL_VOLUME_00000000-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 9 Mar 1 10:56 scsi-0HP_LOGICAL_VOLUME_01000000 -> ../../sda
lrwxrwxrwx 1 root root 9 Mar 1 10:56 scsi-0HP_LOGICAL_VOLUME_02000000 -> ../../sdd
lrwxrwxrwx 1 root root 9 Mar 1 10:56 scsi-0HP_LOGICAL_VOLUME_03000000 -> ../../sdc
lrwxrwxrwx 1 root root 9 Mar 1 10:56 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0 -> ../../sda
lrwxrwxrwx 1 root root 10 Mar 1 10:56 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Mar 1 10:56 scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2 -> ../../sdb2

/dev/disk/by-path/:
lrwxrwxrwx 1 root root 9 Mar 1 10:56 pci-0000:03:00.0-scsi-0:1:0:0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Mar 1 10:56 pci-0000:03:00.0-scsi-0:1:0:0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Mar 1 10:56 pci-0000:03:00.0-scsi-0:1:0:0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 9 Mar 1 10:56 pci-0000:03:00.0-scsi-0:1:0:1 -> ../../sda
lrwxrwxrwx 1 root root 9 Mar 1 10:56 pci-0000:03:00.0-scsi-0:1:0:2 -> ../../sdd
lrwxrwxrwx 1 root root 9 Mar 1 10:56 pci-0000:03:00.0-scsi-0:1:0:3 -> ../../sdc

Note that two things are strange:

1) the /dev/sd* nodes are in a random order after every restart.
# lsscsi
[1:0:0:0] storage HP P410i 6.64 -
[1:1:0:0] disk HP LOGICAL VOLUME 6.64 /dev/sdb
[1:1:0:1] disk HP LOGICAL VOLUME 6.64 /dev/sda
[1:1:0:2] disk HP LOGICAL VOLUME 6.64 /dev/sdd
[1:1:0:3] disk HP LOGICAL VOLUME 6.64 /dev/sdc

2) some symlinks created by udev are just wrong and therefore very dangerous to use:
scsi-SHP_LOGICAL_VOLUME_500143801722C0B0 -> ../../sda
scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1 -> ../../sdb1
scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2 -> ../../sdb2

While 1 may be expected(???) I think 2 should really not happen.
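
The second problem can be spotted mechanically. The following is only a minimal sketch, not a fix: it takes symlink names and targets as printed by `ls -l` and reports every `-partN` link whose target is not on the disk that the matching whole-disk link points to. The function names are made up for illustration, and the sketch assumes sdX-style device names (NVMe-style names such as nvme0n1p1 are not handled).

```python
import re


def parent_disk(target):
    """Return the whole-disk name for a target such as '../../sdc1' -> 'sdc'."""
    name = target.rsplit("/", 1)[-1]
    # Assumption: sdX-style names, where the partition number is a numeric suffix.
    return re.sub(r"\d+$", "", name)


def find_mismatched_partitions(links):
    """links maps symlink name -> target, as shown by `ls -l`.

    Returns (partition link, its target, whole-disk link target) for every
    partition link that points to a different disk than its whole-disk link.
    """
    mismatches = []
    for name, target in sorted(links.items()):
        base, sep, _part = name.rpartition("-part")
        if not sep or base not in links:
            continue  # not a partition link, or no whole-disk link to compare with
        if parent_disk(target) != parent_disk(links[base]):
            mismatches.append((name, target, links[base]))
    return mismatches


# A subset of the by-id entries after the reboot, copied from the listing above.
links = {
    "scsi-0HP_LOGICAL_VOLUME_00000000": "../../sdb",
    "scsi-0HP_LOGICAL_VOLUME_00000000-part1": "../../sdb1",
    "scsi-0HP_LOGICAL_VOLUME_00000000-part2": "../../sdb2",
    "scsi-SHP_LOGICAL_VOLUME_500143801722C0B0": "../../sda",
    "scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part1": "../../sdb1",
    "scsi-SHP_LOGICAL_VOLUME_500143801722C0B0-part2": "../../sdb2",
}

for name, target, disk_target in find_mismatched_partitions(links):
    print(f"{name} -> {target}, but the whole-disk link -> {disk_target}")
```

With the entries above, only the two scsi-SHP_... partition links are reported; the scsi-0HP_... links are consistent with each other.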

I’ve tried to find out where things go wrong but the whole udev stuff started to hurt my brain :)

I’m quite sure HPE Smart Array based servers are quite common, so my big question is: do others see the same?

While it’s possible to live with this mess I’d really like to fix it somehow.

Thanks, Simon

3 thoughts on - EL9/udev Generates Wrong Device Nodes/symlinks With HPE Smart Array Controller

  • Simon Matter

    I think it may be caused by asynchronous scanning in the sd driver.
    I am lucky that I haven’t seen this before. NVMe may have similar issues, but NVMe has a boot parameter to avoid it.
    SUSE has a boot parameter to avoid it.
    With EL9 we will have to wait until EL 9.3, if we are lucky.
    I have reported the issue: https://bugzilla.redhat.com/show_bug.cgi?id=2140017

  • Hi,

    Thanks for confirming that I’m not alone with this “feature”.

    In the above example, it gets really fun if you want to wipe the two partitions on
    /dev/disk/by-id/scsi-SHP_LOGICAL_VOLUME_500143801722C0B0 and then wipe the device itself: you end up wiping the wrong disk!

    When I see such things my blood starts boiling :(

    Regards, Simon
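
The wrong-disk scenario from the last reply could be guarded against before running anything destructive. The sketch below is one possible check, not something proposed in the thread: it asks sysfs which disk each partition really belongs to instead of trusting the link names. It assumes the usual Linux sysfs layout, where a partition's directory sits inside its disk's directory and contains a `partition` attribute. The function names and the `sysfs_root` parameter are hypothetical, the latter added only so the check can be tried against a test directory.

```python
import os


def kernel_parent_disk(dev_name, sysfs_root="/sys/class/block"):
    """Return the disk that the kernel says dev_name (e.g. 'sdb1') belongs to.

    A whole disk has no 'partition' attribute and is returned unchanged.
    """
    real = os.path.realpath(os.path.join(sysfs_root, dev_name))
    if os.path.exists(os.path.join(real, "partition")):
        # The partition directory is nested inside the disk directory.
        return os.path.basename(os.path.dirname(real))
    return dev_name


def partitions_match_disk(disk_link, sysfs_root="/sys/class/block"):
    """True only if every '<disk_link>-partN' link resolves to a partition of
    the disk that disk_link itself resolves to.

    disk_link must be a full path, e.g. /dev/disk/by-id/<name>.
    """
    disk = os.path.basename(os.path.realpath(disk_link))
    directory, prefix = os.path.split(disk_link)
    prefix += "-part"
    for entry in os.listdir(directory):
        if entry.startswith(prefix):
            resolved = os.path.realpath(os.path.join(directory, entry))
            part = os.path.basename(resolved)
            if kernel_parent_disk(part, sysfs_root) != disk:
                return False
    return True


if __name__ == "__main__":
    import sys

    # Usage: pass one or more full by-id paths on the command line.
    for link in sys.argv[1:]:
        ok = partitions_match_disk(link)
        print(link, "consistent" if ok else "INCONSISTENT, do not wipe")
```

A wipe script would call `partitions_match_disk()` on the by-id path first and refuse to continue when it returns False. The check only detects the inconsistency; it does not say which of the links is the correct one.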