How To Know Hardware RAID Failure

Hi,

We have an old Dell 2850 with 4 SCSI drives, on which we set up a hardware RAID5 across 3 disks and left 1 as a spare. I don't remember exactly, but the RAID was set up in the BIOS, and when installing CentOS 6, I just saw “1 drive”.

If it were software RAID, a disk failure could be seen in /proc/mdstat, and I could simulate a failure with mdadm.

Since it is hardware RAID at the moment, how can I at least find out the state of the RAID array (syncing, healthy, …)?

Thank you.

7 thoughts on - How To Know Hardware RAID Failure

  • The IPMI controller, if you have one, or the Dell diagnostic tools such as OMSA. MegaCli can report these things back to you.

    James A. Peltier, Manager, IT Services, Research Computing Group, Simon Fraser University
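As a rough sketch of the MegaCli suggestion: the function below reads the text of a logical-drive listing on stdin and prints every virtual drive whose state is not "Optimal". It assumes the usual "Virtual Drive: N" and "State : ..." lines that `MegaCli -LDInfo -Lall -aALL` prints on LSI-based controllers; the function name is made up for illustration, the binary may be installed as MegaCli or MegaCli64, and an older PERC 4 may not be supported by MegaCli at all.

```shell
# Hypothetical helper: list virtual drives whose state is anything but Optimal.
# Input: text in the format printed by "MegaCli -LDInfo -Lall -aALL" (assumed).
non_optimal_virtual_drives() {
    awk -F':' '
        /^Virtual Drive:/ { vd = $0 }            # remember which drive we are in
        /^State[ \t]*:/ {
            state = $2
            gsub(/^[ \t]+|[ \t]+$/, "", state)   # trim surrounding whitespace
            if (state != "Optimal") print vd " -> " state
        }
    '
}

# Typical use on a host with MegaCli installed (binary name varies):
#   MegaCli -LDInfo -Lall -aALL | non_optimal_virtual_drives
```

No output means every virtual drive reported itself as Optimal, which makes the function easy to call from cron and mail only when something is printed.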

  • Mihamina Rakotomandimby wrote:
    Um, no. I don’t *think* the older Dells had the Intel fakeraid; if you did it down there, does it have a PERC controller? If so, you need to look in there.

    mark

  • I’m pretty sure it’s a PERC4 controller in those models, but don’t hold me to that.

    Run lspci and find out for certain what CentOS 6.x says your RAID controller is.

    James and Todor are spot on.

    IPMI will indicate failed drives (“service ipmi start; chkconfig ipmi on; ipmitool sel list” from your shell, and “ipmitool sel clear” to clear the IPMI log), and OpenManage Server Administrator (OMSA) gives you a web GUI
    (https://
    If you have a network monitoring system set up, you might look for a script (such as this one for Nagios [0], which requires the LSI daemons to be running).

    [0]
    http://exchange.nagios.org/directory/Plugins/Hardware/Storage-Systems/RAID-Controllers/check_snmp_raid--2F-check_sasraid_megaraid/details
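The "ipmitool sel list" check mentioned above can be narrowed to drive events with a small filter. This is only a sketch: it assumes the common pipe-separated layout of the SEL listing, where the fourth column names the sensor, and the function name is invented here. How drive faults are labelled depends on the BMC, and drives behind a RAID controller may not be reported in the SEL at all.

```shell
# Hypothetical helper: keep only SEL entries whose sensor column (4th
# "|"-separated field) mentions a drive. Reads the listing from stdin.
sel_drive_events() {
    awk -F'|' 'tolower($4) ~ /drive/'
}

# Typical use once the ipmi service is running:
#   ipmitool sel list | sel_drive_events
```

Reading from stdin rather than calling ipmitool inside the function means the same filter also works on a saved copy of the log.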

  • I have a different system, an HP ProLiant ML110:

    00:1f.2 RAID bus controller: Intel Corporation 82801GR/GDH (ICH7R/ICH7DH) SATA Controller [RAID mode] (rev 01)

    Any hints for me?
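A line like the one quoted above can be picked out of the full lspci listing with a simple filter. The sketch below matches on the words "RAID" or "SCSI" in the device description, which is an assumption about how the controller is labelled; the function name is illustrative only.

```shell
# Hypothetical helper: keep only lspci lines that look like RAID or SCSI
# storage controllers. Reads lspci-style text from stdin.
find_raid_controllers() {
    grep -iE 'raid|scsi'
}

# Typical use:
#   lspci | find_raid_controllers
```

A controller in plain AHCI or IDE mode will not match, so an empty result does not by itself prove there is no RAID-capable hardware.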

  • Donkey Hottie wrote:

    Do you have it set up with a RAID? If not, and you want one, you have, unfortunately, I think, Intel “fakeRAID” (yes, that’s googleable, easily). I was going to use it a couple of years ago, and after the aggro I had, gave up, turned it off, and set up Linux’s software RAID (md), and it works *very* well. In fact, a couple months after I turned it up on one user’s servers, a disk failed. I had no trouble identifying the disk, breaking the mirror, replacing it, and it rebuilt it nicely.

    mark
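For the software RAID (md) route described above, a failed member shows up in /proc/mdstat as an underscore in the status brackets, for example [U_] instead of [UU]. The following sketch prints the names of arrays in that state; the function name is hypothetical, and it assumes the usual layout where the status line follows the "mdN : ..." line.

```shell
# Hypothetical helper: print the name of every md array whose member map
# (e.g. [U_]) contains an underscore, meaning a member is missing or failed.
# Reads /proc/mdstat-style text from stdin.
degraded_md_arrays() {
    awk '
        /^md[^ ]* :/ { name = $1 }               # start of an array block
        /\[[U_]*_[U_]*\]/ && name != "" {        # member map with a gap
            print name
            name = ""                            # report each array once
        }
    '
}

# Typical use:
#   degraded_md_arrays < /proc/mdstat
```

Once an array is listed, "mdadm --detail" on that device shows which member is faulty.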
