I started receiving messages like this a few days ago on one of my servers:
Message from syslogd@ at Mon Apr 29 08:02:55 2013 … server1 kernel: EDAC MC0: UE row 0, channel-a= 0 channel-b= 1 labels "-":
(Branch=0 DRAM-Bank=0 RDWR=Read RAS=0 CAS=0, UE Err=0x2 (Aliased Uncorrectable Non-Mirrored Demand Data ECC))
I’ve never had ECC memory fail on me before, so now I am wondering the following:
* The server is running CentOS 5.7 and is acting as Xen dom0. Is there any possibility this could be a kernel issue and upgrading would help, or would upgrading at this point just cause more trouble?
* Is there now a possibility that my data can get corrupted: should I shut down the server as soon as possible, or can I keep running until I replace the memory modules?
* This server has been running for several years in a datacenter without problems: in your experience, is this kind of problem more likely caused by a failing motherboard or by the memory modules?
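
For anyone who wants to pull the row and channel numbers out of messages like the one above, here is a minimal Python sketch. The regular expressions and the `parse_edac` helper are my own assumptions, fitted only to the single UE message quoted in this post (with the label quotes written as plain `"`); other EDAC drivers or corrected-error (CE) messages may use a different layout.

```python
import re

# The UE message from this post, with its two lines joined into one string.
SAMPLE = (
    'kernel: EDAC MC0: UE row 0, channel-a= 0 channel-b= 1 labels "-": '
    "(Branch=0 DRAM-Bank=0 RDWR=Read RAS=0 CAS=0, UE Err=0x2 "
    "(Aliased Uncorrectable Non-Mirrored Demand Data ECC))"
)

# Hypothetical patterns: they match only the message layout shown above.
HEADER = re.compile(
    r"EDAC MC(?P<mc>\d+): UE row (?P<row>\d+), "
    r"channel-a= (?P<chan_a>\d+) channel-b= (?P<chan_b>\d+)"
)
DETAIL = re.compile(r"(Branch|DRAM-Bank|RDWR|RAS|CAS)=(\w+)")


def parse_edac(line):
    """Return the fields of a UE message as a dict of strings, or None."""
    match = HEADER.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # findall yields (key, value) pairs such as ("RDWR", "Read").
    fields.update(DETAIL.findall(line))
    return fields


if __name__ == "__main__":
    print(parse_edac(SAMPLE))
```

Running it over saved syslog lines would show whether the errors keep landing on the same memory controller, row and channel pair, which is the information needed to work out which modules are involved.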