CentOS 6.6, Apparent Xfs Corruption


Hi all –
After several months of worry-free operation, we received the following kernel messages about an XFS filesystem running under CentOS 6.6. The proximate causes appear to be “Internal error xfs_trans_cancel” and “Corruption of in-memory data detected. Shutting down filesystem”. The filesystem is back up and mounted, and it appears to be working OK underneath a Splunk datastore. Does anyone have suggestions for diagnosis, or know of related problems? Many thanks... Nick Geo

Sep 18 20:35:15 gries kernel: XFS (dm-2): Internal error xfs_trans_cancel at line 1948 of file fs/xfs/xfs_trans.c. Caller 0xffffffffa01f1388
Sep 18 20:35:15 gries kernel:
Sep 18 20:35:15 gries kernel: Pid: 24005, comm: splunkd Not tainted 2.6.32-504.8.1.el6.x86_64 #1
Sep 18 20:35:15 gries kernel: Call Trace:
Sep 18 20:35:15 gries kernel: [] ? xfs_error_report+0x3f/0x50 [xfs]
Sep 18 20:35:15 gries kernel: [] ? xfs_rename+0x2d8/0x720 [xfs]
Sep 18 20:35:15 gries kernel: [] ? xfs_trans_cancel+0xf5/0x120 [xfs]
Sep 18 20:35:15 gries kernel: [] ? xfs_rename+0x2d8/0x720 [xfs]
Sep 18 20:35:15 gries kernel: [] ? __do_fault+0x469/0x530
Sep 18 20:35:15 gries kernel: [] ? xfs_vn_rename+0x66/0x70 [xfs]
Sep 18 20:35:15 gries kernel: [] ? vfs_rename+0x419/0x480
Sep 18 20:35:15 gries kernel: [] ? sys_renameat+0x309/0x3a0
Sep 18 20:35:15 gries kernel: [] ? _atomic_dec_and_lock+0x55/0x80
Sep 18 20:35:15 gries kernel: [] ? mntput_no_expire+0x30/0x110
Sep 18 20:35:15 gries kernel: [] ? audit_syscall_entry+0x1d7/0x200
Sep 18 20:35:15 gries kernel: [] ? sys_rename+0x1b/0x20
Sep 18 20:35:15 gries kernel: [] ? system_call_fastpath+0x16/0x1b
Sep 18 20:35:15 gries kernel: XFS (dm-2): xfs_do_force_shutdown(0x8) called from line 1949 of file fs/xfs/xfs_trans.c. Return address 0xffffffffa01f2e6e
Sep 18 20:35:15 gries kernel: XFS (dm-2): Corruption of in-memory data detected. Shutting down filesystem
Sep 18 20:35:15 gries kernel: XFS (dm-2): Please umount the filesystem and rectify the problem(s)
Sep 18 20:35:27 gries kernel: XFS (dm-2): xfs_log_force: error 5 returned.
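
For anyone sifting a longer log for the same pattern, here is a rough Python sketch that pulls out only the XFS lines and translates any "error N returned" number into its errno name (error 5 is EIO, an I/O error). The function name summarize_xfs_messages and the regular expressions are my own illustration, not part of any XFS tooling, and they assume the syslog line layout shown above.

```python
import errno
import os
import re

# Assumed line layout (taken from the messages above):
#   "<date> <host> kernel: XFS (<device>): <message>"
XFS_LINE = re.compile(r"kernel: XFS \((?P<dev>[^)]+)\): (?P<msg>.*)$")
ERROR_NUM = re.compile(r"error (\d+) returned")


def summarize_xfs_messages(lines):
    """Return (device, message) pairs for XFS kernel lines.

    Messages of the form "error N returned" get the errno name appended.
    Hypothetical helper for illustration only.
    """
    events = []
    for line in lines:
        match = XFS_LINE.search(line)
        if not match:
            continue  # skip call-trace and other non-XFS lines
        msg = match.group("msg").strip()
        err = ERROR_NUM.search(msg)
        if err:
            code = int(err.group(1))
            name = errno.errorcode.get(code, "unknown")
            msg = f"{msg} [{name}: {os.strerror(code)}]"
        events.append((match.group("dev"), msg))
    return events


if __name__ == "__main__":
    sample = [
        "Sep 18 20:35:15 gries kernel: XFS (dm-2): Corruption of in-memory data detected. Shutting down filesystem",
        "Sep 18 20:35:15 gries kernel: Call Trace:",
        "Sep 18 20:35:27 gries kernel: XFS (dm-2): xfs_log_force: error 5 returned.",
    ]
    for dev, msg in summarize_xfs_messages(sample):
        print(dev, "|", msg)
```

In practice the lines would come from a file such as /var/log/messages rather than a hard-coded list; the sample list is only there to keep the sketch self-contained.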

3 thoughts on - CentOS 6.6, Apparent Xfs Corruption

  • I think you need to read this from the bottom up:

    “Corruption of in-memory data detected. Shutting down filesystem”
    so XFS calls xfs_do_force_shutdown to shut down the filesystem. The call comes from fs/xfs/xfs_trans.c, which fails and so reports
    “Internal error xfs_trans_cancel”.

    In other words, I would look at the memory corruption first. This
    _could_ be a kernel problem, but I would suggest starting with an extended memory check; it smells to me like a failing chip.

    Just my 2d worth!

    Martin


  • James Peltier wrote:
    nobarrier, etc?

    None.

    e?

    There are 2 xfs filesystems:

    /dev/mapper/vg_gries01-LogVol00 3144200 1000428 2143773 32% /opt/splunk
    /dev/mapper/vg_gries00-LogVol00 307068 267001 40067 87% /opt/splunk/hot

    You’ll notice that the larger one just crossed the 1TB boundary in used space.

    Thanks…..Nick Geovanis