Rare But Repeating System Crash In C7

Hi all, I’m hoping someone can help me figure this out.

Every now and then (less than monthly, maybe every 2-4 months or so), I’ll walk up to my C7 box (my home PC) in the morning and wiggle the mouse to wake up the screen. After a second or so, instead of a live screen, the keyboard shift-lock and scroll-lock keys light up. If I wait a few tens of seconds, I find it is rebooting, as the BIOS splash screen appears. It boots normally and comes up with everything apparently working fine.

Note that I tend to leave myself logged in 24/7/365 since there’s nobody here except my wife and myself, and she has her own Linux box.

As it happened again this morning, I grabbed some lines from /var/log/messages that show the last few minutes before it rebooted and the first three or four statements as it began to reboot:

Jan 2 08:50:12 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cb10 trb-start 00000000a9f2cb20 trb-end
00000000a9f2cb20 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 08:50:13 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 08:50:13 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000164858d0 trb-start 00000000164858e0 trb-end
00000000164858e0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 08:51:12 fcshome dbus[1192]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Jan 2 08:51:13 fcshome dbus[1192]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Jan 2 08:51:14 fcshome setroubleshoot: SELinux is preventing /usr/sbin/smbd from read access on the sock_file cups.sock. For complete SELinux messages run: sealert -l e4620dcc-6cdc-460d-a8a4-db9ce9624646
Jan 2 08:51:14 fcshome python: SELinux is preventing /usr/sbin/smbd from read access on the sock_file cups.sock.#012#012***** Plugin catchall (100. confidence) suggests **************************#012#012If you believe that smbd should be allowed read access on the cups.sock sock_file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'lpqd' --raw | audit2allow -M my-lpqd#012# semodule -i my-lpqd.pp#012
Jan 2 08:55:11 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 08:55:11 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485d20 trb-start 0000000016485d30 trb-end
0000000016485d30 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 08:55:11 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 08:55:11 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cae0 trb-start 00000000a9f2caf0 trb-end
00000000a9f2caf0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 08:58:00 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 08:58:00 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c530 trb-start 00000000a9f2c540 trb-end
00000000a9f2c540 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 08:59:51 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 08:59:51 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c340 trb-start 00000000a9f2c350 trb-end
00000000a9f2c350 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 08:59:51 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 08:59:51 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cfb0 trb-start 00000000a9f2cfc0 trb-end
00000000a9f2cfc0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:00:02 fcshome systemd: Created slice User Slice of root.
Jan 2 09:00:02 fcshome systemd: Started Session 7364 of user root.
Jan 2 09:00:02 fcshome systemd: Removed slice User Slice of root.
Jan 2 09:00:06 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:00:06 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cd20 trb-start 00000000a9f2cd30 trb-end
00000000a9f2cd30 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:00:59 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:00:59 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485400 trb-start 0000000016485410 trb-end
0000000016485410 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:00:59 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:00:59 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cbc0 trb-start 00000000a9f2cbd0 trb-end
00000000a9f2cbd0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:00:59 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:00:59 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485c00 trb-start 0000000016485c10 trb-end
0000000016485c10 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:01:01 fcshome systemd: Created slice User Slice of root.
Jan 2 09:01:01 fcshome systemd: Started Session 7365 of user root.
Jan 2 09:01:01 fcshome systemd: Removed slice User Slice of root.
Jan 2 09:03:45 fcshome dbus[1192]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Jan 2 09:03:46 fcshome dbus[1192]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Jan 2 09:03:47 fcshome setroubleshoot: SELinux is preventing /usr/sbin/smbd from read access on the sock_file cups.sock. For complete SELinux messages run: sealert -l e4620dcc-6cdc-460d-a8a4-db9ce9624646
Jan 2 09:03:47 fcshome python: SELinux is preventing /usr/sbin/smbd from read access on the sock_file cups.sock.#012#012***** Plugin catchall (100. confidence) suggests **************************#012#012If you believe that smbd should be allowed read access on the cups.sock sock_file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'lpqd' --raw | audit2allow -M my-lpqd#012# semodule -i my-lpqd.pp#012
Jan 2 09:04:31 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:04:31 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2ca60 trb-start 00000000a9f2ca70 trb-end
00000000a9f2ca70 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:04:58 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:04:58 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cd70 trb-start 00000000a9f2cd80 trb-end
00000000a9f2cd80 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:04:59 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:04:59 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485b30 trb-start 0000000016485b40 trb-end
0000000016485b40 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:04:59 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:04:59 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000164856c0 trb-start 00000000164856d0 trb-end
00000000164856d0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:05:03 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:05:03 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000164851b0 trb-start 00000000164851c0 trb-end
00000000164851c0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:10:01 fcshome systemd: Created slice User Slice of root.
Jan 2 09:10:01 fcshome systemd: Started Session 7366 of user root.
Jan 2 09:10:01 fcshome systemd: Removed slice User Slice of root.
Jan 2 09:10:30 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:10:30 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000164854c0 trb-start 00000000164854d0 trb-end
00000000164854d0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:10:30 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:10:30 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485eb0 trb-start 0000000016485ec0 trb-end
0000000016485ec0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:11:55 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:11:55 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000164852c0 trb-start 00000000164852d0 trb-end
00000000164852d0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:11:55 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:11:55 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485b70 trb-start 0000000016485b80 trb-end
0000000016485b80 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:11:56 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:11:56 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cbb0 trb-start 00000000a9f2cbc0 trb-end
00000000a9f2cbc0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:11:56 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:11:56 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000164858d0 trb-start 00000000164858e0 trb-end
00000000164858e0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:11:56 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:11:56 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485a00 trb-start 0000000016485a10 trb-end
0000000016485a10 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:14:46 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:14:46 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485450 trb-start 0000000016485460 trb-end
0000000016485460 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:15:09 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:09 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c170 trb-start 00000000a9f2c180 trb-end
00000000a9f2c180 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:09 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:09 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c7a0 trb-start 00000000a9f2c7b0 trb-end
00000000a9f2c7b0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:10 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:10 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485560 trb-start 0000000016485570 trb-end
0000000016485570 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:15:21 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:21 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485b90 trb-start 0000000016485ba0 trb-end
0000000016485ba0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:15:21 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:21 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cf90 trb-start 00000000a9f2cfa0 trb-end
00000000a9f2cfa0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:22 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:22 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c9e0 trb-start 00000000a9f2c9f0 trb-end
00000000a9f2c9f0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:22 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:22 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485660 trb-start 0000000016485670 trb-end
0000000016485670 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:15:23 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:23 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485330 trb-start 0000000016485340 trb-end
0000000016485340 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:15:24 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:24 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485500 trb-start 0000000016485510 trb-end
0000000016485510 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:15:25 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:25 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c400 trb-start 00000000a9f2c410 trb-end
00000000a9f2c410 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:25 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:25 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000164851c0 trb-start 00000000164851d0 trb-end
00000000164851d0 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:15:26 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:26 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485e30 trb-start 0000000016485e40 trb-end
0000000016485e40 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:15:26 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:26 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cbf0 trb-start 00000000a9f2cc00 trb-end
00000000a9f2cc00 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:26 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:26 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c000 trb-start 00000000a9f2c010 trb-end
00000000a9f2c010 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:27 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:27 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2ce50 trb-start 00000000a9f2ce60 trb-end
00000000a9f2ce60 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:31 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:31 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c4e0 trb-start 00000000a9f2c4f0 trb-end
00000000a9f2c4f0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:15:47 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:15:47 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cbb0 trb-start 00000000a9f2cbc0 trb-end
00000000a9f2cbc0 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:16:15 fcshome dbus[1192]: [system] Activating service name='org.fedoraproject.Setroubleshootd' (using servicehelper)
Jan 2 09:16:16 fcshome dbus[1192]: [system] Successfully activated service 'org.fedoraproject.Setroubleshootd'
Jan 2 09:16:17 fcshome setroubleshoot: SELinux is preventing /usr/sbin/smbd from read access on the sock_file cups.sock. For complete SELinux messages run: sealert -l e4620dcc-6cdc-460d-a8a4-db9ce9624646
Jan 2 09:16:17 fcshome python: SELinux is preventing /usr/sbin/smbd from read access on the sock_file cups.sock.#012#012***** Plugin catchall (100. confidence) suggests **************************#012#012If you believe that smbd should be allowed read access on the cups.sock sock_file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'lpqd' --raw | audit2allow -M my-lpqd#012# semodule -i my-lpqd.pp#012
Jan 2 09:17:11 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:17:11 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 0000000016485b50 trb-start 0000000016485b60 trb-end
0000000016485b60 seg-start 0000000016485000 seg-end 0000000016485ff0
Jan 2 09:17:11 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:17:11 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c050 trb-start 00000000a9f2c060 trb-end
00000000a9f2c060 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:17:11 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:17:11 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c680 trb-start 00000000a9f2c690 trb-end
00000000a9f2c690 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:17:11 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:17:11 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2cb70 trb-start 00000000a9f2cb80 trb-end
00000000a9f2cb80 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:17:21 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:17:21 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2c520 trb-start 00000000a9f2c530 trb-end
00000000a9f2c530 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:17:21 fcshome kernel: xhci_hcd 0000:03:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 8 comp_code 13
Jan 2 09:17:21 fcshome kernel: xhci_hcd 0000:03:00.0: Looking for event-dma 00000000a9f2ca10 trb-start 00000000a9f2ca20 trb-end
00000000a9f2ca20 seg-start 00000000a9f2c000 seg-end 00000000a9f2cff0
Jan 2 09:20:11 fcshome kernel: Initializing cgroup subsys cpuset
Jan 2 09:20:11 fcshome kernel: Initializing cgroup subsys cpu
Jan 2 09:20:11 fcshome kernel: Initializing cgroup subsys cpuacct
Jan 2 09:20:11 fcshome kernel: Linux version 3.10.0-1160.6.1.el7.x86_64

The last four events are the start of reboot and everything above is leading up to the crash.

I note that there was an SELinux event early in this list of events, but I see those frequently and they normally don’t cause a crash. If I scroll upward from this list, I find more of the same above what is shown here.

The SELinux event seems to be related to an SMB mount. Since I have a Synology box on the LAN, and it stays mounted nearly all the time, I suspect the SELinux event has something to do with that, but I have no idea why it would cause a crash once every few months. Is that SELinux event something I need to fix?

I’m further guessing that “xhci_hcd” has something to do with USB, am I right? If so I don’t know what it would be…

Clues would be greatly appreciated!

Thanks in advance!

Fred

24 thoughts on - Rare But Repeating System Crash In C7

  • Yup: https://en.wikipedia.org/wiki/Extensible_Host_Controller_Interface#Virtualization_support

    My guess: you have USB-attached storage that’s waking up when you wiggle the mouse, and it’s crashing the bus, kicking the kernel driver over, so the system reboots to protect itself.

    If not storage, then something else sufficiently complicated, which wakes up when you wake the system.

    I’d exclude things like optical drives, unless they have disks in them at the time this happens.
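
One way to see how long these errors have been piling up, and on which controller, is to tally them per PCI address. The sketch below assumes the /var/log/messages format quoted in the original post; the function name `xhci_error_counts` is made up for illustration.

```shell
#!/bin/sh
# xhci_error_counts FILE...
# Hypothetical helper: count "Transfer event TRB DMA ptr" errors per PCI
# address, assuming the syslog line format quoted in this thread.
xhci_error_counts() {
    # Keep only the xhci_hcd ERROR lines, reduce each to its PCI address,
    # then count how often each address appears ("ADDRESS COUNT").
    grep -h 'xhci_hcd .*ERROR Transfer event TRB DMA ptr' "$@" |
        sed 's/.*xhci_hcd \([0-9a-f:.]*\): ERROR.*/\1/' |
        sort | uniq -c | awk '{ print $2, $1 }'
}

# Example (run as root so the log is readable):
#   xhci_error_counts /var/log/messages
```

With the address in hand, `lspci -s 03:00.0` names the controller, and `lsusb -t` shows which devices hang off each USB bus, which may help narrow down the suspect.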

  • Warren, thanks for the reply!

    Plantronics USB headset/microphone?
    Yottamaster RAID-1 storage (USB3)?
    Behringer USB audio interface?
    Logitech wireless mouse?
    Leopold USB keyboard?

    Does any one of them sound more likely than the others?

    Of all those, the Yottamaster device is the most recent. Unfortunately I haven’t kept notes on when it occurs, so it’s possible it was occurring before I got that device. :(
    If it is that device (used primarily for nightly backups) might it help if I unmount it after each use?

    Thanks in advance!

    Fred

    CentOS mailing list CentOS@CentOS.org https://lists.CentOS.org/mailman/listinfo/CentOS

  • HID devices won’t go to sleep when the computer does, else they couldn’t wake it back up. (Keyboard & mouse, mainly.)

    The two audio interfaces may or may not sleep. Try checking their indicator LEDs when the computer goes to sleep: I’d expect them to visibly show that they’ve gone to sleep if they do. If they do, then on wake, they *could* do this sort of thing.

    I’d go after the RAID enclosure first, particularly if it’s hardware RAID, since that means it’s “clever,” thus suspect. Check that you’ve got the current firmware:

    https://www.yottamaster.com/?route=common/driver

    If it’s one of their JBOD models, requiring that you do some sort of software RAID, I’d expect a much different report in the kernel log if the software RAID component had a bug. If that device is the cause, it more likely has some fundamental USB compatibility problem. Again, check for firmware updates.

  • Just add “x-systemd.automount,x-systemd.idle-timeout=15min” to the fstab mount options, or create “.mount” and “.automount” units for it (autofs is also an option), and test.

    The “x-systemd.automount” option tells systemd to create an “.automount” unit, which monitors the mount point and automatically mounts your drive, while the idle-timeout tells systemd to automatically umount the share when it is not in use (ls, df, du and others count as usage and reset the counter). Also, if you use 7.6, there is a bug in sysstat that forces autofs and systemd’s automounter to mount the share.

    Best Regards, Strahil Nikolov

  • If you picked the systemd automounter as an option, you will have to run:
    systemctl restart local-fs.target

    Best Regards, Strahil Nikolov

  • 99% of NAS boxes, maybe, but not dumb RAID boxes like the one I believe you’re referring to.

    (And I doubt even that, with the likes of FreeNAS extending down from the enterprise space where consumer volume can affect that sort of thing.)

    I have more than speculation to back that guess: the available firmware images are far too small to contain a Linux OS image, their manuals don’t talk about Linux or GPL that I can see, and there’s no place to download their Linux source code per the GPL.

    While doing this exploration, I’ve run into multiple problems with their web site, which strengthens my suspicion that this box is your culprit. If they’re this slipshod with their marketing material, what does that say about their engineering department?

  • Hi Fred,

    Do you use automatic umount for the map in /etc/auto.master (--timeout)?

    If yes, then the systemd mount options probably won’t help.

    Best Regards, Strahil Nikolov

     

    On Sunday, 3 January 2021, 04:27:17 GMT+2, Fred wrote:

    Yeah, and the instructions for setting RAID-1 or RAID-0 have the switch positions exactly reversed.

    Strahil: I’m using autofs to automount the unit, but I just turned that off and enabled x-systemd.automount in fstab. We’ll see how that works.

    Fred

  • Strahil:

    I WAS using that, but the automatic umount never worked, leaving it mounted all the time.

    I commented out those entries in /etc/auto.master before modifying the fstab entry:

    UUID=259ec5ea-e8a4-465a-9263-1c06217b9aaf /mnt/backup ext4,x-systemd.automount,x-systemd.idle-timeout=15min noauto 0 2

    which is exactly as it was before except for the x-systemd entries as you described.

    And the peculiar thing is it STILL does not automount. And yes, I did do systemctl restart local-fs.target.

    Do I need to reboot (or something simpler, maybe) to fully disable the auto.master stuff?

    Thanks again!

    Fred

  • Are you still on 7.6? I recently discovered that a sysstat bug that prevented autofs from umounting the filesystem was fixed in 7.7.

    The following should show whether it’s taking effect:
    systemctl status mnt-backup.mount mnt-backup.automount
    systemctl cat mnt-backup.mount mnt-backup.automount

    Are you sure that you have no “,” before that “noauto”?

    Best Regards, Strahil Nikolov 
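
The unit names in those commands follow from the mount point: systemd drops the leading slash and turns the remaining slashes into dashes, so /mnt/backup becomes mnt-backup.mount and mnt-backup.automount. The real tool for this is `systemd-escape -p --suffix=automount /mnt/backup`. The sketch below is only a simplified stand-in, under the assumption that the path contains nothing but letters, digits, underscores and slashes; `mount_unit_name` is a made-up name, and anything outside that character set is refused rather than escaped.

```shell
#!/bin/sh
# mount_unit_name PATH [SUFFIX]
# Hypothetical, simplified stand-in for `systemd-escape -p --suffix=...`.
# Handles only paths made of letters, digits, "_" and "/"; other
# characters need real escaping (e.g. "-" becomes "\x2d"), so we bail out.
mount_unit_name() {
    path=$1
    suffix=${2:-mount}
    case $path in
        /*) ;;
        *) echo "not an absolute path: $path" >&2; return 1 ;;
    esac
    # Strip leading and trailing slashes.
    trimmed=$(printf '%s' "$path" | sed -e 's#^/*##' -e 's#/*$##')
    case $trimmed in
        '')
            # The root directory maps to the special name "-".
            printf '%s.%s\n' '-' "$suffix"
            return 0
            ;;
        *[!A-Za-z0-9_/]*)
            echo "path needs systemd-escape: $path" >&2
            return 1
            ;;
    esac
    # Collapse repeated slashes, then turn each slash into a dash.
    printf '%s.%s\n' "$(printf '%s' "$trimmed" | tr -s '/' | tr '/' '-')" "$suffix"
}

# Example:
#   mount_unit_name /mnt/backup automount
```

For anything beyond a plain path like this one, `systemd-escape` itself is the safer choice.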

  • $ cat /etc/centos-release
    CentOS Linux release 7.9.2009 (Core)

    $ sudo systemctl status mnt-backup.mount mnt-backup.automount
    [sudo] password for fredex:
    ● mnt-backup.mount - /mnt/backup
       Loaded: loaded (/etc/fstab; bad; vendor preset: disabled)
       Active: active (mounted) since Sat 2021-01-02 22:20:05 EST; 14h ago
        Where: /mnt/backup
         What: /dev/sdc1
         Docs: man:fstab(5)
               man:systemd-fstab-generator(8)
        Tasks: 0

    ● mnt-backup.automount
       Loaded: loaded
       Active: inactive (dead)
        Where: /mnt/backup
    [fredex@fcshome Desktop]$ systemctl cat mnt-backup.mount mnt-backup.automount
    No files found for mnt-backup.automount.
    # /run/systemd/generator/mnt-backup.mount
    # Automatically generated by systemd-fstab-generator

    [Unit]
    SourcePath=/etc/fstab
    Documentation=man:fstab(5) man:systemd-fstab-generator(8)
    RequiresOverridable=systemd-fsck@dev-disk-by\x2duuid-259ec5ea\x2de8a4\x2d465a\x2
    After=systemd-fsck@dev-disk-by\x2duuid-259ec5ea\x2de8a4\x2d465a\x2d9263\x2d1c062

    [Mount]
    What=/dev/disk/by-uuid/259ec5ea-e8a4-465a-9263-1c06217b9aaf
    Where=/mnt/backup
    Type=ext4
    Options=noauto

    The fstab statement I put in my last posting was a copy/paste from /etc/fstab, so it should be correct as shown. I don’t see a comma before noauto.

  • Erm … the noauto should be part of the options column, so append it to the previous option (and of course delimit with a “,”).

    I see that the ‘.automount’ was not generated … Maybe it’s related to the noauto issue.

    By the way, “mount -a” should complain if fstab is not OK.

    Best Regards, Strahil Nikolov

  • Reboot is not necessary as long as local-fs.target is restarted, but a fix for the /etc/fstab might be needed.

    Best Regards, Strahil Nikolov

    On Sunday, 3 January 2021, 21:18:30 GMT+2, Simon Matter wrote:

    Did you already try a reboot?

    Don’t ask me why I ask this.

    Regards, Simon

  • The first question I would have is this: Has the auto-reboot occurred since the machine was last built or did this begin at some point after the build?

    Apologies if I missed this in the many threads stemming from your OP…

  • That’s not correct. See ‘man fstab’. It should be

    device mount-point filesystem-type options dump fsck

    So you should have:

    UUID=259ec5ea-e8a4-465a-9263-1c06217b9aaf /mnt/backup ext4 x-systemd.automount,x-systemd.idle-timeout=15min,noauto 0 2

    Yeah, you put them in the wrong place.

    P.
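
The field-order mistake discussed here is easy to miss by eye, so a rough check may help. This is a heuristic sketch, not a validator: `check_fstab_fields` is a made-up name, and the rule it applies (a type field containing “=” or an “x-” option is suspicious) is an assumption based on the broken line earlier in this thread.

```shell
#!/bin/sh
# check_fstab_fields [FILE]
# Hypothetical helper: flag fstab entries with fewer than 4 or more than 6
# fields, or whose third field (filesystem type) looks like it has mount
# options fused onto it. Exits non-zero if anything was flagged.
check_fstab_fields() {
    awk '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        NF < 4 || NF > 6 {
            printf "line %d: expected 4 to 6 fields, found %d\n", NR, NF
            bad = 1
            next
        }
        $3 ~ /=/ || $3 ~ /^x-/ || $3 ~ /,x-/ {
            printf "line %d: type field \"%s\" looks like it holds mount options\n", NR, $3
            bad = 1
        }
        END { exit bad }
    ' "${1:-/etc/fstab}"
}

# Example:
#   check_fstab_fields /etc/fstab && echo "field layout looks plausible"
```

A clean result only says the layout is plausible. After editing /etc/fstab, `systemctl daemon-reload` re-runs systemd-fstab-generator so the generated units pick up the change.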

  • OK, I think I’ve got it set up as described here, while fixing the misplaced fields in /etc/fstab:

    UUID=259ec5ea-e8a4-465a-9263-1c06217b9aaf /mnt/backup ext4 x-systemd.automount,x-systemd.idle-timeout=15min,noauto 0 2

    Now when I do, e.g., “ls /mnt/backup”

    I get:

    $ sudo !!
    sudo ls /mnt/backup
    ls: cannot open directory /mnt/backup: No such file or directory

    If I do:

    ls /mnt

    I see:

    backup

    Use su to become root, then:
    ls -l /mnt shows:

    # ls -al
    total 4
    drwxr-xr-x. 3 root root 0 Jan 2 13:24 .
    dr-xr-xr-x. 21 root root 4096 Jan 2 09:22 ..
    dr-xr-xr-x. 2 root root 0 Jan 2 13:24 backup

    ls backup shows:

    # ls -al backup
    ls: cannot open directory backup: No such file or directory

    Why? It clearly appears to exist ????

    The FS isn’t mounted, but /mnt/backup exists, so it should be visible as an empty directory. Also, I can mount it manually:

    mount UUID=259ec5ea-e8a4-465a-9263-1c06217b9aaf /mnt/backup

    and then access it. But it doesn’t automount with, e.g., “ls /mnt/backup” or “ls /mnt/backup/backups”.

    I must still be doing something wrong but maybe I’m too stupid to see it.
    (Please don’t agree with me publicly…! :=) )

    Fred

  • Verify that:
    1. Autofs is not running.
    2. Systemd has created ‘.mount’ and ‘.automount’ units:
    systemctl status mnt-backup.mount mnt-backup.automount
    systemctl cat mnt-backup.mount mnt-backup.automount

    3. There are no errors in local-fs.target:
    systemctl status local-fs.target

    4. Check for errors via:
    mount -a
    journalctl -e

    Best Regards Strahil Nikolov

  • OK, here’s where I stand now:
    1. I stopped and disabled autofs. (I have 2 SMB filesystems out on the LAN that have also been automounting with autofs. Do I need to make similar changes in fstab for them?)
    2. Yes, it has.
    3. None I can see.
    4. Nothing that leaps out at me. There are a couple about /mnt/backup not existing, but they appear to be old ones that aren’t happening anymore.

    So, I’ve made a minor tweak to /etc/fstab, nothing that should matter. I rebooted, and when it comes up /mnt/backup is mounted. TWICE, according to the output of mount:

    $ mount | grep backup
    systemd-1 on /mnt/backup type autofs (rw,relatime,fd=25,pgrp=1,timeout=900,minproto=5,maxproto=5,direct,pipe_ino=9840)
    /dev/sdc1 on /mnt/backup type ext4 (rw,relatime,seclabel,stripe=8191,data=ordered)

    Is this really a double mount, or is this what I’m supposed to be seeing?

    It doesn’t seem to time out and auto umount.

    Thanks again for your assistance!

    Fred
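
For what it is worth, two entries on the same path are the normal picture with a systemd automount: the `autofs` line from `systemd-1` is the automount point itself (its `timeout=900` is 15 minutes in seconds, matching the idle-timeout option), and the `ext4` line is the real filesystem mounted on top of it after something touched the path. The sketch below reads `mount` output and says which of those states a mount point is in. `automount_state` is a made-up name, and it assumes mount points without spaces.

```shell
#!/bin/sh
# automount_state MOUNTPOINT
# Hypothetical helper: reads `mount` output on stdin and reports whether
# MOUNTPOINT has an autofs (automount) entry, a real filesystem, or both.
automount_state() {
    awk -v mp="$1" '
        # Lines look like: SOURCE on MOUNTPOINT type FSTYPE (OPTIONS)
        $2 == "on" && $3 == mp && $4 == "type" {
            if ($5 == "autofs") auto = 1
            else real = $5
        }
        END {
            if (auto && real != "") print "automount active, " real " mounted on top"
            else if (auto)          print "automount active, waiting for first access"
            else if (real != "")    print "plain " real " mount, no automount"
            else                    print "nothing mounted"
        }
    '
}

# Example:
#   mount | automount_state /mnt/backup
```

If the state never drops back to "waiting for first access", something is probably still touching the path; as noted earlier in the thread, even ls, df and du count as usage and reset the idle counter.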

  • Hi Fred, no I was asking about the auto mount and umount issue you had. Did you get it to work correctly?

    Simon