Old Kernel Bug Back In CentOS 6.10?

I updated a few hypervisors and their VMs to CentOS 6.10 on Monday;
today I awoke to an alert saying all VMs were down. It looks like a very old bug has crept back in.

The machine is a ProLiant DL380 G7 with Xeon X5675 and 96 GB of RAM, running half a dozen smallish VMs. The hypervisor and all VMs run kernel
2.6.32-754.2.1.el6.x86_64. Around the time the VMs must have gone down, there are quite a few error messages like the following in the system log:

Aug 16 03:10:13 hyper-7 kernel: [265397.382552] vmwrite error: reg 6000 value fffffffffffffff7 (err -9)
Aug 16 03:10:13 hyper-7 kernel: [265397.421372] Pid: 9375, comm: qemukvm Not tainted 2.6.32-754.2.1.el6.x86_64 #1
Aug 16 03:10:13 hyper-7 kernel: [265397.464985] Call Trace:
Aug 16 03:10:13 hyper-7 kernel: [265397.481530] [] ? vmwrite_error+0x2c/0x30 [kvm_intel]
Aug 16 03:10:13 hyper-7 kernel: [265397.520737] [] ? vmcs_writel+0x20/0x30 [kvm_intel]
Aug 16 03:10:13 hyper-7 kernel: [265397.560028] [] ? vmx_fpu_activate+0x93/0xc0 [kvm_intel]
Aug 16 03:10:14 hyper-7 kernel: [265397.600072] [] ? kvm_arch_vcpu_create+0x37/0x50 [kvm]
Aug 16 03:10:14 hyper-7 kernel: [265397.638183] [] ? kvm_vm_ioctl+0x601/0x1050 [kvm]
Aug 16 03:10:14 hyper-7 kernel: [265397.674367] [] ? free_one_page+0x191/0x440
Aug 16 03:10:14 hyper-7 kernel: [265397.708101] [] ? vfs_ioctl+0x29/0xc0
Aug 16 03:10:14 hyper-7 kernel: [265397.739124] [] ? __free_pages+0x46/0xa0
Aug 16 03:10:14 hyper-7 kernel: [265397.773193] [] ? do_vfs_ioctl+0x3aa/0x590
Aug 16 03:10:14 hyper-7 kernel: [265397.805774] [] ? free_pages+0x49/0x50
Aug 16 03:10:14 hyper-7 kernel: [265397.839147] [] ? sys_ioctl+0x81/0xa0
Aug 16 03:10:14 hyper-7 kernel: [265397.870109] [] ? __audit_syscall_exit+0x25e/0x290
Aug 16 03:10:14 hyper-7 kernel: [265397.909358] [] ? system_call_fastpath+0x2f/0x34

Curiously, the messages don’t seem to indicate anything fatal in and of themselves: there are two like this about a minute after bootup and about a dozen more roughly a day later, none of which seems to have crashed anything. However, they are the only obvious anomaly I could find around that time, and since they are VT-x related, I reckon there’s a connection.
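
To line up these messages with the time the VMs went down, a small script along the following lines could pull every "vmwrite error" line out of the system log and count them per hour. This is only a sketch: it assumes Python 3 (so it would probably run on a workstation against a copy of the log rather than on the CentOS 6 host itself), it assumes the syslog format matches the excerpt above, and the function names `find_vmwrite_errors` and `count_by_hour` are made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above, e.g.
# "Aug 16 03:10:13 hyper-7 kernel: [265397.382552] vmwrite error: reg 6000 value fffffffffffffff7 (err -9)"
VMWRITE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"\[\s*(?P<uptime>\d+\.\d+)\] vmwrite error: reg (?P<reg>[0-9a-f]+) "
    r"value (?P<value>[0-9a-f]+) \(err (?P<err>-?\d+)\)"
)


def find_vmwrite_errors(lines):
    """Return one dict per 'vmwrite error' line found in an iterable of log lines."""
    events = []
    for line in lines:
        match = VMWRITE_RE.match(line)
        if match:
            events.append({
                "stamp": match.group("stamp"),              # wall-clock time from syslog
                "uptime_s": float(match.group("uptime")),  # seconds since boot
                "reg": match.group("reg"),                  # VMCS field, hex string
                "value": match.group("value"),              # value written, hex string
                "err": int(match.group("err")),
            })
    return events


def count_by_hour(events):
    """Count events per 'Mon DD HH' bucket, in order of first appearance."""
    # Dropping the trailing ":MM:SS" (6 characters) leaves month, day and hour.
    return Counter(event["stamp"][:-6] for event in events)
```

It could be used as `events = find_vmwrite_errors(open("/var/log/messages", errors="replace"))` followed by a loop over `count_by_hour(events).items()`; the log path is an assumption and may differ depending on the syslog configuration. The `uptime_s` field makes it easy to tell the messages right after boot apart from the ones about a day later.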

The stack trace closely resembles this bug that turned up in 2015 and was fixed long ago: https://lkml.org/lkml/2015/7/3/288

Has anyone seen this recently, and can anyone confirm or refute any of my guesswork?

Cheers, Matthias
