Network Hangs After Several Hours (CentOS 6 Recently Upgraded Kernel/glibc)


Hi all,

We have a development server on which we have just updated the kernel and glibc, following recent recommendations. It had previously been stable for a few years with only scheduled reboots.

It's running CentOS 6.6 (Final)
2.6.32-573.18.1.el6.x86_64
GNU libc 2.12

Upgraded via yum and rebooted; all was fine for several hours, and then the network seemed to hang. Not much is happening on it, as it's a dev server we are testing on before moving to production.

Googling, I see there is some history of the e1000e driver having issues, and I’m wondering if it could be related.

Does anyone have any thoughts on what to do with it, as I’m assuming it will hang again later?

Thanks, Ian

Feb 18 05:04:36 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26d/0x280() (Not tainted)
Feb 18 05:04:36 kernel: Hardware name: X9SCL/X9SCM
Feb 18 05:04:36 kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Feb 18 05:04:36 kernel: Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 ext4 jbd2 e1000e serio_raw i2c_i801 i2c_core sg iTCO_wdt iTCO_vendor_support shpchp ext3 jbd mbcache raid1 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Feb 18 05:04:36 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-220.4.2.el6.x86_64 #1
Feb 18 05:04:36 kernel: Call Trace:
Feb 18 05:04:36 kernel: [] ? warn_slowpath_common+0x87/0xc0
Feb 18 05:04:36 kernel: [] ? warn_slowpath_fmt+0x46/0x50
Feb 18 05:04:36 kernel: [] ? dev_watchdog+0x26d/0x280
Feb 18 05:04:36 kernel: [] ? insert_work+0x6d/0xb0
Feb 18 05:04:36 kernel: [] ? dev_watchdog+0x0/0x280
Feb 18 05:04:36 kernel: [] ? run_timer_softirq+0x197/0x340
Feb 18 05:04:36 kernel: [] ? tick_sched_timer+0x0/0xc0
Feb 18 05:04:36 kernel: [] ? lapic_next_event+0x1d/0x30
Feb 18 05:04:36 kernel: [] ? __do_softirq+0xc1/0x1d0
Feb 18 05:04:36 kernel: [] ? hrtimer_interrupt+0x140/0x250
Feb 18 05:04:36 kernel: [] ? call_softirq+0x1c/0x30
Feb 18 05:04:36 kernel: [] ? do_softirq+0x65/0xa0
Feb 18 05:04:36 kernel: [] ? irq_exit+0x85/0x90
Feb 18 05:04:36 kernel: [] ? smp_apic_timer_interrupt+0x70/0x9b
Feb 18 05:04:36 kernel: [] ? apic_timer_interrupt+0x13/0x20
Feb 18 05:04:36 kernel: [] ? intel_idle+0xde/0x170
Feb 18 05:04:36 kernel: [] ? intel_idle+0xc1/0x170
Feb 18 05:04:36 kernel: [] ? cpuidle_idle_call+0xa7/0x140
Feb 18 05:04:36 kernel: [] ? cpu_idle+0xb6/0x110
Feb 18 05:04:36 kernel: [] ? rest_init+0x7a/0x80
Feb 18 05:04:36 kernel: [] ? start_kernel+0x424/0x430
Feb 18 05:04:36 kernel: [] ? x86_64_start_reservations+0x125/0x129
Feb 18 05:04:36 kernel: [] ? x86_64_start_kernel+0xfa/0x109
Feb 18 05:04:36 kernel: ---[ end trace 21915186e9d87b29 ]---

7 thoughts on - Network Hangs After Several Hours (CentOS 6 Recently Upgraded Kernel/glibc)

  • Just noticed that the trace shows an old kernel, so I don’t think grub was automatically selecting the latest kernel. I’m wondering which process updates the default to the latest kernel, and whether the problem could be that the other packages were updated while grub kept booting an older kernel?
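
For anyone wanting to repeat that check, here is a minimal sketch that pulls the kernel release out of a trace like the one above and compares it with the release the system was expected to run. The function name and regular expression are illustrative assumptions, not part of any tool; on a live system, `platform.release()` (or `uname -r`) reports the kernel actually booted.

```python
import re

# Assumption: the trace contains a "Pid: ..., comm: ... <release> #N" line,
# as 2.6.32-era warnings do. Other kernel versions may format this differently.
RELEASE_RE = re.compile(r"comm: .*?(\d+\.\d+\.\d+-\S+)\s+#\d+")


def kernel_release_in_trace(log_text):
    """Return the kernel release named in a warning trace, or None if absent."""
    match = RELEASE_RE.search(log_text)
    return match.group(1) if match else None


# The "Pid:" line from the trace in the original post.
sample = (
    "Feb 18 05:04:36 kernel: Pid: 0, comm: swapper Not tainted "
    "2.6.32-220.4.2.el6.x86_64 #1"
)
expected = "2.6.32-573.18.1.el6.x86_64"  # release reported in the original post

found = kernel_release_in_trace(sample)
if found != expected:
    print("trace came from", found, "but expected", expected)
```

A mismatch here suggests the machine was not booted into the newly installed kernel when the warning was logged.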

  • If your machine is “running CentOS 6.6(final)”, but you’ve installed the new kernel and glibc, that implies that you are selectively applying updates. The 6.7 point release came out last fall. In addition to the security implications of not fully updating the system, you may have missed packages that affect networking.

    You may want to do a full update of the system and then see how it behaves; it’s hard to debug a system that may have mismatched pieces.

    To see which kernel your grub is set to load by default, look at grub.conf: the “default=” line (normally “0”) is the zero-based index of the listed kernel entry that will be selected.

    If the “default” value isn’t “0”, and/or the newest kernel isn’t the first entry, then you have something mucking with things. Check your /etc/sysconfig/kernel file for starters.
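
The grub.conf check described above can be sketched as follows. This is only an illustration for GRUB legacy, as shipped with CentOS 6: the function name is hypothetical, the sample config is made up to mirror the two kernel versions mentioned in this thread, and values such as `default=saved` are deliberately not handled.

```python
import re


def default_grub_title(conf_text):
    """Return the title of the entry that GRUB legacy would boot by default.

    Returns None if the default is not a plain number or is out of range.
    """
    default_index = 0  # assumption: entry 0 is used when no default line exists
    titles = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        m = re.match(r"default[=\s]+(\S+)", line)
        if m:
            if not m.group(1).isdigit():
                return None  # e.g. "saved"; outside the scope of this sketch
            default_index = int(m.group(1))
        elif line.split(None, 1)[0] == "title":
            titles.append(line[len("title"):].strip())
    return titles[default_index] if default_index < len(titles) else None


# Made-up sample: newest kernel is listed first, but default points at entry 1.
sample_conf = """\
default=1
timeout=5
title CentOS (2.6.32-573.18.1.el6.x86_64)
    kernel /vmlinuz-2.6.32-573.18.1.el6.x86_64 ro
title CentOS (2.6.32-220.4.2.el6.x86_64)
    kernel /vmlinuz-2.6.32-220.4.2.el6.x86_64 ro
"""
print(default_grub_title(sample_conf))
```

On a real machine you would pass in the contents of /boot/grub/grub.conf instead of the sample string.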

  • Thanks Richard,

    We currently apply all security updates at short notice (as opposed to everything), via a script. I’ve amended the grub config and rebooted to make sure it boots into the correct kernel now, and yes, /etc/sysconfig/kernel was different from the production servers. If it continues to be unstable, we may try updating all packages, since it’s a dev server meant for testing.

    Thanks again,

    Ian
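
A comparison like the one between the dev and production /etc/sysconfig/kernel files could be sketched as below. The helper names are hypothetical and the two sample files are invented for illustration; the thread does not say which setting actually differed. `UPDATEDEFAULT` and `DEFAULTKERNEL` are the keys normally found in that file on CentOS 6, and `UPDATEDEFAULT=yes` is, to my understanding, what lets a newly installed kernel become the default boot entry.

```python
def parse_sysconfig(text):
    """Parse simple KEY=value lines, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def diff_settings(dev, prod):
    """Return {key: (dev_value, prod_value)} for every key that differs."""
    keys = sorted(set(dev) | set(prod))
    return {k: (dev.get(k), prod.get(k)) for k in keys if dev.get(k) != prod.get(k)}


# Invented sample contents; real values would be read from each server's file.
dev_text = "UPDATEDEFAULT=no\nDEFAULTKERNEL=kernel\n"
prod_text = "# production copy\nUPDATEDEFAULT=yes\nDEFAULTKERNEL=kernel\n"

differences = diff_settings(parse_sysconfig(dev_text), parse_sysconfig(prod_text))
print(differences)
```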

  • On 19.02.2016 at 13:47, Ian B wrote:

    Why be selective about updates (despite the already mentioned implications, which obviously were not recognized at the time)?
    What is your scenario that requires this? I’m just curious …

  • I’m with you on this one!

    It irritates me so much that I’ve been searching for mail readers that have a plug-in to fold quoted sections. Know of any? I haven’t found one yet, but I’m still looking.

    ak.