Loss Of Ethernet Adaptor


At ~07:40 (UTC-4:00) this morning our gateway host lost its WAN Ethernet adaptor. After recovery, which required a reboot, the following entries were found in /var/log/messages:

Jun 6 07:39:50 gway02 kernel: PING_FLOOD: IN=eth0 OUT= MAC=00:25:90:61:74:c0:00:24:14:2b:f2:80:08:00 SRC=74.205.112.125 DST=216.185.71.33 LEN=64 TOS=0x00 PREC=0x00 TTL=50 ID=30954 PROTO=ICMP TYPE=8 CODE=0 ID=25496 SEQ=0
Jun 6 07:39:53 gway02 kernel: PROBE_BLACKIST: IN=eth0 OUT=eth1 SRC=122.235.101.24 DST=216.185.71.249 LEN=52 TOS=0x08 PREC=0x20 TTL=45 ID=26123 DF PROTO=TCP SPT=54197 DPT=445 WINDOW
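
Entries of this kind are netfilter LOG output: a rule-specific prefix followed by KEY=VALUE fields. As a minimal sketch for pulling the source address, protocol and port out of such lines when sifting a large /var/log/messages, something like the following could be used. The function name `parse_netfilter_line` is hypothetical, and the layout it assumes (text after `kernel: `, a prefix terminated by `: `) is inferred from the two entries above rather than from any specification.

```python
def parse_netfilter_line(line):
    """Split one netfilter LOG line into (prefix, fields, flags).

    Assumes the layout seen in the entries above:
    '<timestamp> <host> kernel: <PREFIX>: KEY=VALUE KEY=VALUE FLAG ...'
    """
    # Drop the syslog header; keep what the kernel actually logged.
    _, _, payload = line.partition("kernel: ")
    # The rule's log prefix ends at the first ': ' (MAC colons have no space).
    prefix, _, rest = payload.partition(": ")
    fields = {}
    flags = []
    for token in rest.split():
        key, sep, value = token.partition("=")
        if sep:
            # A repeated key (ID appears for both IP and ICMP) keeps the last value.
            fields[key] = value
        else:
            # Bare tokens such as DF carry no value.
            flags.append(token)
    return prefix, fields, flags
```

Grouping the parsed entries by `fields["SRC"]` and `prefix` would show quickly whether the traffic just before the outage came from one host or many.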

16 thoughts on - Loss Of Ethernet Adaptor

  • James B. Byrne wrote:

    Well, let’s start with you being probed/attacked from China: whois
    122.235.101.24

    inetnum: 122.235.0.0 – 122.235.127.255
    netname: CHINANET-ZJ-HZ
    country: CN
    descr: CHINANET-ZJ Hangzhou node network
    descr: Zhejiang Telecom
    <...>
    role: CHINANET-ZJ Hangzhou
    address: No.352 Tiyuchang Road,Hangzhou,Zhejiang.310003
    country: CN
    phone: +86-571-85157929
    fax-no: +86-571-85102776
    e-mail: anti_spam@mail.hz.zj.cn
    remarks: send spam reports to anti_spam@mail.hz.zj.cn
    remarks: and abuse reports to anti_spam@mail.hz.zj.cn

    And whois reports the puppy above is not only from Hong Kong, but
    remarks: -+-+-+-+-+-+-+-+-+-+-+-++-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    remarks: This object can only be updated by APNIC hostmasters.
    remarks: To update this object, please contact APNIC
    remarks: hostmasters and include your organisation’s account
    remarks: name in the subject line.
    remarks: -+-+-+-+-+-+-+-+-+-+-+-++-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    which suggests that the IP or range or domain is an ex….

    So, next question is, is the card working again? If so, then this is an attack I’ve not heard of, that affects what’s this, layer 0?

    mark

  • Hi,

    We ran into this problem also – the interface would disappear. There is a newer e1000e driver that fixes it, or you could add pcie_aspm=off to your kernel command line.

    HTH, Steve
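
After a reboot it is worth confirming that the parameter actually reached the running kernel. A minimal sketch, assuming a Linux host where `/proc/cmdline` is readable; the helper names `has_kernel_param` and `running_cmdline` are made up for illustration:

```python
def has_kernel_param(cmdline, name, value=None):
    """Return True if the command line carries `name`, or `name=value` if given."""
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key != name:
            continue
        # With no value requested, the bare presence of the key is enough.
        if value is None or (sep and val == value):
            return True
    return False


def running_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()
```

Calling `has_kernel_param(running_cmdline(), "pcie_aspm", "off")` on the rebooted host should then return True if the boot entry was edited correctly.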

  • Re: Packet of Death attack: a deadly DoS against Intel NICs

    It appears that my problem is caused by something else as the EPROM
    fingerprint matches the ‘good’ version (mostly).

    ethtool -e eth0
    . . .
    0x0010:01 01 ff ff 6b 02 d3 10 d9 15 d3 10 ff ff 58 a5
    . . .
    0x0030:c9 6c 50 31 3e 07 0b 46 84 2d 40 01 00 f0 06 07
    . . .

    However this matches neither the known ‘bad’ nor the reputed ‘good’ EPROM image:

    0x0060:00 01 ff ff ff ff ff ff ff ff ff ff ff ff ff ff

    But it seems a lot closer to the ‘bad’:

    0

  • Hi,

    Don’t know if you saw my prior email, but we experienced this exact same problem; see the log excerpts below:
    …
    Jul 31 17:05:18 wolfpac kernel: pciehp 0000:00:1c.5:pcie04: Card not present on Slot(37)
    Jul 31 17:05:18 wolfpac kernel: pciehp 0000:00:1c.5:pcie04: Card present on Slot(37)
    Jul 31 17:05:18 wolfpac kernel: device eth5 left promiscuous mode
    Jul 31 17:05:19 wolfpac kernel: e1000e 0000:07:00.0: PCI INT A disabled
    Jul 31 17:05:20 wolfpac ntpd[2726]: Deleting interface #7 eth5, 192.168.198.95#123, interface stats: received=517, sent=522, dropped=0, active_time=108106 secs
    Jul 31 17:05:20 wolfpac ntpd[2726]: Deleting interface #8 eth5, fe80::290:bff:fe2a:acf3#123, interface stats: received=0, sent=0, dropped=0, active_time=108039 secs

    This would randomly happen on systems that weren’t connected directly to the internet. We experienced this on multiple systems. Since we upgraded to the latest elrepo driver and added pcie_aspm=off to our kernel command line we have never experienced the issue again.

    Thank you. I did get your message, but I have not yet had time to test it, as doing so necessarily involves a restart of the test system. I am trying to discover whether there is some way of restarting a headless server using a specific grub entry instead of the default. I want to leave the default unchanged until I can prove that any manual changes I make do not negatively impact a system restart.

    If anyone knows if this is possible and if so, how it is done, I would welcome the information.

    Regards,

  • James B. Byrne wrote:

    That’s a no-brainer: change the default= line in grub from 0 to whatever the entry number is. Note that I’m not sure what happens if you add a kernel update in there, whether the post-install scripts will increment the number so as to continue to point to the correct kernel.

    mark

  • If you happen to be fortunate enough and have (ipmi v2) Serial over LAN
    configured, you can reboot and change the boot selection.

    Not really. James wrote that he does not want to “negatively impact a system restart”. If I was in his shoes, I wouldn’t change the grub default boot item without serial-over-lan access, a KVM switch with network access, or
    “remote hands” on site. Otherwise you just changed your default boot item and it could cyclically crash and (possibly) reboot.

  • Thank you. Based on my readings of this reference there are two mechanisms available: 1. boot once, 2. fallback.

    The critical step seems to be issuing the command ‘grub-set-default n’ where n is a value between 0 and the number of entries in boot.conf less one.

    The boot-once and fallback documentation recommends fallback as the superior alternative.
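
Since a wrong value of n on a headless box is exactly the failure to avoid, the range check can be done before touching anything. This is only a sketch, and it assumes a GRUB legacy style menu file in which every boot entry starts with a `title` line; the function names are hypothetical and the caller supplies the file contents.

```python
def count_menu_entries(config_text):
    """Count boot entries, assuming each one begins with a 'title' line."""
    count = 0
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            # Commented-out stanzas do not count as entries.
            continue
        parts = line.split(None, 1)
        if parts and parts[0] == "title":
            count += 1
    return count


def valid_entry_index(config_text, n):
    """True if n lies between 0 and the number of entries less one."""
    return 0 <= n < count_menu_entries(config_text)
```

Running `valid_entry_index(text, n)` against the contents of the menu file before issuing `grub-set-default n` at least rules out pointing the default at an entry that does not exist.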


  • This is a return to an issue I first raised back in June. We had a similar occurrence in September while I was away and so I am revisiting the entire matter.

    Steve Clark on 6 Jun 16:02 2014 wrote:

    I have run into other reports of similar occurrences and some of these refer to this bug report: https://bugzilla.redhat.com/show_bug.cgi?id=632650

    However, that report is closed as being a duplicate of:
    https://bugzilla.redhat.com/show_bug.cgi?id=562273

    Which is not available for viewing by the great unwashed.

    Nonetheless, following the discussion thread in the bug report that I can view it appears that this issue was supposedly resolved sometime in late 2012. From what I can gather the fix was to disable ASPM L1 for this model adaptor in the e1000e driver module.

    * Upstream commit d4a4206ebbaf48b55803a7eb34e330530d83a889 – e1000e: Disable ASPM L1 on 82574

    However, when I run lspci -vvv on the host that exhibited the problem I see this:

    . . .
    03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
    Subsystem: Super Micro Computer Inc Device 10d3
    Physical Slot: 0-2
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
    Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL
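
Whether ASPM is active on a given link is reported by `lspci -vvv` in the `LnkCtl:` line of the device's PCI Express capability. A rough sketch for extracting that state per device from saved output follows; it assumes the usual pciutils layout (device header at column 0, a `LnkCtl: ASPM ...;` line indented beneath it), which may differ between pciutils versions, and the function name is hypothetical.

```python
import re

# Device header, e.g. "03:00.0 Ethernet controller: ..." (optional PCI domain).
DEVICE_RE = re.compile(r"^(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\s", re.I)
# Link control line, e.g. "LnkCtl: ASPM Disabled; RCB 64 bytes ..."
ASPM_RE = re.compile(r"LnkCtl:\s*ASPM\s+([^;]+);")


def aspm_by_device(lspci_text):
    """Map each PCI address to the ASPM state found in its LnkCtl line."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        if DEVICE_RE.match(line):
            # Remember which device the following indented lines belong to.
            current = line.split()[0]
            continue
        match = ASPM_RE.search(line)
        if match and current is not None:
            result[current] = match.group(1).strip()
    return result
```

Feeding it the full `lspci -vvv` output from the affected host and looking up `03:00.0` would show whether L1 is still enabled on the 82574L link despite the driver fix.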

  • I’m the one who did the submission. Some of my comments (which I
    thought were helpful) have been hidden by Red Hat.

    I don’t have access, either.

    My suggestion for you is to give ELRepo’s kmod-e1000e a try. It has the latest version from Intel (3.1.0.2) as opposed to the version in the EL kernels (2.3.2-k). There are known cases in which a later version resolved issues.

    Akemi

    Both BZs above are RHEL 5 specific, with 562273 being a “driver update” one. Did you report this against RHEL 6 too?

    Marcelo