Hello Mailing List
I got a severe network error message on an HP DL360 server. The kernel log says:
———————————– /var/log/messages —————————————————————
If that’s a DL360 G7 server, make sure you’ve applied all of the latest firmware patches from HP on it. The G7 version has been almost notorious for firmware issues with drive controllers, ethernet interfaces, etc.
It is a G6 Server and the firmware is more or less the latest version:
# bash CP017428.scexe -c
MAC           PCI-ID               NIC
18A90576C820  14E4-1639-103C-7055  HP NC382i DP Multifunction Gigabit Server Adapter
(Installed) (Available) Interface
Image Version Image Version eth0
What IRQ number is assigned to the device? You may have to consult the driver development guide to work out what the debug message means.
The first line alone indicates that there is no IRQ for the device. You can check this in /proc/interrupts, then find the matching entry in /proc/irq/
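A minimal sketch of that lookup, in case it helps. The function name irq_for_dev is made up for illustration, and the sketch assumes the usual /proc/interrupts layout: the IRQ number followed by a colon in the first column, with the device names at the end of the line.

```shell
#!/bin/sh
# Print the IRQ number(s) whose /proc/interrupts entry names the given device.
# Usage: irq_for_dev DEVICE [FILE]   (FILE defaults to /proc/interrupts)
# irq_for_dev is a hypothetical helper name, not part of any package.
irq_for_dev() {
    awk -v dev="$1" '
        {
            for (i = 2; i <= NF; i++) {
                name = $i
                sub(/,$/, "", name)      # shared IRQs list names as "a, b"
                # match the device itself or per-queue vectors such as "eth0-0"
                if (name == dev || index(name, dev "-") == 1) {
                    irq = $1
                    sub(/:$/, "", irq)   # "16:" becomes "16"
                    print irq
                    next
                }
            }
        }
    ' "${2:-/proc/interrupts}"
}
```

Running irq_for_dev eth0 prints one number per matching line, and nothing at all if no IRQ is registered for the interface. Each printed number should have a directory of the same name under /proc/irq.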
How often are you getting these crashes?
I had a similar problem on my HP DL380 G7 server.
I disabled Active State Power Management (ASPM) on PCI Express.
Add pcie_aspm=off as a kernel boot option.
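After the reboot it is worth confirming that the option really reached the kernel. A small sketch, assuming the running kernel's command line is readable from /proc/cmdline; the helper name has_boot_param is made up for illustration.

```shell
#!/bin/sh
# Return 0 if the kernel command line contains exactly the given parameter.
# Usage: has_boot_param PARAM [FILE]   (FILE defaults to /proc/cmdline)
# has_boot_param is a hypothetical helper name.
has_boot_param() {
    # one parameter per line, then an exact fixed-string match on a whole line
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}
```

For example: has_boot_param pcie_aspm=off && echo "ASPM disabled at boot"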
Reykjavik – Iceland
This was the first time this problem occurred, with 60 servers and about half a year of CentOS 6 (CentOS 5 before that). But because the interfaces are under permanent load, really 24×7, problems with power management would be a disaster. I will try switching it off.
After you have tried the pcie_aspm boot option, also try:
echo performance > /sys/module/pcie_aspm/parameters/policy
This will disable ASPM on PCIe and operate with maximum performance.
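To check which policy is in effect afterwards, you can read the same file back. A sketch, assuming the kernel lists all policies on one line with the active one in square brackets (for example "default [performance] powersave"); the helper name aspm_active_policy is made up for illustration.

```shell
#!/bin/sh
# Print the active ASPM policy, i.e. the entry shown in square brackets.
# Usage: aspm_active_policy [FILE]
# FILE defaults to /sys/module/pcie_aspm/parameters/policy.
# aspm_active_policy is a hypothetical helper name.
aspm_active_policy() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' \
        "${1:-/sys/module/pcie_aspm/parameters/policy}"
}
```

Note that a value written with echo does not survive a reboot, so it would have to be reapplied at boot (for example from /etc/rc.local) if you do not use the boot option.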
This is what I use today on the DL380 G7.