Why is irqbalance not balancing?
I am running irqbalance with the default configuration on an Atom 330 machine. This CPU has 2 physical cores, each with an SMT (aka Hyper-Threading) sibling, for 4 logical CPUs in total.
As shown below, the interrupt for the eth0 device is always handled on CPUs 0 and 1, with CPUs 2 and 3 left idle. But why?
Maybe irqbalance prefers physical cores? My understanding, though, is that the even-numbered CPUs are the physical cores, with the odd-numbered ones being the SMT cores. If this understanding is correct, it means that irqbalance is toggling between a single physical core and its SMT sibling.
Any thoughts on why irqbalance is not using all 4 CPUs to distribute the eth0 interrupts?
Thanks.
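For anyone wanting to check the same thing on their own box, the per-CPU counts can be pulled out of /proc/interrupts. This is only a rough sketch: it assumes the usual Linux layout (a header row of CPU names, then one row per IRQ with one count per CPU followed by the handler labels), and the function name irq_counts is made up for illustration.

```python
def irq_counts(interrupts_text, device):
    """Sum per-CPU interrupt counts over every row whose label mentions `device`.

    `interrupts_text` is assumed to be the contents of /proc/interrupts.
    Matching is a plain substring test, so "eth0" also matches "eth0-rx-0".
    """
    lines = interrupts_text.splitlines()
    cpus = lines[0].split()                  # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for line in lines[1:]:
        fields = line.split()
        # Summary rows such as ERR: have fewer columns than there are CPUs.
        if len(fields) <= len(cpus):
            continue
        counts = fields[1:len(cpus) + 1]     # one counter per CPU
        labels = fields[len(cpus) + 1:]      # controller type, device names
        if not all(c.isdigit() for c in counts):
            continue
        if not any(device in label for label in labels):
            continue
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals
```

Calling irq_counts(open("/proc/interrupts").read(), "eth0") twice, a few seconds apart, and comparing the two results shows which CPUs are actually taking the interrupts right now, as opposed to the totals since boot.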
——————-
Steve Snyder wrote:
I believe the hyperthreading cores are enumerated after the ‘real’ cores. You can see this by using ‘lstopo’, which is part of the hwloc package (‘yum install hwloc’). That is, logical CPUs 0 and 1 are the ‘real’ cores and logical CPUs 2 and 3 are the HT cores.
I suspect interrupts only have meaning on the real cores, which would explain why you are not seeing any on CPUs 2 and 3.
James Pearson
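Besides lstopo, the sibling relationship can be read directly from sysfs rather than guessed from the numbering. A minimal sketch, assuming a Linux kernel that exposes topology/thread_siblings_list under each cpuN directory; the function names parse_cpu_list and sibling_groups are hypothetical helpers, not part of any tool mentioned in this thread.

```python
import glob


def parse_cpu_list(text):
    """Expand a kernel cpulist string such as "0,2" or "0-1" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        # A bare number like "2" is treated as the one-element range 2-2.
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)


def sibling_groups(
    pattern="/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list",
):
    """Return the distinct groups of logical CPUs that share one physical core."""
    groups = set()
    for path in glob.glob(pattern):
        with open(path) as f:
            groups.add(tuple(parse_cpu_list(f.read())))
    return sorted(groups)
```

If sibling_groups() returns [(0, 2), (1, 3)], then CPUs 0 and 1 sit on different physical cores and 2 and 3 are their respective SMT siblings. A result of [(0, 1), (2, 3)] would instead mean that CPUs 0 and 1 are two threads of the same core.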
Maybe this utility will be useful to you.
http://www.open-mpi.org/projects/hwloc/
Portable Hardware Locality (hwloc)
“The Portable Hardware Locality (hwloc) software package provides a
portable abstraction (across OS, versions, architectures, …) of the
hierarchical topology of modern architectures, including NUMA memory
nodes, sockets, shared caches, cores and simultaneous multithreading.
It also gathers various system attributes such as cache and memory
information as well as the locality of I/O devices such as network
interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping
applications with gathering information about modern computing
hardware so as to exploit it accordingly and efficiently.
The democratization of multicore processors and NUMA architectures
leads to the spreading of complex hardware topologies into the whole
server world. Nowadays every single cluster node may contain tens of
cores, hierarchical caches, and multiple memory nodes, making its
topology far from flat. Such complex and hierarchical topologies have
strong impact on the application performance. The developer must take
hardware affinities into account when trying to exploit the actual
hardware performance. (…)”