CentOS 7 As Gateway – UDP Performance Is Busted/awful?


August 14, 2014 Tom Horsley CentOS 14 Comments

I just replaced a dead system disk on my KVM host that was running an ancient fedora 13. Since CentOS 7 was available, I decided to go with it to get some long term stability.

The problem is that NFS mounts inside the virtual machines don’t work for spit when talking to older NFS servers that must speak UDP.

Is there something about UDP traffic that requires tweaks I don’t know about for CentOS 7 to serve as a gateway machine?
I’ve got the ip forwarding settings and other sysctl stuff that was set in the old fedora 13 system.

I’ve got the bridges defined the same way as on the old f13 system.

I’ve got TCP stream connections working flawlessly, it is just the UDP traffic that seems to barf.

Does this strike a familiar note with anyone?

When I run wireshark on the KVM host machine, I see NFS packets retransmitting a lot and I also see ICMP messages about Destination Unreachable, Fragmentation Needed. (I don’t know what any of it means though :-).

This is an intel motherboard with these ethernets:
04:00.0 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)
04:00.1 Ethernet controller: Intel Corporation 80003ES2LAN Gigabit Ethernet Controller (Copper) (rev 01)

14 thoughts on - CentOS 7 As Gateway – UDP Performance Is Busted/awful?

Tony Mountifield says:

August 14, 2014 at 12:06 pm

In article <20140814120002.16440e86@tomh>, Tom Horsley wrote:

This means that either the host or one of the guests is trying to send packets with a larger MTU than part of the path to the destination will allow.

If you look inside the ICMP packet in wireshark, it will tell you who sent it and what MTU they said was acceptable.

For TCP, the protocol stack is able to adapt by reducing its MSS dynamically in response to those ICMPs and retry. I don’t think UDP is able to do that.

Also examine the MTU settings for your network interfaces on both the host and the guests, using ifconfig -a.

Cheers Tony
Tom Horsley says:

August 14, 2014 at 1:19 pm

Well, I’m definitely drowning in network confusion here :-).

Everyone’s MTU is the default 1500, I checked all systems in the path.

The wireshark display says 1516 in the Length column for the NFS packet that always shows up before the ICMP errors. If I expand the “IP V4” line in the packet, it says “Total Length: 1500” for that READDIRPLUS Reply which says 1516 for the capture length. It also has the “Don’t fragment” flag set.

It looks like the 16 byte extra is confusing it, but I have no idea why that is different than the IPv4 length info.
Les Mikesell says:

August 14, 2014 at 1:35 pm

I thought NFS defaulted to writing 8192 blocks and let the network stack fragment as needed, so having DF set doesn’t make much sense. Also, some firewalling schemes have issues with fragments, especially if they arrive out of order – not sure about the new stuff in C7.
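
If NFS over UDP does hand 8192-byte blocks to the network stack as suggested above, the IP layer has to split each datagram to fit a 1500-byte MTU. The following is a rough Python sketch of that arithmetic. It assumes a 20-byte IPv4 header without options and ignores the RPC/NFS header overhead, and the function name is made up for illustration.

```python
def fragment_sizes(udp_payload, mtu=1500, ip_header=20, udp_header=8):
    """Return the IP total length of each fragment of one UDP datagram."""
    remaining = udp_payload + udp_header      # bytes the IP layer must carry
    per_fragment = (mtu - ip_header) // 8 * 8  # fragment data is a multiple of 8
    sizes = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk + ip_header)        # every fragment gets its own IP header
        remaining -= chunk
    return sizes


print(fragment_sizes(8192))
```

Under these assumptions an 8192-byte write becomes five full 1500-byte fragments plus one short one, so a run of packets with "Total Length: 1500" in a capture is what fragmented NFS traffic would look like.
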
Tom Horsley says:

August 14, 2014 at 1:53 pm

I think it is those fragments I’m looking at in wireshark.

I just did another experiment – If I mount the same NFS filesystem on the CentOS 7 host, and do the same “ls” command, it works perfectly and the wireshark trace shows the same 1516 capture length for the NFS readdir messages.

Somehow it is just the idea of forwarding the UDP packets to the virtual machine that the host objects to. The exact same size packets destined for it to use directly have no problems.
Les Mikesell says:

August 14, 2014 at 2:09 pm

Seems like a horrible thing to do, but does it fix it if you mount with rsize00, wsize00 – or maybe 1484?

Are you just bridging to the NIC interface? I don’t see why that would need to change the packets at all. What happens if you ping with a large -s value through the bridge (host or external box to guest)?
Tom Horsley says:

August 14, 2014 at 2:48 pm

I already tried that – no change :-).

There are two NICs. The one with the bridge is also running a subnet with the virtual machines and one real machine on the NIC. The other NIC is connected to the wider world of our local LAN where the NFS servers reside, so the host has to operate as a gateway for the traffic from the LAN to the virtual machine subnet.

I did just try the ping experiment, and on the outer NFS server, if I try to ping the virtual machine with a big size, I get the error about the packet fragmentation:

dino> ping -c 1 -s 1500 ubu14d04x
PING ubu14d04x.ccur.kvm (192.168.118.52) from 10.134.30.46 : 1500(1528) bytes of data.
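
The "1500(1528)" in that output is the ICMP payload size followed by the resulting IP packet size. A short Python sketch of the arithmetic, assuming a plain 20-byte IPv4 header and the 8-byte ICMP echo header (the function names are illustrative only):

```python
IPV4_HEADER = 20  # bytes, no IP options
ICMP_HEADER = 8   # bytes, echo request header


def ping_ip_length(payload):
    """IP total length produced by `ping -s payload`."""
    return payload + ICMP_HEADER + IPV4_HEADER


def max_unfragmented_payload(mtu):
    """Largest `ping -s` value that still fits in one packet at this MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


print(ping_ip_length(1500), max_unfragmented_payload(1500))
```

So `-s 1500` always produces a packet larger than a 1500-byte MTU, and it has to be fragmented somewhere along the path; `-s 1472` is the largest size that travels as a single packet.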

But weirdly, I don’t get that from every machine I try out here on the LAN, some can ping it just fine, others get the error.

Whatever I discover just makes me more confused :-).
Les Mikesell says:

August 14, 2014 at 3:14 pm

It just seems very wrong for the NFS device to be sending 1516 bytes – and to set DF on the packet. What OS is it and what does it say about its own MTU? Physically, ethernet will accommodate 1518-1522 to allow VLAN tagging but you shouldn’t have that without knowing about it (and your switch ports configured to trunk).

I think dropping the packet is actually the correct thing in that scenario. It should not forward something larger than the next interface’s MTU and if the DF bit is set it can’t fragment there. If you have IPs to spare on the NFS subnet, you might get away with bridging there and adding a virtual NIC to the guest(s) that need access.
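
The rule described above (do not forward anything larger than the next interface's MTU, and do not fragment when DF is set) can be written down as a small decision function. This is a simplified Python model of standard IPv4 forwarding behaviour, not code taken from any kernel, and all names in it are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward unchanged"
    FRAGMENT = "fragment, then forward"
    DROP_ICMP = "drop and send ICMP type 3 code 4 (fragmentation needed)"


def forward_decision(ip_total_length, next_hop_mtu, dont_fragment):
    """Decide what a router does with a packet bound for the next interface."""
    if ip_total_length <= next_hop_mtu:
        return Action.FORWARD       # fits as-is, DF is irrelevant
    if dont_fragment:
        return Action.DROP_ICMP     # too big and not allowed to split
    return Action.FRAGMENT          # too big, but splitting is permitted


print(forward_decision(1516, 1500, True).value)
print(forward_decision(1500, 1500, True).value)
```

By this model a 1516-byte IP packet with DF set would be dropped with an ICMP error, while a packet whose IP total length is exactly 1500 should pass through a 1500-MTU interface untouched.
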
Gordon Messmer says:

August 14, 2014 at 6:33 pm

Try turning off TSO:

# ethtool -K eth0 tso off
Tony Mountifield says:

August 15, 2014 at 4:07 am

In article <20140814141900.777d6f0c@tomh>, Tom Horsley wrote:

The 1516 is the total length of the ethernet frame, and is normal for a 1500 MTU. The 16 bytes is the link-layer header.
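
As a quick check of that arithmetic, here is a small Python sketch. It assumes the capture was taken in Linux "cooked" mode (as when capturing on the "any" pseudo-interface), whose pseudo-header is 16 bytes; a capture taken directly on an Ethernet interface would show a 14-byte header instead. The names are illustrative.

```python
# Link-layer header sizes as they appear in a capture file (no trailing FCS).
LINK_HEADER_BYTES = {
    "ethernet": 14,      # dst MAC (6) + src MAC (6) + EtherType (2)
    "linux_cooked": 16,  # Linux cooked capture (SLL) pseudo-header
}


def captured_length(ip_total_length, link_type):
    """Length shown by the capture tool for a packet of this IP size."""
    return ip_total_length + LINK_HEADER_BYTES[link_type]


print(captured_length(1500, "linux_cooked"))
```

The IP "Total Length" field is what gets compared against the MTU, so a 1516-byte capture length with a 1500-byte IP length is consistent with a 1500 MTU.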

When looking at the ICMP Frag-needed packet in Wireshark, look particularly at (a) its source and destination addresses, (b) the “MTU of next hop” field (in expansion of ICMP), and (c) the source and destination addresses of the packet it was complaining about.

Here’s an example from one of my recent traces:

Frame 235: 72 bytes on wire (576 bits), 72 bytes captured (576 bits)
Linux cooked capture
Internet Protocol Version 4, Src: 10.30.0.245 (10.30.0.245), Dst: 172.22.21.48 (172.22.21.48)
(a)                               ^^^^^^^^^^^^^^^^^^^^^^^^^       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Internet Control Message Protocol
    Type: 3 (Destination unreachable)
    Code: 4 (Fragmentation needed)
    Checksum: 0x81df [correct]
    MTU of next hop: 1476
(b)                  ^^^^
    Internet Protocol Version 4, Src: 172.22.21.48 (172.22.21.48), Dst: 172.27.60.31 (172.27.60.31)
(c)                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    Transmission Control Protocol, Src Port: SSH (22), Dst Port: 56199 (56199)
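
The same fields can be pulled out of the raw ICMP bytes programmatically. The Python sketch below follows the RFC 1191 layout, in which the last two bytes of the 8-byte ICMP header carry the next-hop MTU and the offending packet's IP header follows. The sample bytes are built by hand to mirror the trace above; the inner header's length, TTL and flags are invented for the example, and the function name is hypothetical.

```python
import socket
import struct


def parse_frag_needed(icmp):
    """Return (next_hop_mtu, inner_src, inner_dst), or None if not type 3 code 4."""
    if len(icmp) < 28:  # 8-byte ICMP header + 20-byte embedded IPv4 header
        return None
    icmp_type, code, _checksum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if (icmp_type, code) != (3, 4):
        return None
    inner = icmp[8:28]  # header of the packet that was too big
    return mtu, socket.inet_ntoa(inner[12:16]), socket.inet_ntoa(inner[16:20])


# Hand-built sample: only type, code, checksum, MTU and addresses come from the trace.
inner_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 1500, 0, 0x4000, 64, 6, 0,  # made-up length/flags/TTL, protocol 6 = TCP
    socket.inet_aton("172.22.21.48"),
    socket.inet_aton("172.27.60.31"),
)
sample = struct.pack("!BBHHH", 3, 4, 0x81DF, 0, 1476) + inner_header

print(parse_frag_needed(sample))
```

The function does not verify the checksum; it only reads the fields marked (b) and (c) in the trace.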

Cheers Tony
Tom Horsley says:

August 15, 2014 at 6:50 am

I think I have my answer: The kernel is busted (or something isn’t loaded that I need, but don’t know about :-).

I copied my Fedora 20 desktop 3.15.8-200.fc20.x86_64 kernel and /lib/modules files to the CentOS7 KVM host, rebuilt grub.cfg, and rebooted into the 3.15.8-200 kernel, and with no other changes the UDP packet forwarding is now working perfectly.

I guess it is time to make yet another bugzilla account and submit a bug…
Akemi Yagi says:

August 15, 2014 at 7:04 am

It is much easier if you use ELRepo’s kernel-ml (http://elrepo.org/tiki/kernel-ml).

Yes, good idea.

Akemi
Tom Horsley says:

August 15, 2014 at 7:31 am

Does look like a better long term solution, fedora was just a hack for testing :-).

And here it is:
http://bugs.CentOS.org/view.php?id=7505
David Both says:

August 15, 2014 at 8:19 am

Nope. The kernel is not busted.

You just need to add a few rules to your firewall in order to tell it to forward the packets appropriately. While you do need the “net.ipv4.ip_forward = 1” line in /etc/sysctl.conf, and you also need to set /proc/sys/net/ipv4/ip_forward to 1 if you have not rebooted after setting the line in sysctl.conf, firewall rules are required to make it work.
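
A quick way to confirm the live forwarding setting mentioned above is to read the proc file directly. This is a minimal Python sketch, meaningful only on a Linux host; the helper names are made up, and it checks only the kernel switch, not the firewall rules.

```python
from pathlib import Path


def forwarding_enabled(text):
    """Interpret the contents of the ip_forward proc file ('1' means on)."""
    return text.strip() == "1"


def host_forwards_ipv4(path="/proc/sys/net/ipv4/ip_forward"):
    """Read the running kernel's IPv4 forwarding switch."""
    return forwarding_enabled(Path(path).read_text())
```

Calling `host_forwards_ipv4()` on the gateway should return `True` once forwarding is switched on, whether that came from sysctl.conf at boot or from writing to the proc file by hand.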

Unfortunately the specific firewall rules you require will depend upon the release level of the distribution you use. IPTables has changed a bit over the years and so the specific rules and their syntax have changed as well. Here is what I use now with CentOS 6.5+ on my own network.

# Generated by iptables-save v1.4.7 on Fri Aug 15 09:11:28 2014
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [825:47118]
:fail2ban-SSH - [0:0]
-A INPUT -p tcp -m tcp --dport 22 -j fail2ban-SSH
-A INPUT -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -i eth+ -j ACCEPT
-A INPUT -p tcp -m conntrack --ctstate NEW -m tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -p icmp -j ACCEPT
-A FORWARD -i lo -j ACCEPT
-A FORWARD -i eth0 -j ACCEPT
-A FORWARD -i eth1 -j ACCEPT
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
-A fail2ban-SSH -j RETURN
COMMIT
# Completed on Fri Aug 15 09:11:28 2014
# Generated by iptables-save v1.4.7 on Fri Aug 15 09:11:28 2014
*nat
:PREROUTING ACCEPT [80965:6238336]
:POSTROUTING ACCEPT [37811:2251658]
:OUTPUT ACCEPT [838:63592]
-A PREROUTING -d 24.199.159.56/29 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.0.53:80
-A PREROUTING -d 24.199.159.56/29 -p tcp -m tcp --dport 25 -j DNAT --to-destination 192.168.0.53:25
-A POSTROUTING -s 192.168.0.0/24 -j MASQUERADE
COMMIT
# Completed on Fri Aug 15 09:11:28 2014

The FORWARD rules in the filter table allow forwarding from your internal networks on eth0 and eth1 to the outside world. The Destination NATing PREROUTING rules allow incoming packets for SMTP and HTTP to be routed to the appropriate server on my inside network.

I hope this helps.
Tom Horsley says:

August 15, 2014 at 8:31 am

Nah, all the forwarding rules were in place. They all worked before I switched to CentOS7, and they all worked after I booted the fedora kernel. No sysctl or iptables changes were made when switching from CentOS to fedora kernel, yet the forwarding started working after booting fedora.

I suspect if I backed up to the kernel CentOS 6.5 uses that would work as well. I betcha someone has a < that should be a <= somewhere in an MTU size check in the CentOS7 kernel :-).
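
To make that guess concrete, here is what such an off-by-one would look like in Python. This only illustrates the hypothesis above; it is not taken from the CentOS 7 kernel, and nothing in this thread confirms that this is the actual cause.

```python
def may_forward_suspected(ip_total_length, mtu):
    # Hypothetical bug: the strict comparison rejects a packet exactly MTU-sized.
    return ip_total_length < mtu


def may_forward_expected(ip_total_length, mtu):
    # A packet whose length equals the MTU fits and should be forwarded.
    return ip_total_length <= mtu


for length in (1499, 1500, 1501):
    print(length, may_forward_suspected(length, 1500), may_forward_expected(length, 1500))
```

The two checks disagree only for a packet of exactly 1500 bytes, which happens to be the size of the full NFS fragments seen in the capture.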
