We encountered a weird problem today, and I thought some of you might like to hear the solution.
The underlying change was listed in the 7.4 changelog, so it’s not a bug, but it may drive you buggy.
The majority of our HPC cluster nodes run CentOS 7, though the exact patch levels vary from node to node. None is older than 7.3, but a few newer nodes were kickstarted right to 7.4.
The problem was that our mounts of Isilon NFS exports were failing randomly among the nodes. Routing was fine. Network connectivity was fine.
The short answer is that the default in 7.4, and I think in the nfs-utils-1.3.0-0.48.el7 package in particular, has changed. While NFS
v4.0 was the default up to 7.3, the 7.4 protocols are subtly different:
1. Try NFS v4.1 first
2. Fail down to NFS v3
3. Fail down to NFS v2
The problem is that our Isilon works with NFS v4.0, not 4.1, but 4.0
is not in the fail-down path.
The short-term answer is to specify nfsvers=4.0 in our autofs configuration files, which works like a charm.
Like I said, this was an announced change, but the implications escaped us until now. So this little writeup is just for the record.