We have one system running CentOS 7 that is the NFS server. There are 50
external machines that FTP files to this server fairly continuously.
We have another system running CentOS 6 that uses NFS to mount the partition the files are FTP-ed to.
There is a Python script running on the NFS client machine that reads these files and moves them to a new directory on the same file system (a mv, not a cp).
Almost daily this script hangs while reading a file. Sometimes it never comes back and cannot be killed, even with kill -9. Other times it hangs for about half an
hour and then carries on.
Coinciding with the hanging I see this message on the NFS server host:
nfsd: peername failed (error 107)
And on the NFS client host I see this:
nfs: V4 server returned a bad sequence-id
nfs state manager – check lease failed on NFSv4 server with error 5
The first client message always appears at the same time the hang starts. The second client message comes 20 minutes later. The server message comes 4 minutes after that. Then, 3 minutes later, the script un-hangs (if it’s going to).
Can anyone shed any light on what could be happening here and/or what I
could do to alleviate these issues and stop the script from hanging?
Perhaps some NFS config settings? We do not have any, so we are using the defaults.
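In case it helps to see what "the defaults" actually are, the effective mount options on the client can be read from /proc/mounts. Below is a minimal sketch, assuming a Linux client; the function names nfs_mounts and print_nfs_mounts are made up for illustration and are not part of our script.

```python
def nfs_mounts(mounts_text):
    """Parse /proc/mounts-style text and return one tuple per NFS mount:
    (mount_point, fstype, options), where options is a dict."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        opts = {}
        for item in fields[3].split(","):
            # Options are either bare flags ("hard") or key=value ("timeo=600")
            key, _, value = item.partition("=")
            opts[key] = value
        result.append((fields[1], fields[2], opts))
    return result


def print_nfs_mounts(path="/proc/mounts"):
    """Print the options in effect for every NFS mount on this host."""
    with open(path) as f:
        for mount_point, fstype, opts in nfs_mounts(f.read()):
            print(mount_point, fstype)
            for key in sorted(opts):
                print("   ", key, opts[key])
```

Calling print_nfs_mounts() on the client should show, among other things, the NFS version in use and whether the mount is hard or soft, along with the timeo and retrans values.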