Hung Nfs Mount

What is the best approach when an NFS mount hangs on a client but the server is OK? I have mount options of:
rw,bg,soft,intr,rsize=32768,wsize=32768
but whatever it did was not interruptible and would not shut down.
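
The options the client is actually using can differ from what was requested, so it may help to read them back from /proc/mounts. The following is only a sketch: the function name nfs_mounts is made up for illustration, and it assumes the usual whitespace-separated /proc/mounts layout. As background, nfs(5) states that the intr option is deprecated after kernel 2.6.25 and that only SIGKILL can interrupt a pending NFS operation on those kernels, which may be part of why the hang could not be interrupted.

```python
def nfs_mounts(mounts_text):
    """Return (mount_point, options) pairs for NFS entries.

    mounts_text is text in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Keep only NFSv2/3 ("nfs") and NFSv4 ("nfs4") entries.
        if len(fields) >= 4 and fields[2] in ("nfs", "nfs4"):
            result.append((fields[1], fields[3].split(",")))
    return result
```

On the client, passing the contents of /proc/mounts to nfs_mounts would list each NFS mount point with its effective options, including the rsize and wsize values in use.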

There were some:
Oct 15 09:08:32 dev-ngf-l-01 kernel: INFO: task gnome-settings-:19169 blocked for more than 120 seconds.
Oct 15 09:08:32 dev-ngf-l-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

messages on the console and in /var/log/messages.

Is this a bug, or is there a way to avoid it?
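
The "blocked for more than 120 seconds" message is printed for tasks stuck in uninterruptible sleep (state D). A rough way to see which tasks are in that state right now is to scan /proc/<pid>/status. This is a minimal sketch, not a diagnostic tool: the function names are hypothetical, and it assumes the standard "State:" line format such as "D (disk sleep)".

```python
import os


def parse_status(status_text):
    """Extract the Name, Pid and State fields from /proc/<pid>/status text."""
    info = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key in ("Name", "Pid", "State"):
            info[key] = value.strip()
    return info


def is_uninterruptible(info):
    # State looks like "D (disk sleep)"; the first letter is the state code.
    return info.get("State", "").startswith("D")


def blocked_tasks(proc_root="/proc"):
    """Return status dicts for tasks currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                info = parse_status(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if is_uninterruptible(info):
            found.append(info)
    return found
```

Calling blocked_tasks() on the hung client would return one entry per task in state D. A task that stays in the list across repeated calls is a likely candidate for the one the kernel is complaining about.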

3 thoughts on - Hung Nfs Mount

  • Did you also check /var/log/messages on the NFS server side?

    I had some NFS trouble with lockd some time ago, and it turned out to be a firewall problem on the client.

    Try:
    – Log on to the NFS server and check in /var/log/messages which client is responsible for the problem (it could be a client other than yours).
    – On that client, stop iptables (service iptables stop) and check whether the problem still exists.

    In my configs, the client iptables rules fully trust my NFS server.

    Patrick
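
A quick reachability test can complement the firewall check suggested above. The sketch below only tries TCP connections to rpcbind (111) and nfsd (2049). It assumes NFS over TCP on the default ports, the function names are invented for illustration, and it does not cover lockd or mountd, whose ports are normally assigned dynamically. It also tests only one direction, from the machine it runs on toward the target host.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_nfs_ports(host, ports=(111, 2049)):
    # 111 is rpcbind/portmapper and 2049 is nfsd; other NFS helper
    # services may listen on ports that vary between systems.
    return {port: tcp_port_open(host, port) for port in ports}
```

Running check_nfs_ports with the NFS server's address from the client, and with the client's address from the server for port 111, gives a first hint about whether a packet filter sits between them.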

  • No jumbo frames, no firewalling, no server-side issues. This is a lab setup with one server holding home directories and about 10 other hosts and VMs mounting it as /home. There is heavy network testing on some of the servers, but the NFS connection runs over a different interface/subnet.

    I think the issue is triggered by a user running NX/freenx sessions on multiple hosts and something GNOME is trying to lock in the common home directory. Regardless, it is a kernel hang, to the point that I had to pull the plug to get the machine to shut down. And now one user (perhaps the only one with GNOME sessions on multiple hosts) has things hanging again. Even an SSH login by this user hangs, with this in the logs:

    Oct 16 09:24:25 dev-l-01 kernel: INFO: task bash:20785 blocked for more than 120 seconds.
    Oct 16 09:24:25 dev-l-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    Oct 16 09:24:25 dev-l-01 kernel: bash D 0000000000000008 0 20785 20784 0x00000080
    Oct 16 09:24:25 dev-l-01 kernel: ffff882066ecfba8 0000000000000082 0000000000000000 ffff881064998740
    Oct 16 09:24:25 dev-l-01 kernel: ffff882066ecfb28 ffffffff8119b30a ffff881065d12200 ffff881064998740
    Oct 16 09:24:25 dev-l-01 kernel: ffff8820665ef058 ffff882066ecffd8 000000000000fb88 ffff8820665ef058
    Oct 16 09:24:25 dev-l-01 kernel: Call Trace:
    Oct 16 09:24:25 dev-l-01 kernel: [] ? dput+0x9a/0x150
    Oct 16 09:24:25 dev-l-01 kernel: [] __mutex_lock_slowpath+0x13e/0x180
    Oct 16 09:24:25 dev-l-01 kernel: [] mutex_lock+0x2b/0x50
    Oct 16 09:24:25 dev-l-01 kernel: [] do_lookup+0x11b/0x230
    Oct 16 09:24:25 dev-l-01 kernel: [] __link_path_walk+0x734/0x1030
    Oct 16 09:24:25 dev-l-01 kernel: [] ? handle_pte_fault+0xf7/0xb50
    Oct 16 09:24:25 dev-l-01 kernel: [] path_walk+0x6a/0xe0
    Oct 16 09:24:25 dev-l-01 kernel: [] do_path_lookup+0x5b/0xa0
    Oct 16 09:24:25 dev-l-01 kernel: [] user_path_at+0x57/0xa0
    Oct 16 09:24:25 dev-l-01 kernel: [] vfs_fstatat+0x3c/0x80
    Oct 16 09:24:25 dev-l-01 kernel: [] vfs_stat+0x1b/0x20
    Oct 16 09:24:25 dev-l-01 kernel: [] sys_newstat+0x24/0x50
    Oct 16 09:24:25 dev-l-01 kernel: [] ? audit_syscall_entry+0x1d7/0x200
    Oct 16 09:24:25 dev-l-01 kernel: [] ? __audit_syscall_exit+0x265/0x290
    Oct 16 09:24:25 dev-l-01 kernel: [] system_call_fastpath+0x16/0x1b
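
When these reports pile up in /var/log/messages, it can be useful to pull out just the task names and PIDs to see whether the same user's processes keep getting stuck. The snippet below is a small sketch that assumes the syslog line format shown in this thread. The names HUNG_TASK and hung_tasks are made up for illustration.

```python
import re

# Matches lines such as:
#   ... kernel: INFO: task bash:20785 blocked for more than 120 seconds.
HUNG_TASK = re.compile(
    r"kernel: INFO: task (?P<name>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(log_text):
    """Return (task_name, pid, seconds) for each hung-task report in log_text."""
    return [
        (m.group("name"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(log_text)
    ]
```

Fed the two reports quoted in this thread, hung_tasks would yield the gnome-settings- task with PID 19169 and the bash task with PID 20785, both at the 120-second threshold.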
