Andrew Mobbs (mobbsy) wrote,

Argh. To record this one for posterity.

Take one NFS mount on a RHEL 3 box.

It used to Just Work.

One day, "ls -l" consistently hangs, as do "mv" and "cp". Many other things work, including "ls", "echo hello > foo", and "lsof".

It turns out that the call that is hanging is "getxattr".

Another RHEL 3 box, installed from the same image, doesn't have this problem. "ls -l" on the same mount doesn't bother calling getxattr.

The processes on the problem machine can be recovered with the following procedure:
kill -9 <PID>
(process is still alive, still hung on disk wait)
umount -f <MOUNTPOINT>
(errors claiming fs is busy, but hung processes die)
umount -f <MOUNTPOINT>
(yes, again, but no errors this time)
(Ta-da - filesystem reappears, hung processes are dead. However, all commands that call getxattr still exhibit the same problem.)

Attempting a forced unmount without the kill doesn't do anything useful.

Rinse - repeat - get same result time and again - fiddle - write one-line test program for bug report - everything mysteriously starts working again. Even the getxattr test program just returns EOPNOTSUPP rather than hanging.
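
For anyone wanting to poke at something similar, a tiny getxattr probe might look like the sketch below. This is not the original test program, just an illustration written in modern Python (os.getxattr is Linux-only). The attribute name "system.posix_acl_access" and the helper name probe_xattr are assumptions for the example; the post above doesn't record which attribute was being requested. On a mount with the problem, the call would presumably hang rather than return.

```python
import errno
import os
import sys

# Assumption: the POSIX ACL attribute is a plausible thing for "ls -l" to
# request; the post does not say which attribute name was involved.
ATTR_NAME = "system.posix_acl_access"


def probe_xattr(path, name=ATTR_NAME):
    """Call getxattr once; return the value (bytes) or the errno (int)."""
    try:
        return os.getxattr(path, name)
    except OSError as exc:
        # e.g. EOPNOTSUPP where xattrs are unsupported, ENODATA where unset
        return exc.errno


if __name__ == "__main__":
    # Path to probe; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    result = probe_xattr(target)
    if isinstance(result, int):
        print(f"getxattr failed: {errno.errorcode.get(result, result)}")
    else:
        print(f"getxattr returned {len(result)} bytes")
```

Run it with a path on the suspect mount as the only argument. A prompt error such as EOPNOTSUPP matches the "working again" behaviour described above; no output at all suggests the call is stuck.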


[Oh, and for u.c.o.l readers: no, this is a different NFS problem to the one I was talking about there.]
