- Path: sparky!uunet!zaphod.mps.ohio-state.edu!menudo.uh.edu!menudo.uh.edu!usenet
- From: sears@tree.egr.uh.edu (Paul S. Sears)
- Newsgroups: comp.sys.next.sysadmin
- Subject: Re: NFS problems
- Date: 20 Nov 1992 15:29:27 GMT
- Organization: University of Houston
- Lines: 65
- Message-ID: <1ej08nINNdrn@menudo.uh.edu>
- References: <1992Nov20.125808.3461@email.tuwien.ac.at>
- Reply-To: sears@tree.egr.uh.edu
- NNTP-Posting-Host: thanatos.egr.uh.edu
-
- In article <1992Nov20.125808.3461@email.tuwien.ac.at>
- rainer@ruble.fml.tuwien.ac.at (Rainer Staringer) writes:
- =>I have an annoying problem that (so it seems to me) started after we
- =>upgraded two of our NeXTs to 3.0 and started using one of them as an
- =>NFS server.
- =>
- =>In random intervals (approx 1/day) all the machines in our little network
- =>(2 NeXTs running 2.1, one of them serving /Users and /usr/spool/mail,
- =>2 NeXTs running 3.0, one of them serving /LocalApps and /usr/local) will
- =>hang with the 'NFS server xxx not responding' message. Sometimes the
- =>problem goes away, sometimes you have to reboot the servers, sometimes
- =>one of the machines will panic. I found the following in /usr/adm/messages
- =>(ruble/mailhost/128.130.167.130 is the 2.1 server, moolah is the 3.0
- =>server):
- =>
- =>Nov 20 13:31:25 moolah mach: NFS server ruble not responding still trying
- =>Nov 20 13:31:29 moolah mach: nfs_server: bad sendreply from 128.130.167.130
- =>Nov 20 13:31:36 moolah last message repeated 2 times
- =>Nov 20 13:31:42 moolah mach: NFS server mailhost not responding still trying
- =>Nov 20 13:31:44 moolah mach: nfs_server: bad sendreply from 128.130.167.130
- =>Nov 20 13:32:02 moolah mach: nfs_server: bad sendreply from 128.130.167.130
- =>Nov 20 13:33:42 moolah last message repeated 9 times
- =>
- =>The panic happened here (said something about ns_timeout table overflow).
- =>
- =>Nov 20 13:33:57 moolah syslogd: going down on signal 15
- =>Nov 20 13:34:51 moolah mach: Killing all processes NFS server ruble ok
- =>Nov 20 13:34:51 moolah mach:
- =>Nov 20 13:34:51 moolah mach: continuing
- =>Nov 20 13:34:51 moolah mach: unmounting / ... done
- =>Nov 20 13:34:51 moolah last message repeated 4 times
- =>Nov 20 13:34:51 moolah mach: unmounting /Users ... done
- =>Nov 20 13:34:51 moolah mach: unmounting /server ... done
- =>Nov 20 13:34:51 moolah mach: unmounting / ... done
- =>Nov 20 13:34:51 moolah mach: unmounting /nn ... done
- =>Nov 20 13:34:51 moolah mach: unmounting / ... done
- =>Nov 20 13:34:51 moolah mach: unmounting / ... done
- =>Nov 20 13:34:51 moolah mach: rebooting Mach...
- =>
- =>Does anybody have a hint what could be causing this?? It really starts to
- =>get annoying, and I have not the slightest idea what we did wrong.
- =>
- => Rainer
- =>--
- =>Rainer Staringer | rainer@fml.tuwien.ac.at
- =>Financial Markets Lab, TU Vienna | +43 (1) 58801/8138
-
- This sounds like the problem we were having here for a while. First, run "ps
- -aux" on your servers when your clients get the "NFS server not responding"
- message, and see which process is using the most CPU. My hunch is that
- lookupd might be the culprit. If that is indeed the case, then your problem
- most likely involves netgroups, if you are using them. Please post more
- information about your problem.
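
If you would rather not eyeball the "ps -aux" listing by hand, the check can be roughly sketched as a small Python script. This is only an illustration: the function name top_cpu_process and the SAMPLE listing are made up for the example (the sample is not real output from any of the machines discussed here), and it assumes a BSD-style header containing a %CPU column with COMMAND as the last column.

```python
def top_cpu_process(ps_output):
    """Return (cpu_percent, command) for the busiest process in a ps listing.

    Assumes a BSD-style header line that contains a %CPU column and ends
    with COMMAND. Columns that run together in real output are not handled.
    """
    lines = ps_output.strip().splitlines()
    header = lines[0].split()
    cpu_col = header.index("%CPU")
    cmd_col = len(header) - 1  # COMMAND is last and may contain spaces
    best = None
    for line in lines[1:]:
        # Split only up to the command column so the command stays whole.
        fields = line.split(None, cmd_col)
        if len(fields) <= cmd_col:
            continue  # skip short or malformed lines
        cpu = float(fields[cpu_col])
        if best is None or cpu > best[0]:
            best = (cpu, fields[cmd_col])
    return best


# Hypothetical listing, invented purely for illustration.
SAMPLE = """\
USER       PID  %CPU %MEM VSIZE RSIZE TT STAT  TIME COMMAND
root       102  87.3  1.2 1.52M  392K ?  R    12:41 /usr/etc/lookupd
root        88   1.1  0.9 1.30M  288K ?  S     0:07 (nfsd)
rainer     310   0.4  2.5 3.10M  808K p1 S     0:02 -csh (csh)
"""

if __name__ == "__main__":
    cpu, command = top_cpu_process(SAMPLE)
    print(f"busiest process: {command} at {cpu}% CPU")
```

To use it on a live server, capture the "ps -aux" output to a file while the clients are hanging and pass the file contents to top_cpu_process. If lookupd comes out on top, that supports the netgroups hunch; if something else does, post that instead.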
-
- When the server panicked, there should have been a reason for the panic in the
- little window. It is very helpful to post the panic messages so we have a
- better idea of what was going on.
-
- --
- Paul S. Sears * sears@uh.edu (NeXT Mail OK)
- The University of Houston * suggestions@tree.egr.uh.edu (NeXT
- Engineering Computing Center * comments, complaints, questions)
- NeXT System Administration * DoD#1967 '83 NightHawk 650SC
- >>> SSI Diving Certification #755020059 <<<
- "Programming is like sex: One mistake and you support it a lifetime."
-