NetNews Usenet Archive 1992 #27

Internet Message Format | 1992-11-20 | 3.7 KB

Path: sparky!uunet!zaphod.mps.ohio-state.edu!menudo.uh.edu!menudo.uh.edu!usenet
From: sears@tree.egr.uh.edu (Paul S. Sears)
Newsgroups: comp.sys.next.sysadmin
Subject: Re: NFS problems
Date: 20 Nov 1992 15:29:27 GMT
Organization: University of Houston
Lines: 65
Message-ID: <1ej08nINNdrn@menudo.uh.edu>
References: <1992Nov20.125808.3461@email.tuwien.ac.at>
Reply-To: sears@tree.egr.uh.edu
NNTP-Posting-Host: thanatos.egr.uh.edu

In article <1992Nov20.125808.3461@email.tuwien.ac.at> rainer@ruble.fml.tuwien.ac.at (Rainer Staringer) writes:
=>I have an annoying problem that (so it seems to me) started after we
=>upgraded two of our NeXTs to 3.0 and started using one of them as an
=>NFS server.
=>
=>In random intervals (approx 1/day) all the machines in our little network
=>(2 NeXTs running 2.1, one of them serving /Users and /usr/spool/mail,
=>2 NeXTs running 3.0, one of them serving /LocalApps and /usr/local) will
=>hang with the 'NFS server xxx not responding' message. Sometimes the
=>problem goes away, sometimes you have to reboot the servers, sometimes
=>one of the machines will panic. I found the following in /usr/adm/messages
=>(ruble/mailhost/128.130.167.130 is the 2.1 server, moolah is the 3.0 server):
=>
=>Nov 20 13:31:25 moolah mach: NFS server ruble not responding still trying
=>Nov 20 13:31:29 moolah mach: nfs_server: bad sendreply from 128.130.167.130
=>Nov 20 13:31:36 moolah last message repeated 2 times
=>Nov 20 13:31:42 moolah mach: NFS server mailhost not responding still trying
=>Nov 20 13:31:44 moolah mach: nfs_server: bad sendreply from 128.130.167.130
=>Nov 20 13:32:02 moolah mach: nfs_server: bad sendreply from 128.130.167.130
=>Nov 20 13:33:42 moolah last message repeated 9 times
=>
=>The panic happened here (said something about ns_timeout table overflow).
=>
=>Nov 20 13:33:57 moolah syslogd: going down on signal 15
=>Nov 20 13:34:51 moolah mach: Killing all processes NFS server ruble ok
=>Nov 20 13:34:51 moolah mach:
=>Nov 20 13:34:51 moolah mach: continuing
=>Nov 20 13:34:51 moolah mach: unmounting / ... done
=>Nov 20 13:34:51 moolah last message repeated 4 times
=>Nov 20 13:34:51 moolah mach: unmounting /Users ... done
=>Nov 20 13:34:51 moolah mach: unmounting /server ... done
=>Nov 20 13:34:51 moolah mach: unmounting / ... done
=>Nov 20 13:34:51 moolah mach: unmounting /nn ... done
=>Nov 20 13:34:51 moolah mach: unmounting / ... done
=>Nov 20 13:34:51 moolah mach: unmounting / ... done
=>Nov 20 13:34:51 moolah mach: rebooting Mach...
=>
=>Does anybody have a hint what could be causing this?? It really starts to
=>get annoying, and I have not the slightest idea what we did wrong.
=>
=> Rainer
=>--
=>Rainer Staringer | rainer@fml.tuwien.ac.at
=>Financial Markets Lab, TU Vienna | +43 (1) 58801/8138

This sounds like the problem we were having here for a while. First, do a
"ps -aux" on your servers when your clients get the "NFS server not
responding" message. See which process is using the most CPU. My hunch is
that lookupd might be the culprit. If this is indeed the case, then your
problem most likely involves netgroups, if you are using them.

Please post more information about your problem. When the server panicked,
there should have been a reason for the panic in the little window. It is
very helpful to post the panic messages so we have a better idea of what was
going on.

--
Paul S. Sears * sears@uh.edu (NeXT Mail OK)
The University of Houston * suggestions@tree.egr.uh.edu (NeXT
Engineering Computing Center * comments, complaints, questions)
NeXT System Administration * DoD#1967 '83 NightHawk 650SC
>>> SSI Diving Certification #755020059 <<<
"Programming is like sex: One mistake and you support it a lifetime."
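
[Editor's illustration, not part of the original post.] The "see which process is using the most CPU" step can be sketched as a small script that ranks the rows of captured "ps -aux" output by the %CPU column. This is only a sketch under stated assumptions: the function name top_cpu and the SAMPLE listing are hypothetical (the sample is invented for illustration, not taken from the poster's machines), and the column layout is assumed to be BSD-style with %CPU and PID headers and the command in the last column.

```python
def top_cpu(ps_output, n=3):
    """Return the n busiest processes as (cpu_percent, pid, command) tuples.

    Assumes BSD-style `ps -aux` text: one header row containing "%CPU" and
    "PID", with the command in the final column.
    """
    lines = [ln for ln in ps_output.splitlines() if ln.strip()]
    header = lines[0].split()
    cpu_col = header.index("%CPU")
    pid_col = header.index("PID")
    rows = []
    for ln in lines[1:]:
        # Split into at most len(header) fields so that a command
        # containing spaces stays together in the last field.
        fields = ln.split(None, len(header) - 1)
        if len(fields) < len(header):
            continue  # skip truncated rows
        try:
            cpu = float(fields[cpu_col])
            pid = int(fields[pid_col])
        except ValueError:
            continue  # skip rows whose numeric columns do not parse
        rows.append((cpu, pid, fields[-1].strip()))
    # Highest CPU percentage first.
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows[:n]


# Hypothetical sample listing, invented for illustration only.
SAMPLE = """\
USER       PID  %CPU %MEM VSIZE RSIZE TT STAT  TIME COMMAND
root        98   1.2  0.9 1.20M  280K ?  S     0:12 (nfsd)
root       112  87.5  2.1 1.52M  664K ?  R    41:07 (lookupd)
rainer     301   0.4  3.0 2.10M  960K p1 S     0:03 -csh (csh)
"""

if __name__ == "__main__":
    for cpu, pid, command in top_cpu(SAMPLE):
        print(cpu, pid, command)
```

To use it on a live server, save the output of "ps -aux" to a file while the clients are reporting the hang and pass the file's contents to top_cpu; if lookupd sits at the top of the list, that would be consistent with the hunch above.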