- Path: sparky!uunet!elroy.jpl.nasa.gov!news.claremont.edu!nntp-server.caltech.edu!SOL1.GPS.CALTECH.EDU!CARL
- From: carl@SOL1.GPS.CALTECH.EDU (Carl J Lydick)
- Newsgroups: comp.os.vms
- Subject: Re: Mount verification timed out
- Date: 30 Dec 1992 12:38:22 GMT
- Organization: HST Wide Field/Planetary Camera
- Lines: 79
- Distribution: world
- Message-ID: <1hs57uINNnqs@gap.caltech.edu>
- References: <1992Dec29.152302.1716@das.harvard.edu>,<1992Dec29.105010.1@slacvx.slac.stanford.edu>
- Reply-To: carl@SOL1.GPS.CALTECH.EDU
- NNTP-Posting-Host: sol1.gps.caltech.edu
-
- In article <1992Dec29.105010.1@slacvx.slac.stanford.edu>, fairfield@slacvx.slac.stanford.edu writes:
- =In article <1992Dec29.152302.1716@das.harvard.edu>, chen@speed.uucp (Lilei Chen) writes:
- => In our VAX/VMS-cluster a bunch of disks are cross mounted. Sometimes when
- => a node goes down, its disks are marked mount verification timed out. I
- => haven't found a way to get those disks remounted without rebooting the
- => nodes. I am wondering if someone on the net has a solution for that problem.
- =
- = For each node in the cluster for which the disk(s) have gone into
- =mount verification time-out, do:
- =
- = $ DISMOUNT/ABORT device
- =
- =You can do this most easily from SYSMAN after doing
- =
- = SYSMAN> SET ENVIRONMENT/NODE=(NODE1,NODE2,...,NODEN)
- =
- =where the node list, above, includes all/only those nodes needing to
- =remount the disk(s). Follow the DISMOUNT with a MOUNT (with appropriate
- =parameters). Note that it _is_ allowed to do a second MOUNT/CLUSTER of
- =the device in question from its local host (and perhaps from any node in
- =the cluster, I haven't tried): all nodes for which the device is already
- =mounted ignore the request, while nodes that don't have the device mounted
- =will mount it.
- =
- = As another followup pointed out, if there are any open channels to the
- =disk(s) that "went away", they must be closed before the disk(s) can be
- =dismounted. That's why you need to use /ABORT. I am not sure of the
- =current situation (VMS V5.5-1), but in earlier versions of VMS, if you
- =issued a DISMOUNT to the device _without_ the /ABORT qualifier, the disk
- =would NOT dismount successfully, even if you subsequently issued a
- =DISM/ABORT (again, this is most likely dependent on having "open" files on
- =the device in question from the node where you're attempting the dismount). I
- =recall getting devices into a state of "Mount/Dismount"! The only way of
- =recovering those devices for _that_ node was to reboot. :-(
-
- At last! An intelligent and responsible answer to the question. Several other
- responses recommended actions that would've only made matters worse.
-
- PLEASE: WHEN ANSWERING QUESTIONS IN THIS NEWSGROUP, DON'T GIVE ANSWERS
- UNLESS YOU'VE TESTED THEM!
-
- I've held off on answering because, though I thought I knew the answer, I knew
- that some actions would certainly make things worse (once you've got a disk in
- the state "Mounted, marked for dismount" you can be in *REAL* trouble; there's
- no simple way to figure out who's got what open; at best, you can use
- ANALYZE/SYSTEM on each node in your cluster and check EVERY process's channel
- listings to figure out who's got channels open).
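-
- As a rough sketch of that per-node check (the SHOW PROCESS qualifiers here
- are written from memory and should be treated as an assumption; confirm them
- with HELP inside SDA on your VMS version):
-
- $ ANALYZE/SYSTEM
- SDA> SHOW PROCESS/CHANNEL ALL
- SDA> EXIT
-
- Repeat this on every node and look for channels assigned to the affected disk.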
-
- FWIW, here's what the HELP utility (those of you who gave harmful answers HAVE
- heard of HELP, haven't you?) has to say about DISMOUNT/ABORT:
-
- $ HELP DISMOUNT/ABORT
-
- DISMOUNT
-
- /ABORT
-
- Requires volume ownership or the user privilege VOLPRO (volume
- protection) to use this qualifier with a volume that is mounted
- neither group nor system.
-
- Specifies that the volume is to be dismounted, regardless of who
- mounted it. The primary purpose of the /ABORT qualifier is to
- terminate mount verification. The DISMOUNT/ABORT command also
- cancels any outstanding I/O requests. If the volume was mounted
- with the /SHARE qualifier, the /ABORT qualifier causes the volume
- to be dismounted for all of the users who mounted it.
-
- The idiots who suggested using DISMOUNT without the /ABORT qualifier apparently
- didn't know (or care) that DISMOUNT without that qualifier can put the disk
- into a state where outstanding I/O requests *CANNOT* be canceled.
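-
- To pull the quoted procedure together, here is a minimal sketch of the SYSMAN
- session. The node names and "device" are placeholders, as in the quoted text;
- "volume-label" and the /SYSTEM qualifier on MOUNT are assumptions standing in
- for whatever the "appropriate parameters" are at your site.
-
- $ RUN SYS$SYSTEM:SYSMAN
- SYSMAN> SET ENVIRONMENT/NODE=(NODE1,NODE2)
- SYSMAN> DO DISMOUNT/ABORT device
- SYSMAN> DO MOUNT/SYSTEM device volume-label
- SYSMAN> EXIT
-
- SYSMAN's DO command runs the given DCL command on each node in the current
- environment, so every listed node aborts mount verification before it is
- asked to remount the disk.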
- --------------------------------------------------------------------------------
- Carl J Lydick | INTERnet: CARL@SOL1.GPS.CALTECH.EDU | NSI/HEPnet: SOL1::CARL
-
- Disclaimer: Hey, I understand VAXen and VMS. That's what I get paid for. My
- understanding of astronomy is purely at the amateur level (or below). So
- unless what I'm saying is directly related to VAX/VMS, don't hold me or my
- organization responsible for it. If it IS related to VAX/VMS, you can try to
- hold me responsible for it, but my organization had nothing to do with it.
-