- Xref: sparky vmsnet.sysmgt:451 vmsnet.networks.misc:150 comp.os.vms:22071
- Path: sparky!uunet!stanford.edu!ames!saimiri.primate.wisc.edu!sdd.hp.com!swrinde!emory!gatech!destroyer!cs.ubc.ca!van-bc!jeslacs!psmode
- From: Peter Smode <psmode@jeslacs.wimsey.bc.ca>
- Newsgroups: vmsnet.sysmgt,vmsnet.networks.misc,comp.os.vms
- Subject: Re: Lost LAT sessions
- Message-ID: <L28ZXB1w165w@jeslacs.wimsey.bc.ca>
- Date: Tue, 26 Jan 93 13:54:32 PST
- Organization: JES Library Automation, Coquitlam BC, CANADA
- Lines: 179
-
- This message is being cross-posted to vmsnet.sysmgt, comp.os.vms and
- vmsnet.networks.misc. Please post responses in vmsnet.sysmgt or e-mail
- the author.
-
- People from WordPerfect Corp. are encouraged to read and respond to
- this post (see below).
-
- It appears that the problem I previously described is a lot more
- widespread than I thought. I have received a number of responses to my
- query, many from sites that are experiencing this problem now or have
- experienced it in the past. For this reason, I would like to return this
- discussion to the group.
-
- I summarize the problem again along with some of the latest information
- collected.
-
- From time to time, one or more interactive terminal sessions on LTAxxxx:
- terminals get "lost in space". We have not been able to reproduce this
- problem in a controlled fashion. However, the affected sessions show ALL
- of the following symptoms:
-
-
- - A SHOW USERS FULL will be missing the (server/port) address for the
- LTAxxx: terminal
-     $ SH US /FU SEXSMITH
-           VAX/VMS User Processes at 25-JAN-1993 16:45:48.57
-         Total number of users = 1,  number of processes = 1
-
-      Username     Node   Process Name   PID        Terminal
-      SEXSMITH_M   VAXB   SEXSMITH_M     20601743   LTA6716:
-
- - The process will INHALE all available CPU time, effectively turning the
- system into a dog (it turns out the jobs are NOT at elevated priority; my
- memory had failed me). Doing a SHOW PROCESS on the job shows that it is
- locked in some sort of CPU loop, doing no I/O whatsoever; the job state is
- always COM.
-
- - The application running is one written and supported by us. We have seen
- two applications involved here, both written in VAX BASIC. One application
- uses SMG exclusively for terminal I/O; the other simply invokes the PRINT
- and INPUT verbs built into the VAX BASIC language.
-
- - We have no reliable reports of what the physical terminal is displaying
-
- - SHOW ERROR reveals nothing
-
- - Happens for both direct and dialup connections on any port, any server,
- DEC and non-DEC. One site uses a Vitalink bridge (with the problem appearing
- on both sides); the other has nothing but a single Ethernet segment and a
- number of terminal servers.
-
- - The problem tends to happen during peak hours. We cannot recall, but
- cannot rule out, occurrences during off-peak hours.
-
- - A SHOW PORT LTAxxx command from LATCP shows that the port is inactive
- and does not show the server/port name for the device. Even more
- interesting is that the port type is indicated as 'Forward', with no
- target service or actual service indicated. This port started out as an
- interactive port!
-
- - Server error counters do not seem to indicate trouble. A SHOW PORT COUNT
- command on the server indicates some framing errors, but not an excessive
- number:
-     Current Counters for Port 19
-
-     Seconds since zeroed: 1780534 (20 14:35:34)
-
-     Port Statistics:              Port Errors:
-     Local Accesses:       74      Framing:    71
-     Remote Accesses:       0      Parity:      0
-     Inactivity Logoff:     0      Overrun:     0
-     Password Logoff:       0
-
- - NCP counters do not show anything out of the ordinary. A SHOW EXECUTOR
- COUNT display for one instance follows:
- Node Counters as of 25-JAN-1993 16:45:35
-
- Executor node = 1.11 (VAXB)
-
- >65534 Seconds since last zeroed
- 1448613 Bytes received
- 1448396 Bytes sent
- 31035 Messages received
- 31243 Messages sent
- 208 Connects received
- 208 Connects sent
- 3 Response timeouts
- 0 Received connect resource errors
- 11 Maximum logical links active
- 0 Aged packet loss
- 0 Node unreachable packet loss
- 0 Node out-of-range packet loss
- 0 Oversized packet loss
- 0 Packet format error
- 0 Partial routing update loss
- 0 Verification reject
-
-
- **************************************************
- In response to my query I have received the following:
-
- From: "FLOWERS HARRY" <FLOWERS@memstvx1.memst.edu>
- Subject: Re: Lost LAT sessions
- To: "psmode" <psmode@jeslacs.wimsey.bc.ca>
-
- We've had the same problems. Typically with SMG-based applications,
- but also happens with LISP. I've got a command procedure that hunts
- them down and kills them. I'll include it at the end.
-
- >So far, DEC has sent the latest patch kits and had us install them, but
- >to no avail.
-
- Yea, we went around with DEC on this, but they've decided it's a feature.
- Evidently, if you try to do I/O to a LAT port after it's disconnected, this
- happens. They suggested checking all return statuses carefully, and avoiding
- any I/O to a port once you get a disconnect error. Evidently, SMG isn't
- handling this correctly. They've got a fix for VAX LISP to keep it from
- happening there. If you ever get them to admit that there's actually a
- problem with either SMG or LAT software in this regard, please let me know.
-
- --
- Harry Flowers Internet: FLOWERS@MEMSTVX1.MEMST.EDU
- Memphis State University & Bitnet: FLOWERS@MEMSTVX1
-
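The advice above (check every return status and stop doing I/O once a disconnect is reported) can be illustrated with a small simulation. This is only a sketch in Python, not VMS code and not taken from any of the applications discussed: `Status`, `FakeTerminal`, `naive_loop` and `careful_loop` are made-up names, and `Status.HANGUP` merely stands in for whatever disconnect status the real terminal driver returns. The `max_reads` cap exists only so the demonstration terminates; a real looping process would have no such limit.

```python
from enum import Enum


class Status(Enum):
    NORMAL = "normal"
    HANGUP = "hangup"  # hypothetical stand-in for a disconnect status


class FakeTerminal:
    """Simulated terminal: returns queued lines, then reports HANGUP forever."""

    def __init__(self, lines):
        self._lines = list(lines)
        self.reads = 0  # number of read attempts made against this terminal

    def read(self):
        self.reads += 1
        if self._lines:
            return Status.NORMAL, self._lines.pop(0)
        return Status.HANGUP, None


def naive_loop(term, max_reads=1000):
    """Ignores the returned status and retries immediately.

    After the hangup every read fails at once, so the loop spins without
    doing any real I/O. It is capped at max_reads only for this demo.
    """
    got = []
    while term.reads < max_reads:
        status, line = term.read()
        if status is Status.NORMAL:
            got.append(line)
        # Any other status is silently ignored and the read is retried.
    return got


def careful_loop(term):
    """Stops all terminal I/O as soon as a non-normal status is returned."""
    got = []
    while True:
        status, line = term.read()
        if status is not Status.NORMAL:
            break  # disconnect seen: give up on this terminal
        got.append(line)
    return got


if __name__ == "__main__":
    t1 = FakeTerminal(["a", "b"])
    naive_loop(t1)
    t2 = FakeTerminal(["a", "b"])
    careful_loop(t2)
    print("naive reads:", t1.reads, "careful reads:", t2.reads)
```

Both loops receive the same two lines of input; the difference is that the careful version makes exactly one failed read and stops, while the naive version keeps issuing reads for as long as it is allowed to run.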
- **************************
- From: grant%mighty.dnet@gw.wmich.edu (NORM GRANT)
- To: psmode@jeslacs.wimsey.bc.ca
- Subject: LAT looping?
-
- I don't know if this relates to your problem or not, but we had a version
- of WordPerfect for VMS which did this if a user hung up the phone while
- in it. It was a MAJOR nuisance. Apparently the program attempted to
- trap the hang up or forced exit condition and went crazy when the terminal
- was gone. You could have a similar problem, possibly with different
- software.
-
- -------------
- Norman D. Grant INTERNET: grant@gw.wmich.edu
- Western Michigan University Voice: (616) 387-5430
- University Computing Services
- Kalamazoo, MI 49008
-
- **************************
- From: SYSMGR@bigvax.alfred.edu (Jim Walker)
- Message-Id: <930126125507.30a0373a@bigvax.alfred.edu>
- Subject: LAT sessions getting lost
- To: @jeslacs.wimsey.bc.ca, psmode@jeslacs.wimsey.bc.ca
- X-Vmsmail-To: SMTP%"psmode@jeslacs.wimsey.bc.ca"
-
- I have encountered that problem. What happens to me is a user logs out of
- the terminal server and VMS doesn't just delete the process any more. It
- aborts all $QIOs to the terminal with SS$_HANGUP or something like that.
- Applications are supposed to detect the fatal error and exit, but some just
- keep retrying and sucking up CPU time. SHOW DEVICE LTAxxx: reports that
- the device is offline. Now I run a detached process that wakes up once a
- minute and does $DEVICE_SCAN and $GETJPI looking for offline LTAxxx:s and
- deletes processes. The Fortran program follows my .SIG. It's overly
- complicated because at the time I wrote it I thought it was going to
- evolve into a general idle process killer.
-
- Jim Walker
- VAX System & Network manager, Alfred University Computer Center,
- Alfred, NY 14802 USA +1-607-871-2222, Using VAX/VMS 5.4-3
- <SYSMGR@bigvax.alfred.edu>, SYSMGR@CERAMICS.bitnet, WALKER@ALFREDU.bitnet
-
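The once-a-minute sweep described in the message above can be sketched as follows. This is not the Fortran program the message refers to; it is a small Python simulation in which `Device`, `Process` and `orphaned_pids` are made-up names, and the two input lists stand in for whatever $DEVICE_SCAN and $GETJPI would report on a real system. Only the selection logic is shown: find LTA devices that are offline, then find the processes whose terminal is one of them.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str     # e.g. "LTA6716:"
    online: bool  # False once the LAT session has gone away


@dataclass
class Process:
    pid: int
    terminal: str  # device name of the controlling terminal


def orphaned_pids(devices, processes):
    """Return PIDs of processes whose terminal is an offline LTA device."""
    offline = {
        d.name for d in devices
        if d.name.startswith("LTA") and not d.online
    }
    return [p.pid for p in processes if p.terminal in offline]


if __name__ == "__main__":
    # Illustrative data only; the second and third entries are invented.
    devices = [
        Device("LTA6716:", online=False),
        Device("LTA6717:", online=True),
        Device("TXA0:", online=False),  # not a LAT device, so ignored
    ]
    processes = [
        Process(0x20601743, "LTA6716:"),
        Process(0x20601744, "LTA6717:"),
        Process(0x20601745, "TXA0:"),
    ]
    for pid in orphaned_pids(devices, processes):
        print(f"would delete process {pid:08X}")
```

A real watchdog would repeat this pass on a timer and delete each process it finds; restricting the match to LTA devices keeps it from touching sessions on other terminal types.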
- **************************************************
-
-
- Any insight from WPCorp would be appreciated. Also, could somebody check
- the fiche to see whether there is any common element to the $QIOs that
- are triggered by calling input routines from SMG, VAX BASIC and LISP?
-
-
- -- Peter
- **************************************************************************
- * Peter Smode E-mail: psmode@jeslacs.wimsey.bc.ca *
- * JES Library Automation Voice: (604)939-6775 *
- * Coquitlam, BC, CANADA Fax: (604)939-9970 *
- **************************************************************************
-