- Newsgroups: comp.os.vms
- Path: sparky!uunet!mcsun!news.funet.fi!aton.abo.fi!usenet
- From: HEGE@FINABO.ABO.FI (Kaj Häggman DC)
- Subject: Summary: RECNXINTERVAL
- Message-ID: <1992Dec28.082250.3770@abo.fi>
- Sender: usenet@abo.fi (Usenet NEWS)
- Organization: Abo Akademi University, Finland
- Date: Mon, 28 Dec 1992 08:22:50 GMT
- X-News-Reader: VMS NEWS 1.24
- Lines: 86
-
- Hi!
-
- I sent the following to the net a couple of weeks ago
-
- #Every now and then there are some "bursts" on our ethernet.
- #The SUN workstations don't give a s*it, but the VAXstations keep shouting
- #to each other quite a lot and eventually reboot.
- #The value for RECNXINTERVAL is 120 in our cluster. How far up could
- #I crank it, i.e. what impact could it have on the MI-cluster?
- #Any other "time-out"-parameters that are worth checking out?
- #As I'm not responsible for our network I won't go into that (waiting for
- #the move towards routing within the net), but I'd really appreciate any
- #ideas concerning the VAXes. Thanx!
-
- and got 8 answers.
-
- 6 people had RECNXINTERVAL set to 120 like me, 1 had it set to 180 and
- 1 didn't mention any value for it.
-
- Here is a short description of the impact of raising the value of
- RECNXINTERVAL:
- With a long RECNXINTERVAL, if a node crashes or hangs while holding a lock
- on a critical resource (the UAF file, for example), it takes RECNXINTERVAL
- seconds before the other nodes determine that the failed node should be
- removed from the cluster. Any application that needs the resource hangs in
- the meantime, which could stall the entire cluster for RECNXINTERVAL seconds.
- But that may be better than having all your LAVC nodes crash with a
- CLUEXIT during a network storm.
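-
- To make the trade-off concrete, here is a rough Python sketch. It is only a
- toy model, not how the VMS connection manager really works: it assumes that
- an outage shorter than RECNXINTERVAL ends in a reconnect, and that a longer
- one ends with the cluster giving up on the connection after RECNXINTERVAL
- seconds. The names outage_outcome and Outcome are made up for illustration.
-
```python
from dataclasses import dataclass


@dataclass
class Outcome:
    reconnected: bool     # True if the connection came back before the timeout
    stall_seconds: float  # how long requests for a held resource may have waited


def outage_outcome(outage_seconds: float, recnxinterval: int) -> Outcome:
    """Toy model of a lost cluster connection (an assumption, not VMS internals).

    If the outage ends before RECNXINTERVAL expires, the nodes reconnect and
    the stall lasts only as long as the outage. Otherwise the cluster stops
    waiting after RECNXINTERVAL seconds and reconfigures.
    """
    if outage_seconds < recnxinterval:
        return Outcome(reconnected=True, stall_seconds=outage_seconds)
    return Outcome(reconnected=False, stall_seconds=float(recnxinterval))


if __name__ == "__main__":
    # A hypothetical 45-second network burst under three settings.
    for interval in (20, 120, 180):
        print(interval, outage_outcome(45, interval))
```
-
- In this model a 45-second burst is survived at 120 or 180 but not at 20,
- while a node that really has crashed blocks its resources for the full
- interval, whatever the setting.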
-
- It seems like I've checked out most of the parameters that are worth checking
- out. For those of you who'd like to experiment with cluster parameters, also
- check out QDSKINTERVAL, PRCPOLINTERVAL, PASTIMOUT, PAPOLLINTERVAL (as Ehud
- Gavron mentioned).
-
- I also got many comments about our net, having it fixed first, and so on.
-
- It seems to be quite common that local area networks built and configured
- at a time when there were no workstations around (just a few hosts and
- dumb terminals) just can't cope with the added load. Then the problem just
- gets worse as new hosts are added. Routing seems to be a good solution.
-
- Carl Karcher also told me these things about LAT/LAST:
- Careful, the LAVC protocol is not routable. However, the Ciscos can be set up
- to selectively bridge protocols that can't be routed (like LAVC, LAT, MOP and
- LAST). We don't do that here since our network police don't allow selective
- bridging. One more thing, be sure your Ciscos have the latest firmware. We
- just discovered a problem where the router passed corrupted packets as good
- packets, which was affecting Novell and NFS traffic. Here's a brief description:
-
- [A bug in the Cisco interface firmware caused it to ignore CRC errors.
- A corrupted packet received on the router's Ethernet interface would
- have its CRC recomputed and would be forwarded toward its
- destination. So the next node receiving the packet would not be able
- to detect the corruption.
-
- Ethernet devices (with the exception of ethernet analyzer equipment)
- are supposed to drop packets containing CRC errors. Higher layer
- protocols then trigger retransmission of dropped packets. Some
- protocols like TCP have an end-to-end checksum so that even if a
- corrupted packet manages to get through, the destination machine can
- detect the corruption. The Netware protocol assumes that corrupted
- packets will be dropped and that errors created by intermediate nodes
- are extremely rare, so it doesn't include an end-to-end checksum.]
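-
- The failure described above can be reproduced in a few lines of Python. This
- is a sketch under stated assumptions, not the actual Cisco firmware logic:
- zlib.crc32 stands in for the Ethernet frame check sequence, a simple byte sum
- stands in for an end-to-end checksum (it is not TCP's real algorithm), and
- the function names are invented for the example.
-
```python
import zlib
from typing import Optional


def frame(payload: bytes) -> bytes:
    """Append a CRC-32 trailer (a stand-in for the Ethernet FCS)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def fcs_ok(framed: bytes) -> bool:
    """Check that the trailer matches the payload."""
    payload, trailer = framed[:-4], framed[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer


def corrupt(framed: bytes) -> bytes:
    """Simulate line noise by flipping one bit in the first payload byte."""
    return bytes([framed[0] ^ 0x01]) + framed[1:]


def good_router(framed: bytes) -> Optional[bytes]:
    """Drop frames with a bad CRC; re-frame good ones for the next hop."""
    if not fcs_ok(framed):
        return None
    return frame(framed[:-4])


def buggy_router(framed: bytes) -> bytes:
    """Skip the CRC check and recompute the trailer over whatever arrived."""
    return frame(framed[:-4])


def e2e_checksum(data: bytes) -> int:
    """Toy end-to-end checksum computed by the sender over the original data."""
    return sum(data) & 0xFFFF


if __name__ == "__main__":
    original = b"hello"
    damaged = corrupt(frame(original))
    forwarded = buggy_router(damaged)
    print("good router forwards:", good_router(damaged))
    print("buggy router output passes link CRC:", fcs_ok(forwarded))
    print("end-to-end checksum still matches:",
          e2e_checksum(forwarded[:-4]) == e2e_checksum(original))
```
-
- The next hop sees a valid CRC on the forwarded frame, so only a check
- computed by the original sender can still reveal the damage.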
- ......
- The infoserver uses the LAST protocol, which is also not routable. Pathworks
- disk and file services use LAST too (but file services can use DECnet). If you
- use infoservers for holding bookreader documentation CDs, DECnet can be used
- (to a node that has the DAD disks mounted) instead of mounting DADn disks.
-
- I'm not sure if I can say that "nobody uses the default value of 20 for
- RECNXINTERVAL" (based on my own experience and 7 answers...), but it seems
- like many people have set it a little higher.
-
- Many thanks to George Burns
- Ehud Gavron
- Carl Karcher
- Tom Miller
- Malcolm Newman
- Jeff Rossiter
- Frank Shorter
- Erik Sosman
-
- Kaj Haggman Internet: Hege@abo.fi Phone: +358-21-654467
- Abo Akademi University Bitnet: Hege@finabo FAX: +358-21-654497
- Computing Center PSImail: 22101410::HEGE
- SF-20500 Abo, FINLAND X.400: s=hege o=abo prmd=inet admd=fumail c=fi
-