NetNews Usenet Archive 1992 #27

Text File | 1992-11-20 | 4.2 KB | 90 lines

Newsgroups: comp.sys.isis
Path: sparky!uunet!caen!batcomputer!cornell!ken
From: ken@cs.cornell.edu (Ken Birman)
Subject: How do we test Isis applications
Message-ID: <1992Nov20.151406.10383@cs.cornell.edu>
Organization: Cornell Univ. CS Dept, Ithaca NY 14853
Date: Fri, 20 Nov 1992 15:14:06 GMT
Lines: 80

I am sure people will find this interesting:

> From es_teo@rcvie.co.at Fri Nov 20 05:50:51 1992
> Date: Fri, 20 Nov 92 11:50:46 +0100
> From: es_teo@rcvie.co.at (Dan Teodosiu)
> To: ken@cs.cornell.edu
> Subject: ISIS question
> Cc: es_teo@rcvie.co.at
>
> Dear Dr. Birman,
>
> I am currently investigating the possibilities for using ISIS as a basic
> layer for a distributed expert systems environment. While reading your
> papers on ISIS, I was wondering how you perform system testing - do you
> have any environment which allows you to simulate the various conditions
> which may occur (delays, crashes, etc.)? Or do you simply use a number of
> "test applications"?
>
> Kind regards,
> Dan Teodosiu
>
> -------------------------------------------------------------------------------
> Alcatel - ELIN Research Centre          RISC (Research Institute for
> A-1210 Wien, Ruthnergasse 1-7                 Symbolic Computation)
>                                         Schloss Hagenberg
> Voice: +43-1-391621-350                 A-4232 Hagenberg i.M.
> Fax:   +43-1-391452
> E-mail: Dan.Teodosiu@rcvie.co.at        E-mail: danteo@risc.uni-linz.ac.at
> -------------------------------------------------------------------

To answer your question, let me first assume that you work under Lisp.
We do have a Common Lisp version of Isis (Lucid or Allegro), which works
much as you would expect from Lisp -- you can unwind stacks and so forth,
and as much as possible Isis tries to undo whatever it did in getting to
the place where a problem occurred.

For non-Lisp applications, we do a mixture of things. First, because
the Isis execution environment is so simple ("closely synchronous"), it
is easy to merge logs printed by processes in the system.
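As an illustration of the log-merging idea, here is a minimal sketch in
Python. It is not Isis code. It assumes every process writes records of
the hypothetical form "<seconds> <process> <message>" and that each log
is already in time order; the names parse and merge_logs are made up for
this example. Timestamps stand in for whatever ordering information the
real logs carry.

```python
import heapq


def parse(line):
    # Hypothetical record format (an assumption, not an Isis format):
    #   "<seconds> <process-name> <message>"
    stamp, proc, msg = line.rstrip("\n").split(" ", 2)
    return float(stamp), proc, msg


def merge_logs(*logs):
    """Merge per-process logs into one time-ordered list of records.

    Each argument is an iterable of lines (an open file works) that is
    already sorted by timestamp, so a k-way merge is enough.
    """
    streams = [(parse(line) for line in log) for log in logs]
    # Records with equal timestamps keep the order of the input logs.
    return list(heapq.merge(*streams, key=lambda rec: rec[0]))


if __name__ == "__main__":
    log_a = ["1.0 A multicast m1", "3.0 A got reply"]
    log_b = ["2.0 B received m1", "2.5 B sent reply"]
    for stamp, proc, msg in merge_logs(log_a, log_b):
        print(f"{stamp:6.1f} {proc} {msg}")
```

Reading the merged output top to bottom gives one interleaved history of
what each process printed, which is the view the debug printouts are
meant to provide.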
So, debug printouts into log files are very useful -- much more than in
other systems, where you couldn't easily match them up.

A second tool is the "cmd snap" feature, which makes an Isis state dump
showing what was happening at that instant in all parts of your
applications. This lists all the active tasks, what multicasts they are
doing, what replies are needed, etc.

A third tool is the debugger itself. However, we have one problem with
the debugger: if you stop an Isis task for long, the system thinks the
program has crashed! We are adding a "being debugged" option now, but
for the next few months Isis will continue to have the feature that
stopping a program for more than about 45 seconds causes Isis to drop it
from the system. We have to do this to keep the rest of the system
live -- the core problem is that the UNIX debugger doesn't leave any
indication around that we can use to detect that dbx or gdb has stopped
the process, so we have no way to distinguish a dead process from one in
trace-wait.

Finally, for testing, we tend to do exhaustive runs (often for days at a
time) with very demanding work loads, to test code coverage. We are
looking at path coverage tools and we use things like Purify all the
time. By now, we have a large application set and these really put Isis
through a wringer. A shame we don't have a better way to do it...

We have started to add performance tests too, since we recently got
burned by this change in TCP delay functionality which we hadn't
noticed, as well as the Isis bug with "x" mode multicast in overlapping
groups. Of course, there are an unlimited number of things one might
test and we can't cover everything. But, we have steadily increased the
coverage of our tests for 18 months now.

To simulate crashes, we just kill things. Some of our test scripts kill
something every second (actually, every <random number of seconds>) for
2 or 3 days.
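A crash-injection script of this kind might look roughly like the Python
sketch below. It is an illustration only, not one of the Isis test
scripts. The function kill_loop, its parameters, and the get_pids
callback (which should return the process IDs currently eligible to be
killed) are all hypothetical, and restarting the victims is left to the
surrounding test harness.

```python
import os
import random
import signal
import time


def kill_loop(get_pids, duration, max_gap=5.0, rng=None,
              kill=os.kill, sleep=time.sleep, clock=time.monotonic):
    """Kill one randomly chosen process at random intervals.

    get_pids -- hypothetical callback returning a list of candidate PIDs
    duration -- how long to keep going, in seconds
    max_gap  -- upper bound on the random wait between kills, in seconds
    Returns the list of PIDs that were killed, in order.
    """
    rng = rng or random.Random()
    killed = []
    deadline = clock() + duration
    while clock() < deadline:
        sleep(rng.uniform(0.0, max_gap))   # wait a random number of seconds
        pids = get_pids()
        if not pids:
            continue                       # nothing running right now
        victim = rng.choice(pids)
        try:
            kill(victim, signal.SIGKILL)   # abrupt death, no cleanup (Unix)
            killed.append(victim)
        except ProcessLookupError:
            pass                           # it exited on its own first
    return killed
```

For a run of the length described, duration would be about
2 * 24 * 3600 seconds. The kill, sleep and clock parameters exist only
so that the loop can be exercised without touching real processes.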
We cross our fingers and hope that if we can survive this, we can
survive most real-world situations. So far, this is usually true.

--
Kenneth P. Birman                              E-mail: ken@cs.cornell.edu
4105 Upson Hall, Dept. of Computer Science     TEL: 607 255-9199 (office)
Cornell University Ithaca, NY 14853 (USA)      FAX: 607 255-4428