NetNews Usenet Archive 1992 #27

Newsgroups: comp.software-eng
Path: sparky!uunet!zaphod.mps.ohio-state.edu!wupost!cs.uiuc.edu!marick
From: marick@cs.uiuc.edu (Brian Marick)
Subject: Re: Test Completion Criteria
Message-ID: <Bxyzuz.G2r@cs.uiuc.edu>
Organization: University of Illinois, Dept. of Comp. Sci., Urbana, IL
References: <1efq9dINN5ps@griffin.orpington.sgp.slb.com>
Date: Thu, 19 Nov 1992 15:47:22 GMT
Lines: 64

monteiro@edda.nl (Francisco Monteiro,207) writes:
> One of the most difficult question to answer when testing a program is
> determining when to stop, since there is no way of knowing if the
> error just detected is the last remaining error. What criteria is the
> common for industry, as the ones i know of are both meaningless and
> counterproductive.

They're all meaningless, mostly, but that doesn't mean they're
counterproductive.

1. Most common? Probably estimated completion times based on past
   experience with similar projects and similar people. Seems terribly
   imprecise, but can work well. Read Scott MacGregor's postings - he
   has good things to say on this and related topics.

2. Also common? Path-based coverage measures. 80% or 85% branch
   coverage is a typical stopping criterion.

   These measures are meaningless without context. Both I and a novice
   tester can reach 95% coverage, but I guarantee my tests will be
   better, because I know a lot more about test design. I use coverage
   as a stopping criterion, but the actual percentages are of only
   passing interest. Rather, I look carefully at each branch (etc.)
   that I missed, ask why my test design missed it, ask what else
   "nearby" my test design missed, and make a judgement about whether
   I need more tests (almost always deciding that I do).

3. Fault-based coverage measures, most particularly mutation coverage,
   can guarantee the absence of certain kinds of faults in your program
   (modulo the chance of your making a mistake detecting equivalent
   mutants).

   Mutation testing is somewhat expensive, and it doesn't offer
   guarantees about all other kinds of faults in your program. In
   particular, I don't know of any studies about how well a
   mutation-complete test suite detects faults of omission (to my mind
   the most important class of faults).

4. Statistical testing does offer a stopping criterion: stop when the
   predicted mean-time-to-failure reaches a threshold.

   The problem is that the prediction is relative to a particular input
   distribution. Capturing an input distribution is hard. Dealing with
   different users with different input distributions is hard. I've
   been hearing that statistical testing / reliability people are now
   fiercely grappling with these issues, and I'm optimistic that
   they'll improve the state of the practice.

5. Given requirements or specifications in a rigorous format, you can
   generate a set of tests by (for example) cause-effect testing rules.
   Implement those tests, and you're done. But the sufficiency of the
   generation procedure is certainly open to question.

6. You can seed "representative" errors and stop only when your tests
   have detected all seeded errors. However, like mutation testing, the
   kinds of errors you can practically seed are only a smallish subset
   of the total kinds of errors. Also, seeded errors are not
   necessarily as hard (or as easy) to detect as native errors. (At
   least one study has found that they're not.)

So: no hard and fast stopping criteria. A lot of life is like that.

Brian Marick, marick@cs.uiuc.edu, testing!marick@uunet.uu.net
Testing Foundations: Consulting, Training, Tools.
Freeware test coverage tool: see cs.uiuc.edu:pub/testing/GCT.README
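
A toy illustration of the mutation-coverage idea in point 3, sketched in Python. Everything in it is made up for the example: make_max, the three hand-written mutants, and the two test suites are hypothetical and do not come from any real mutation tool. It only shows the shape of the criterion: keep adding tests until every non-equivalent mutant is killed, then decide by hand whether the survivors are equivalent.

```python
import operator

def make_max(cmp):
    """Build a two-argument 'max' whose comparison operator is pluggable."""
    def max_of(a, b):
        return a if cmp(a, b) else b
    return max_of

# The program under test: a > b ? a : b
original = make_max(operator.gt)

# Hypothetical mutants: each replaces '>' with another operator.
mutants = {
    "gt->lt": make_max(operator.lt),
    "gt->ge": make_max(operator.ge),  # equivalent for ints: a == b gives the same value
    "gt->eq": make_max(operator.eq),
}

def passes(fn, cases):
    """True if fn returns the expected value for every (a, b, expected) case."""
    return all(fn(a, b) == expected for a, b, expected in cases)

def surviving(cases):
    """Names of mutants that the suite fails to distinguish from the original."""
    return sorted(name for name, m in mutants.items() if passes(m, cases))

weak_suite = [(1, 2, 2)]
better_suite = [(1, 2, 2), (2, 1, 2)]

print(surviving(weak_suite))    # ['gt->eq', 'gt->ge']
print(surviving(better_suite))  # ['gt->ge']
```

The second suite leaves only the "gt->ge" mutant alive. No test over integers can kill it, so a person has to recognize it as equivalent, which is exactly where the "modulo the chance of your making a mistake" caveat comes in. Note also that none of these mutants models a fault of omission.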