- Newsgroups: comp.software-eng
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!wupost!cs.uiuc.edu!marick
- From: marick@cs.uiuc.edu (Brian Marick)
- Subject: Re: Test Completion Criteria
- Message-ID: <Bxyzuz.G2r@cs.uiuc.edu>
- Organization: University of Illinois, Dept. of Comp. Sci., Urbana, IL
- References: <1efq9dINN5ps@griffin.orpington.sgp.slb.com>
- Date: Thu, 19 Nov 1992 15:47:22 GMT
- Lines: 64
-
- monteiro@edda.nl (Francisco Monteiro,207) writes:
-
- > One of the most difficult questions to answer when testing a program
- > is determining when to stop, since there is no way of knowing if the
- > error just detected is the last remaining error. What criteria are
- > common in industry? The ones I know of are both meaningless and
- > counterproductive.
-
- They're all meaningless, mostly, but that doesn't mean they're
- counterproductive.
-
- 1. Most common? Probably estimated completion times based on past
- experience with similar projects and similar people. Seems terribly
- imprecise, but can work well. Read Scott MacGregor's postings - he
- has good things to say on this and related topics.
-
- 2. Also common? Path-based coverage measures. 80% or 85% branch
- coverage is a typical stopping criterion. These measures are
- meaningless without context. Both I and a novice tester can reach 95%
- coverage, but I guarantee my tests will be better, because I know a
- lot more about test design. I use coverage as a stopping criterion,
- but the actual percentages are of only passing interest. Rather, I
- look carefully at each branch (etc.) that I missed, ask why my test
- design missed it, ask what else "nearby" my test design missed, and
- make a judgement about whether I need more tests (almost always
- deciding that I do).
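-
- As a rough illustration of reading coverage that way, here is a
- minimal Python sketch that reports the percentage but puts the list
- of missed branches in front of the tester. The branch names and hit
- counts are invented; a real coverage tool would supply its own data
- in its own format.

```python
# Hypothetical coverage data: branch id -> number of times it was taken.
# The names and counts are invented for illustration.
branch_hits = {
    "parse.c:41 if-true": 12,
    "parse.c:41 if-false": 0,
    "parse.c:77 case-default": 0,
    "emit.c:15 loop-taken": 30,
    "emit.c:15 loop-skipped": 3,
}

def summarize(hits):
    """Return (percent covered, sorted list of branches never taken)."""
    missed = sorted(name for name, count in hits.items() if count == 0)
    covered = len(hits) - len(missed)
    return 100.0 * covered / len(hits), missed

percent, missed = summarize(branch_hits)
print(f"branch coverage: {percent:.0f}%")
for name in missed:
    # Each of these is a prompt to ask why the test design missed it.
    print("missed:", name)
```

- The percentage is one line of output; the missed list is the part
- that feeds back into test design.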
-
- 3. Fault-based coverage measures, most particularly mutation coverage,
- can guarantee the absence of certain kinds of faults in your program
- (modulo the chance of your making a mistake detecting equivalent
- mutants). Mutation testing is somewhat expensive, and it doesn't
- offer guarantees about all other kinds of faults in your program. In
- particular, I don't know of any studies about how well a
- mutation-complete test suite detects faults of omission (to my mind
- the most important class of faults).
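-
- A toy Python sketch of the idea, assuming hand-made mutants of a
- single comparison operator (real mutation tools generate mutants
- automatically and use many more operators). The function, mutants,
- and tests are all invented for illustration.

```python
import operator

def make_max(compare):
    """Build a 'larger of two numbers' function around a comparison."""
    def larger(a, b):
        return a if compare(a, b) else b
    return larger

original = make_max(operator.gt)

# Hand-made mutants: each swaps the comparison for a different one.
mutants = {
    "gt -> lt": make_max(operator.lt),
    "gt -> ge": make_max(operator.ge),  # equivalent for ordinary numbers
    "gt -> ne": make_max(operator.ne),
}

# Test suite: (arguments, expected result).
tests = [((1, 2), 2), ((5, 3), 5), ((4, 4), 4)]

def killed(fn):
    """A mutant is killed if at least one test sees a wrong result."""
    return any(fn(*args) != expected for args, expected in tests)

survivors = [name for name, fn in mutants.items() if not killed(fn)]
print("surviving mutants:", survivors)
```

- The one survivor here returns the same value as the original for
- every pair of ordinary numbers, so no test can kill it. Someone has
- to recognize it as equivalent by hand, which is where the chance of
- a mistake comes in. Note also that none of these mutants models code
- that is simply missing.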
-
- 4. Statistical testing does offer a stopping criterion: stop when the
- predicted mean-time-to-failure reaches a threshold. The problem is
- that the prediction is relative to a particular input distribution.
- Capturing an input distribution is hard. Dealing with different users
- with different input distributions is hard. I've been hearing that
- statistical testing / reliability people are now fiercely grappling
- with these issues, and I'm optimistic that they'll improve the state
- of the practice.
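-
- To see why the input distribution matters, consider this minimal
- Python sketch. It assumes independent runs, so that the mean number
- of runs to failure is the reciprocal of the per-run failure
- probability. The input classes, probabilities, user populations, and
- threshold are all invented.

```python
# Hypothetical per-run failure probabilities for three classes of input.
failure_prob = {"simple query": 0.0001, "bulk import": 0.002, "report": 0.01}

# Two hypothetical user populations with different input distributions.
profiles = {
    "clerks":   {"simple query": 0.90, "bulk import": 0.09, "report": 0.01},
    "analysts": {"simple query": 0.20, "bulk import": 0.10, "report": 0.70},
}

def mean_runs_to_failure(profile):
    """Reciprocal of the profile-weighted failure probability per run."""
    p_fail = sum(weight * failure_prob[kind] for kind, weight in profile.items())
    return 1.0 / p_fail

THRESHOLD = 1000  # assumed release criterion, in runs

for name, profile in profiles.items():
    runs = mean_runs_to_failure(profile)
    verdict = "stop" if runs >= THRESHOLD else "keep testing"
    print(f"{name}: about {runs:.0f} runs between failures -> {verdict}")
```

- With these invented numbers the same program meets the threshold
- for one population and misses it for the other.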
-
- 5. Given requirements or specifications in a rigorous format, you can
- generate a set of tests by (for example) cause-effect testing rules.
- Implement those tests, and you're done. But the sufficiency of the
- generation procedure is certainly open to question.
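-
- A deliberately simplified Python sketch of generating tests from a
- rule-style specification. It enumerates every combination of two
- boolean causes and records the effect the specification calls for;
- full cause-effect graphing uses rules to prune combinations, which
- this does not attempt. The causes and the withdrawal rule are
- hypothetical.

```python
from itertools import product

# Hypothetical causes (input conditions) taken from a specification.
causes = ["valid_account", "sufficient_funds"]

def expected_effect(valid_account, sufficient_funds):
    """Hypothetical specification rule for a withdrawal request."""
    if not valid_account:
        return "reject: bad account"
    if not sufficient_funds:
        return "reject: insufficient funds"
    return "dispense cash"

# One test case per combination of causes: (inputs, expected effect).
test_cases = [
    (dict(zip(causes, values)), expected_effect(*values))
    for values in product([True, False], repeat=len(causes))
]

for inputs, effect in test_cases:
    print(inputs, "->", effect)
```

- The list is complete with respect to the rule as written, and says
- nothing about behavior the rule fails to mention.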
-
- 6. You can seed "representative" errors and stop only when your tests
- have detected all seeded errors. However, like mutation testing, the
- kinds of errors you can practically seed are only a smallish subset of
- the total kinds of errors. Also, seeded errors are not necessarily as
- hard (or as easy) to detect as native errors. (At least one study has
- found that they're not.)
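-
- A minimal Python sketch of the bookkeeping for a seeding experiment.
- Besides the stopping check, it computes the usual proportional
- estimate of the total number of native errors. That estimate holds
- only under the assumption that seeded and native errors are equally
- easy to detect. The counts in the example call are invented.

```python
def seeding_report(seeded_total, seeded_found, native_found):
    """Summarize a seeding experiment (all arguments are plain counts)."""
    done = seeded_found == seeded_total
    if seeded_found == 0:
        estimate = None  # no basis for an estimate yet
    else:
        # Proportional estimate; valid only if seeded and native errors
        # are equally likely to be detected, which is an assumption.
        estimate = native_found * seeded_total / seeded_found
    return done, estimate

done, estimate = seeding_report(seeded_total=20, seeded_found=15, native_found=30)
print("all seeded errors found:", done)
print("estimated native errors in total:", estimate)
```

- If seeded errors are easier to find than native ones, the estimate
- will be too low; if harder, too high.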
-
- So: no hard and fast stopping criteria. A lot of life is like that.
-
- Brian Marick, marick@cs.uiuc.edu, testing!marick@uunet.uu.net
- Testing Foundations: Consulting, Training, Tools.
- Freeware test coverage tool: see cs.uiuc.edu:pub/testing/GCT.README