- Xref: sparky comp.benchmarks:1694 comp.arch.storage:774
- Path: sparky!uunet!news.centerline.com!noc.near.net!news.bbn.com!usc!zaphod.mps.ohio-state.edu!cs.utexas.edu!sun-barr!news2me.EBay.Sun.COM!exodus.Eng.Sun.COM!sun!amdcad!BitBlocks.com!bvs
- From: bvs@BitBlocks.com (Bakul Shah)
- Newsgroups: comp.benchmarks,comp.arch.storage
- Subject: Re: Disk performance issues, was IDE vs SCSI-2 using iozone
- Message-ID: <Bxs0CD.642@BitBlocks.com>
- Date: 15 Nov 92 21:14:36 GMT
- References: <1992Nov11.064154.17204@fasttech.com> <1992Nov11.210749.3953@igor.tamri.com> <36995@cbmvax.commodore.com> <1992Nov12.193308.20297@igor.tamri.com>
- Organization: Bit Blocks, Inc.
- Lines: 72
-
- jbass@igor.tamri.com (John Bass) writes:
-
- > When it was demanded by the powers
- >that be at Fortune Systems ... we put it in ... only to take it out when Field
- >Service grew tired of the bad block tables overflowing and taking a big loss
- >on good drives being returned due to "excessive bad blocks" as the result
- >of normal (or abnormal) soft error rates due to other factors.
-
- Uh, John, that was because we *screwed up* the bad block algorithm
- the first time around. The block forwarding was done on *read*
- errors, which was quite the wrong thing to do and rather than fix
- that we *marked* the entire spare table as unusable, and as a result
- even a single block going bad in the field could not be handled. A
- disk driver that handled bad blocks correctly was never released....
-
- But I agree with you (& Peter da Silva) that the filesystem code
- should know how to deal with bad blocks. Note that regardless of
- where you handle bad blocks, the issues remain the same:
-
- 1. A read error should *never* be hidden from the layer above.
- Ultimately the user (or an application on his behalf) should
- know if a block in a file is bad as only he (or that
- application) may have enough context to know what to do with
- such an error.
-
- 2. If a block could be read on a retry, *tell* the layer above
- that a soft error was seen. This allows the upper layer code
- to either avoid this block by forwarding the data somewhere or
- try to somehow remove the error by reformatting or whatever.
-
- 3. The best policy is to avoid using a permanently bad block and
- an upper layer *ought* to be able to do so intelligently. For
- example, a FS can put a bad block on a list of known bad
- blocks so it is never reused. In case the block is in the
- inode area, there ought to be a list of *bad* inodes so that
- they are never reallocated, etc. A user can remove a file
- with a bad block rather than try to rewrite it (and the FS
- should avoid putting a known bad block on the free list when a
- file is removed). A database application should be smart enough
- to shuffle things around so that a bad index block can be safely
- removed or avoided.
-
- 4. If 3. is not possible and the upper layer insists on rewriting
- a known bad block, *that* is when block forwarding needs to be
- done. So I am *not* against lower layers (e.g. the disk
- driver, disk controller and the disk) doing block forwarding;
- it is just that a) this facility should be a last resort and
- b) I should not be forced to use it when I may have better
- means of avoiding it.
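
The layering in points 1-4 can be sketched in a few lines. This is a toy
in-memory model with made-up names (`ToyDisk`, `ToyFS`, `ReadStatus`), not
any real driver interface: the "driver" never hides an error, flags soft
errors explicitly, and the "filesystem" above it keeps its own bad-block
list so a bad block is never reused.

```python
from enum import Enum

class ReadStatus(Enum):
    OK = "ok"        # clean read
    SOFT = "soft"    # data recovered on retry -- warn the layer above (point 2)
    HARD = "hard"    # unreadable -- never hide this (point 1)

class ToyDisk:
    """In-memory stand-in for a drive: some blocks are marginal
    (readable, but only after retries), some are dead."""
    def __init__(self, nblocks=16):
        self.data = [b""] * nblocks
        self.marginal = set()   # blocks that produce soft errors
        self.dead = set()       # blocks that produce hard errors

    def read(self, blk):
        if blk in self.dead:
            return ReadStatus.HARD, None          # surfaced, not remapped
        status = ReadStatus.SOFT if blk in self.marginal else ReadStatus.OK
        return status, self.data[blk]             # data plus the verdict

class ToyFS:
    """Upper layer keeps its own list of known bad blocks (point 3),
    so they are never put back on the free list or reallocated."""
    def __init__(self, disk):
        self.disk = disk
        self.bad_blocks = set()

    def read(self, blk):
        status, data = self.disk.read(blk)
        if status is not ReadStatus.OK:
            self.bad_blocks.add(blk)   # retire the block at FS level
        return status, data            # and pass the verdict upward
```

A marginal block's data still comes back to the caller, but tagged
`ReadStatus.SOFT`, so the filesystem can forward the data elsewhere and
retire the block -- exactly the behaviour point 2 asks for -- while a dead
block is reported as `HARD` rather than silently remapped.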
-
- Ultimately no single layer is capable of handling errors properly
- under most conditions and no single layer should be depended upon
- for handling recovery from such errors.
-
- >Write buffering requires automatic remapping ... A good filesystem design
- >should not see any benefits from write buffering, and doesn't need/want
- >remapping. Nor do customers want random/unpredictable performance/response
- >times.
-
- It is not hard to avoid driver/controller-level remapping even in the
- presence of write buffering (see the 4 points above), though I
- don't think any Unix FS currently does so (but that is not the
- only problem with Unix FS designs as we all know). I do not
- agree that a good filesystem would not see any benefits from
- write buffering. Write-back or delayed write caching does
- improve performance. Error reporting/recovery and caching are
- two of many concepts that are relevant at every level/layer.
- Some customers will accept good but somewhat unpredictable
- performance if it is at a low cost. Some won't. No one design
- will satisfy them all.
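
To illustrate that write buffering and FS-level avoidance can coexist,
here is a minimal sketch (a hypothetical `WriteBackCache`; the free list
and bad set stand in for real filesystem allocation). Writes are delayed,
and at flush time a block known to be bad is forwarded to a fresh block
by the upper layer itself, with the remapping reported to the caller
rather than done invisibly in the driver or controller.

```python
class WriteBackCache:
    """Toy delayed-write cache over a block store.  On flush, data
    destined for a known-bad block is forwarded to a good block chosen
    at this layer -- avoidance above the driver, not remapping below."""
    def __init__(self, nblocks=16, bad=()):
        self.store = {}              # blk -> bytes (the "disk" contents)
        self.bad = set(bad)          # blocks known to be unusable
        # free blocks available for forwarding (bad ones excluded)
        self.free = [b for b in range(nblocks) if b not in self.bad]
        self.dirty = {}              # delayed writes: blk -> bytes

    def write(self, blk, data):
        self.dirty[blk] = data       # buffered, not yet on "disk"

    def flush(self):
        remapped = {}                # old blk -> new blk, for the caller
        for blk, data in list(self.dirty.items()):
            if blk in self.bad:
                new = self.free.pop(0)   # allocate a known-good block
                self.store[new] = data
                remapped[blk] = new
            else:
                self.store[blk] = data
            del self.dirty[blk]
        return remapped              # nothing is hidden from above
```

The caller gets the old-to-new mapping back from `flush()`, so the layer
above can update its own metadata; the bad block simply never returns to
the free list.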
-
- Bakul Shah <bvs@BitBlocks.com>
-