- CACHECHK
- v2 11/2/95
- Copyright (c) 1995 by Ray Van Tassle.
-
- This is NOT freeware. This is postcard-ware. Send me a nice
- picture post card as the registration fee.
- Ray Van Tassle
- 1020 Fox Run Lane
- Algonquin, IL 60102 USA
- (708)658-4941
-
-
- CACHECHK performs memory access timing tests, to see if you have a
- cache, how many caches, and to check the access speed.
-
- There are two kinds of caches, the on-chip cache (level one, or L1)
- which is in the CPU processor chip, and off-chip cache (level two, or
- L2) which is on the motherboard.
-
- Some 386 motherboards have 64KB of off-chip cache.
- Some CYRIX 386/486 chips have a very small L1 cache.
- 486's have an on-chip cache, and most new motherboards also have an L2
- cache.
- Most AMD 486 chips have 8KB of L1 cache.
- The Intel 486 has 8KB of L1 cache (16KB on the DX4).
- Intel Pentium has 16KB of L1 cache, 8KB for data and 8KB for code.
-
- The typical 486 motherboard has 256KB of L2 cache, although many will
- let you install 512KB or even 1MB.
-
- Q: HOW MUCH CACHE DO I NEED?
- For a 486, get a 256K cache.
- All the writeups that I have seen say:
- 64KB cache gets a LOT of improvement.
- 128KB gets a bit more.
- 256KB gets a teeny bit more.
- 512KB gets only a teensy weensy bit more.
- (Note: these are for DOS and WINDOWS3. A real OS, like LINUX, OS/2,
- and perhaps WIN-95 and WIN-NT may be different.)
- In general, this seems to be the way all caches work---the first bit
- gets a lot of bang, and each additional bit gives smaller and smaller
- improvements.
- It is claimed that, as long as you have a decent L2 cache, an AMD 486
- (with an 8K L1) is virtually identical in performance to an Intel 486
- (with a 16K L1).
-
- 256K uses eight 32K x 8 SRAM chips (8 * 32KB = 256KB), in two banks,
- and the motherboard can interleave accesses to the two banks.
- 512K uses four 128K x 8 SRAM chips (4 * 128KB = 512KB), but this is only
- one bank, so the access time is slower, because the motherboard can't do
- bank interleaving.
- So, 512K costs a LOT more than 256K and gives only a marginal improvement
- in performance, so stick with 256K.
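The chip arithmetic above can be checked with a few lines of Python. This is only a worked example; the function name `sram_total_kb` is made up for illustration and has nothing to do with CACHECHK itself.

```python
def sram_total_kb(chips: int, kwords_per_chip: int, bits_per_word: int = 8) -> int:
    """Total capacity, in KB, of a set of identical SRAM chips.

    A "32K x 8" chip holds 32K words of 8 bits each, i.e. 32KB.
    """
    total_bits = chips * kwords_per_chip * 1024 * bits_per_word
    return total_bits // 8 // 1024  # bits -> bytes -> KB


if __name__ == "__main__":
    print("eight 32K x 8 chips:", sram_total_kb(8, 32), "KB")
    print("four 128K x 8 chips:", sram_total_kb(4, 128), "KB")
```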
-
- Q: Why does my main memory show faster with the cache enabled, even
- out beyond the cache size?
- A: The L1 cache (on a 486) is filled in granularity of 16 bytes (this
- is the "cache line size"). When you read a byte, all 16 bytes of that
- line are read into the cache, in 4 quadbyte (32 bit) units. The quadbyte
- which is addressed is read from memory first, put into the cache, and
- transferred to the CPU. Then the other 3 quadbytes are read into the cache.
- So, if you are accessing the memory in sequential order (like CACHECHK does),
- the next bytes have been (or are being) automatically sucked into the cache,
- a "read-ahead", if you will.
- If the cache is disabled, this read-ahead does not take place.
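The read-ahead effect can be illustrated with a toy model. The sketch below is not how CACHECHK measures anything; it simply counts which quadbyte reads would be satisfied by a 16-byte line fill during one sequential pass. It assumes the region is much larger than the cache, so nothing is left over from an earlier pass. The name `sequential_read_hits` is hypothetical.

```python
LINE_SIZE = 16  # bytes per 486 L1 cache line (from the text above)
WORD_SIZE = 4   # one quadbyte (32 bits) per access


def sequential_read_hits(total_bytes: int, cache_enabled: bool = True):
    """Return (hits, misses) for one sequential pass of quadbyte reads.

    Assumption: the region is far larger than the cache, so a read can
    only hit if an earlier read in the same pass filled its line.
    """
    hits = misses = 0
    current_line = None  # the line most recently filled, if any
    for addr in range(0, total_bytes, WORD_SIZE):
        line = addr // LINE_SIZE
        if cache_enabled and line == current_line:
            hits += 1  # already pulled in by the line fill
        else:
            misses += 1  # has to go out to memory
            current_line = line
    return hits, misses


if __name__ == "__main__":
    print("cache on: ", sequential_read_hits(1024))
    print("cache off:", sequential_read_hits(1024, cache_enabled=False))
```

In this model three out of every four reads hit with the cache on, even "beyond the cache size", and none hit with the cache off.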
-
-
- CACHECHK will run the access tests using all the memory in your
- machine, so that you can check to make sure that all the memory is
- cached.
-
-
- Usage: CACHECHK -tn -hfvwqz? [Optional comments]
- Cache memory detector & timer. Runs only on 80386 (or better) CPU.
- -h? = Print this help text.
- -f = Perform tests with cache disabled.
- -n = Don't calibrate timer.
- -q = Quick. Faster but not as accurate.
- -qq = Each 'q' is 2 times quicker. But less accurate.
- -tn = Top of memory to test. n = nth MB
- -v = Verbose
- -w = Do memory write (otherwise memory read).
- -z = Slower. Like q, but the other direction.
- Probably won't be needed on anything less than a 586DX4/200.
-
- The "optional comments" just get logged, so you can identify test information
- along with the results. Mostly useful if you redirect the output to
- a file. For example:
- cachechk -q Test with bios set to 1 wait state >before.dat
-
- The basic timing loop is 1/2 second per size. Each 'q' cuts the time
- in half. Each 'z' doubles the time. If the timing figures aren't
- steady, you probably have too many q's.
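As a worked example of the halving and doubling rule, the sketch below computes the loop time for a given number of q's and z's. The function name is made up, and letting q and z cancel each other is an assumption; the text does not say what CACHECHK does if you give both.

```python
BASE_SECONDS = 0.5  # basic timing loop per block size (from the text)


def loop_seconds(q_count: int = 0, z_count: int = 0) -> float:
    """Timing loop length: each 'q' halves it, each 'z' doubles it.

    Assumption: q and z simply offset one another when combined.
    """
    return BASE_SECONDS * 2.0 ** (z_count - q_count)


if __name__ == "__main__":
    print("default:", loop_seconds(), "s")
    print("-qq:    ", loop_seconds(q_count=2), "s")
    print("-z:     ", loop_seconds(z_count=1), "s")
```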
-
- You can run the tests with the cache disabled, with the 'f' option (on
- a 486 or Pentium). Naturally, it is re-enabled again when it's done.
- This generally turns off BOTH the L1 and L2 cache. Your bios setup
- may (or may not) let you individually enable/disable the caches.
-
- Times cache & memory access, and figures out cache size(s).
- Results are in:
- microseconds per KB, MB per second, and nanoseconds per byte.
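The three units are all the same measurement expressed differently. The sketch below converts between them, assuming 1KB = 1024 bytes and 1MB = 1024 * 1024 bytes; the text does not say which definitions CACHECHK uses, and the function name is hypothetical.

```python
BYTES_PER_KB = 1024         # assumption: binary kilobyte
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabyte


def rates_from_ns_per_byte(ns_per_byte: float):
    """Convert nanoseconds per byte to (microseconds per KB, MB per second)."""
    us_per_kb = ns_per_byte * BYTES_PER_KB / 1000.0
    bytes_per_second = 1e9 / ns_per_byte
    mb_per_second = bytes_per_second / BYTES_PER_MB
    return us_per_kb, mb_per_second


if __name__ == "__main__":
    us_kb, mb_s = rates_from_ns_per_byte(10.0)
    print(f"10.0 ns/byte = {us_kb:.2f} usec/KB = {mb_s:.1f} MB/s")
```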
-
- Memory is accessed in quadbytes, in flat 32-bit protected mode. For base
- memory (first MB, MB#0) only 640KB is accessed. Memory accesses are in
- various block sizes, from 1KB to 2MB. Each megabyte is tested starting
- at the beginning of that megabyte. CACHECHK will work under a memory
- manager (HIMEM, EMM386, QEMM, Windows, etc.), but the results may be
- inaccurate, the machine might crash, and it won't be able to test all
- of the memory. It will run under WINDOWS, but results are wildly
- inaccurate. For best results, boot clean--on DOS 6 & above, hit F5
- while it boots.
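One way to get cache sizes out of per-block-size timings is to look for the block sizes after which the access time jumps. The sketch below shows that idea only; it is not CACHECHK's actual method, which this document does not describe. The function name, the 1.25 threshold, the power-of-two block sizes, and the sample numbers (loosely patterned on an 8KB L1 / 128KB L2 machine) are all assumptions made up for illustration.

```python
def find_cache_sizes(timings, jump_ratio=1.25):
    """Guess cache sizes from (block_kb, usec_per_kb) pairs.

    timings must be sorted by block size. A block size is reported when
    the next larger block is slower by at least jump_ratio, which
    suggests the larger block no longer fits in that cache level.
    """
    sizes = []
    for (block_kb, t), (_, t_next) in zip(timings, timings[1:]):
        if t_next >= t * jump_ratio:
            sizes.append(block_kb)
    return sizes


if __name__ == "__main__":
    # Made-up measurements, for illustration only.
    sample = [(1, 31), (2, 31), (4, 31), (8, 31),
              (16, 45), (32, 45), (64, 45), (128, 45),
              (256, 73), (512, 73), (1024, 73), (2048, 73)]
    print("apparent cache sizes (KB):", find_cache_sizes(sample))
```

Real measurements are rarely this clean, since the transition at a cache boundary is usually gradual, so a fixed threshold like this would need tuning.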
-
- It will NOT touch extended memory that is already allocated or in use.
- If you have a memory manager installed, it usually occupies the first
- portion of the 2nd megabyte, so CACHECHK will not be able to check that.
-
-
- SOME TIMINGS I HAVE TAKEN
-
- CPU      L1    L1 speed  L2    L2 speed  Mem      Speeds
- type     size  ns/byte   size  ns/byte   ns/byte  µsec/KB
- -------  ----  --------  ----  --------  -------  ----------
- 386/25   0     n/a       64k   59.2      90       62......94
- 486/33   8k    30.7      128k  43.6      70       31..45..73  (Intel)
- 486/66   8k    16.1      0     n/a       50       15......52  (Intel)
- 486/100  8k    11.1      0     n/a       46       10......48  (AMD)
- 486/100  16k   10.0      256k  18.8      26       10..19..27  (Intel)
- P-75     8k    10.2      256k  16.4      24       10..17..24
-
-
- Timer
- -----
- CACHECHK re-programs the timer chip to get a high-precision timer (about
- 1,200,000 ticks per second). In some motherboards (notably reported
- to be "UMC with fake cache chips"), there is a flaw with this timer.
- I worked around this in version 2, but there may be some boards where my
- work-around still doesn't work.
-
- Normally, CACHECHK calibrates this timer against the Real-Time-Clock, which
- is in all systems 286 and above. If your RTC isn't standard, this may hang
- or give bogus values.
- I count the number of timer ticks in 1/16th second. When four in a row
- give the same value (within .01%), I take the average of those four counts,
- and use this as the calibration value.
- If there is some problem with this calibration, use the "-n" switch, and
- it will use the pre-defined value of 1,193,728 (0.84 microseconds per tick).
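The calibration rule described above can be sketched as follows. This is a Python illustration of the logic only, not the program's real code. Reading "within .01%" as "the spread of the four counts is at most 0.01% of the smallest" is an assumption, and the function and constant names are made up.

```python
DEFAULT_TICKS_PER_SECOND = 1193728  # the pre-defined value used with -n


def calibrate(counts, tolerance=0.0001, run_length=4, intervals_per_second=16):
    """Return ticks per second from a series of 1/16-second tick counts.

    Looks for run_length consecutive counts that agree to within
    tolerance (0.0001 = .01%), averages them, and scales the average up
    to one second. Returns None if no such run is found.
    """
    window = []
    for count in counts:
        window.append(count)
        window = window[-run_length:]  # keep only the latest run
        if len(window) == run_length:
            low, high = min(window), max(window)
            if high - low <= tolerance * low:
                return sum(window) / run_length * intervals_per_second
    return None


if __name__ == "__main__":
    # Made-up counts: one bad reading followed by four steady ones.
    ticks = calibrate([74000, 74608, 74608, 74609, 74608])
    print("calibrated ticks per second:", ticks)
    print("default tick length: %.2f microseconds"
          % (1e6 / DEFAULT_TICKS_PER_SECOND))
```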
-