-
- Prelude for those that don't read the documentation:
- Do not mail me bug reports. I can't fix them... Other opinions on the
- program are welcome.
- I do not know if this program works on a CPU without a math coprocessor
- (like the 486SX).
-
- System Benchmark "SysBench" 0.9.0
- ---------------------------------
-
- (C) 1994 Henrik Harmsen
- The disk IO code: (C) 1994 Kai Uwe Rommel
-
- Contents:
-
- 1 Introduction
- 2 Tests
- 3 Copyright notice
- 4 Thanks
-
- Appendix A : Todo
- Appendix B : Building
- Appendix C : Example results
-
- ---
-
-
-
- 1 Introduction
-
-
- I thought OS/2 needed a benchmark program, so I wrote one. This
- program is not quite finished, and probably never will be, not by me
- anyway, since I'm saying goodbye to OS/2 and turning my attention to
- Linux. The reasons for this have less to do with OS/2, which is
- still a great OS, than with Linux itself. Linux is slick,
- super-fast, finally has drivers for my Viper card, has free TCP/IP and
- last but not least, Linux is Unix.
-
- This means I am probably not going to make updates to this program,
- since I won't have OS/2 on my disk anymore. I'm saying probably, since
- I can't read the future. Maybe one day my whimsical mind will think
- OS/2 is more fun than Linux, who knows? :-)
-
- It also means that I am donating this program to anyone who is willing
- to continue working on it. If you think you want to continue working
- on this program, make sure you clearly note that this is released by
- you, not me. To do this, change the version number to 0.9.0xxx, where
- xxx are your initials. For example, 0.9.0hch would indicate that
- I (Henrik C Harmsen) have made this release. The version numbering
- scheme should follow that of GCC. The first number is the major
- release number, to be increased when major enhancements have been made
- to the program or it is considered out of beta. The second number is
- the minor release number, increase it when you have made small changes
- to the program. The last number should be increased when making
- bug-fixes only.
-
- Take a look at the appendices for more information on what needs to be
- done, what's not quite finished yet, and how to re-build the
- program. Among other things, this document needs rewriting.
-
- Do not send me complaints about bugs and errors, since I will have no
- way of fixing them...
-
- Now, that said, let's take a look at what this program tests.
-
-
-
-
- 2 Tests
-
- HANDLE WITH CARE! DO NOT BLINDLY TRUST BENCHMARK VALUES. THEY ARE ONLY
- GOOD IF YOU KNOW WHAT THEY ARE TESTING AND KNOW WHAT THEY ARE NOT
- TESTING...
-
- The values obtained here are not useful for comparing against values
- obtained from other benchmark programs. Even though one of the tests,
- for example, measures Linpack performance and yields a value in MFLOPS,
- this value cannot be compared with the MFLOPS value from a
- different benchmark program. The only exception is the Dhrystone
- 2.1 value, which might possibly be compared to values from other
- Dhrystone 2.1 benchmarks. As a rule: only compare values with people
- running this same benchmark program.
-
- Almost all tests are adaptive in that they will first measure the
- approximate speed of your computer so the test will take about 10-15
- seconds in total, no matter how slow or fast your computer is. The
- ones that are not adaptive are the floating point tests and the
- CPU integer tests with the exception of the dhrystone test.
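-
- A minimal sketch of how such adaptive scaling could work, assuming a
- short trial run is timed first. The function name and parameters are
- hypothetical and are not taken from the SysBench sources.
-

```c
/* Hypothetical helper: given that probe_iters iterations took probe_secs
   seconds in a short trial run, return the iteration count that should
   take roughly target_secs seconds on the same machine. */
unsigned long scale_iterations(unsigned long probe_iters,
                               double probe_secs,
                               double target_secs)
{
    double scaled;

    if (probe_secs <= 0.0)
        return probe_iters;      /* timer too coarse: cannot scale */

    scaled = (double)probe_iters * (target_secs / probe_secs);
    if (scaled < 1.0)
        return 1UL;              /* always run at least once */
    return (unsigned long)(scaled + 0.5);   /* round to nearest */
}
```

-
- For example, if 1000 iterations took 0.5 seconds, a 10 second run
- would need about scale_iterations(1000, 0.5, 10.0) iterations.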
-
-
- 2.1 Graphic tests
-
- These tests measure how fast the video hardware/display driver
- combination can pump pixels to the screen. OS/2 has long had abysmal
- display drivers for many cards; these tests are meant to sort out
- whether they really are good, bad, or just stink.
-
- Most window operations use only a few key operations of the
- video card accelerator. Take a look at your windows: they're mostly
- built from filled rectangles, with some text and vertical and
- horizontal lines. Maybe a few bitmaps here and there (icons and such).
-
- The PM-marks are calculated from the other values as a weighted
- arithmetic mean-value.
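-
- A weighted arithmetic mean can be sketched as follows. The weights
- SysBench actually uses are not stated in this document, so any
- weights passed in are assumptions of the caller.
-

```c
#include <stddef.h>

/* Weighted arithmetic mean: sum(w[i] * x[i]) / sum(w[i]).
   Returns 0.0 if the weights sum to zero. */
double weighted_mean(const double *values, const double *weights, size_t n)
{
    double sum = 0.0, wsum = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        sum  += weights[i] * values[i];
        wsum += weights[i];
    }
    return (wsum != 0.0) ? sum / wsum : 0.0;
}
```

-
- With all weights equal, this reduces to the ordinary mean.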
-
-
- 2.1.1 BitBlit S->S Copy
-
- Tests the speed of the bitblit screen->screen copy operation. One of
- the most important values, since it affects how fast you can scroll
- text, and move large windows.
-
-
- 2.1.2 BitBlit M->S Copy
-
- Tests the speed of the bitblit memory->screen copy operation. This
- affects how fast large bitmaps are updated, and the speed of every
- operation that copies data from RAM to video RAM.
-
-
- 2.1.3 Filled rectangle, patterned filled rectangle
-
- Tests how fast the blitter can blank areas with a color or stipple
- pattern. When updating a window, the background is usually blanked
- with a single color or pattern before text or other things are drawn
- on it.
-
-
- 2.1.4 Lines
-
- Tests the speed of line-drawing in different directions. The
- horizontal and vertical line drawing speed is important when drawing
- frames around windows and such.
-
-
- 2.1.5 Text render
-
- Tests the speed of text rendering, an extremely important function
- for speedy updates in text editors, shell windows, word processors etc.
-
-
-
- 2.2 CPU Integer tests
-
- The CPU tests are divided into two sections. This one tests 'integer'
- performance, which covers not only integer arithmetic but also every
- 'normal' program that does some kind of data processing; the other
- (section 2.3) tests floating-point performance. The vast majority of
- applications do not use floating-point arithmetic. Those that do are
- usually ray-tracers, scientific and engineering programs etc.
-
- The CPU-int marks are calculated as a weighted mean of the other
- test results.
-
- 2.2.1 Dhrystone VAX MIPS
-
- When reading about how many MIPS a computer performs, that is usually
- tested by running this Dhrystone test and adjusting the result to be
- relative to one VAX 11/780 MIPS. That means, this test does not
- benchmark the number of million instructions per second (MIPS) as
- defined by machine instructions, but rather a weighted value against
- the base reference of one VAX 11/780 MIPS.
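-
- As a sketch of the adjustment: the VAX 11/780 is conventionally
- credited with 1757 Dhrystones per second, so a raw Dhrystone rate is
- divided by that figure. The function name is hypothetical, and
- whether SysBench applies the constant in exactly this way is an
- assumption.
-

```c
/* Conventional reference score of the VAX 11/780 (1 "VAX MIPS"). */
#define VAX_11_780_DHRYSTONES_PER_SEC 1757.0

/* Convert a raw Dhrystone rate to VAX 11/780 MIPS. */
double dhrystone_vax_mips(double dhrystones_per_sec)
{
    return dhrystones_per_sec / VAX_11_780_DHRYSTONES_PER_SEC;
}
```

-
- Under this convention a machine that runs 17570 Dhrystones per
- second rates 10 VAX MIPS.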
-
- This test uses very little memory, meaning it will measure the CPU
- performance only, not taking into account other vital parts such as
- memory speed etc.
-
- Here is an excerpt from the sources from which I got this test:
-
- "Dhrystone is a short synthetic benchmark program intended to be
- representative for system (integer) programming. Based on published
- statistics on use of programming language features: see original
- publication in CACM 27,10 (Oct 1984). Originally published in ADA, now
- mostly used in C. Version 2 (in C) published in SIGPLAN Notices 23,8
- (Aug 1988), together with measurement rules. Version 1 is no longer
- recommended since state-of-the-art compilers can eliminate too much
- 'dead code' from the benchmark (However, quoted MIPS numbers are often
- based on version 1). Problems: Due to its small size (100 HLL
- statements, 1-1.5 KB code), the memory system outside the cache is not
- tested; compilers can too easily optimize for Dhrystone; string
- operations are somewhat over-represented. Recommendation: Use it for
- controlled experiments only; don't blindly trust single Dhrystone MIPS
- numbers quoted somewhere (don't do this for any benchmark)."
-
- This test is based on the C-version of Dhrystone 2.1.
-
- 2.2.2 Hanoi
-
- An integer program which solves the Towers of Hanoi puzzle using
- recursive function calls. It uses very little memory, and thus does
- not test memory speed.
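-
- The recursion can be sketched like this. It is an illustration of
- the classic algorithm, not the benchmark's actual source, and it
- counts the moves instead of recording them.
-

```c
/* Move n discs from peg `from` to peg `to`, using peg `via` as the
   spare. Returns the number of single-disc moves performed. */
unsigned long hanoi(int n, int from, int to, int via)
{
    unsigned long moves;

    if (n <= 0)
        return 0UL;
    moves  = hanoi(n - 1, from, via, to);   /* clear the way */
    moves += 1UL;                           /* move the largest disc */
    moves += hanoi(n - 1, via, to, from);   /* stack the rest on top */
    return moves;
}
```

-
- Solving n discs always takes 2^n - 1 moves, so hanoi(10, 0, 2, 1)
- returns 1023.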
-
- 2.2.3 Heapsort
-
- Tests how fast your computer can sort a large array of random values
- using the heapsort algorithm. Tests both CPU and memory speed. The
- MIPS are just a measurement against some arbitrary base MIPS
- reference. This test uses about 1 MB memory.
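-
- For reference, a textbook in-place heapsort looks roughly like the
- sketch below. The names are hypothetical and the benchmark's own
- code may differ in detail.
-

```c
#include <stddef.h>

/* Restore the max-heap property below `root`; `end` is exclusive. */
static void sift_down(long *a, size_t root, size_t end)
{
    for (;;) {
        size_t child = 2 * root + 1;
        long tmp;

        if (child >= end)
            break;
        if (child + 1 < end && a[child] < a[child + 1])
            child++;                    /* pick the larger child */
        if (a[root] >= a[child])
            break;
        tmp = a[root]; a[root] = a[child]; a[child] = tmp;
        root = child;
    }
}

/* Sort a[0..n-1] into ascending order. */
void heapsort_longs(long *a, size_t n)
{
    size_t i;

    if (n < 2)
        return;
    for (i = n / 2; i-- > 0; )          /* build the heap */
        sift_down(a, i, n);
    for (i = n - 1; i > 0; i--) {       /* repeatedly extract the maximum */
        long tmp = a[0]; a[0] = a[i]; a[i] = tmp;
        sift_down(a, 0, i);
    }
}
```

-
- The accesses jump around the array as the heap is rebuilt, which is
- why a large array exercises memory as well as the CPU.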
-
- 2.2.4 Sieve
-
- Tests how fast your computer can find lots of prime numbers with the
- sieve of Eratosthenes, using arrays from 8 kB to 1.2 MB. The result is
- a weighted mean value of the different speeds. Tests both CPU and
- memory speed.
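-
- A minimal sketch of the sieve, assuming one flag byte per number.
- The function name and array layout are illustrative only.
-

```c
#include <stddef.h>
#include <string.h>

/* Count the primes <= limit. `flags` must have room for limit + 1 bytes. */
size_t sieve_count_primes(unsigned char *flags, size_t limit)
{
    size_t i, j, count = 0;

    if (limit < 2)
        return 0;
    memset(flags, 1, limit + 1);        /* assume everything is prime */
    flags[0] = flags[1] = 0;
    for (i = 2; i * i <= limit; i++) {
        if (flags[i])
            for (j = i * i; j <= limit; j += i)
                flags[j] = 0;           /* strike out multiples of i */
    }
    for (i = 2; i <= limit; i++)
        if (flags[i])
            count++;
    return count;
}
```

-
- The inner loop strides through the whole array, so a larger array
- spills out of the caches and the memory system starts to matter.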
-
-
-
- 2.3 CPU floating point tests
-
- These tests measure how fast your computer is at floating point
- arithmetics. (Floating point means non-integer numbers like 2.3,
- 0.24 etc.)
-
- The CPUfloat-marks are calculated as a weighted mean average of the
- other values.
-
- 2.3.1 Linpack
-
- This is the Linpack program (floating-point) converted to C. Results
- here are sensitive to cache effects and memory speed. This version
- tests only the rolled double precision version.
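-
- As a sketch of how a Linpack MFLOPS figure is usually derived:
- solving an n x n system is conventionally counted as
- 2n^3/3 + 2n^2 floating-point operations, divided by the elapsed
- time. The function name is hypothetical, and the matrix size that
- SysBench uses is not stated here.
-

```c
/* MFLOPS for solving an n x n linear system in `seconds` seconds,
   using the conventional Linpack operation count. */
double linpack_mflops(unsigned n, double seconds)
{
    double dn  = (double)n;
    double ops = (2.0 * dn * dn * dn) / 3.0 + 2.0 * dn * dn;

    if (seconds <= 0.0)
        return 0.0;
    return ops / (seconds * 1.0e6);
}
```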
-
-
- 2.3.2 Flops
-
- Estimates MFLOPS rating for specific FADD, FSUB, FMUL, and FDIV
- instruction mixes. Four distinct MFLOPS ratings are provided based on
- the FDIV weightings from 25% to 0% and using register-register
- operations. Works with both scalar and vector machines. Since the
- program tries to maximize register usage the results are NOT sensitive
- to main memory speed. In this sense flops yields a peak rating. The
- four different values are used to get a weighted mean average.
-
- 2.3.3 The Fast Fourier Transform
-
- This program performs FFTs of 32 to 262,144 points using the
- Duhamel-Hollmann method.
-
-
-
- 2.4 DIVE tests
-
- DIVE means Direct Interface to Video Extensions. It is a library in
- OS/2 that gives fast access to video routines used for programming
- games or other very demanding graphic applications. It gives the games
- programmer access to the Holy Grail - a pointer to the frame buffer.
- The tests here are not incorporated into the benchmark since the DIVE
- functionality will not actually appear until OS/2 3.0. I will describe
- them, nonetheless.
-
- The DIVE-marks are calculated as a weighted mean average of the other values.
-
- 2.4.1 Video bus bandwidth
-
- This test makes a copy of the frame buffer and copies it back to the
- screen many times in order to measure how many bytes per second
- you can pump to the video RAM. On my 486-66 machine with a
- Diamond Viper card this amounts to about 13 MB/s! That means about 42
- frames per second in 640x480x256...
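-
- The frame rate follows from simple arithmetic, sketched here under
- two assumptions: 256 colours means one byte per pixel, and 1 MB is
- taken as 10^6 bytes. The function name is hypothetical.
-

```c
/* Full-screen frames per second that a given bus bandwidth can sustain. */
double frames_per_second(double bytes_per_sec,
                         unsigned width, unsigned height,
                         unsigned bytes_per_pixel)
{
    double frame_bytes = (double)width * height * bytes_per_pixel;

    if (frame_bytes <= 0.0)
        return 0.0;
    return bytes_per_sec / frame_bytes;
}
```

-
- A 640x480 frame at one byte per pixel is 307200 bytes, so
- frames_per_second(13.0e6, 640, 480, 1) gives roughly 42.3.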
-
- 2.4.2 DIVE fun
-
- This was an entry I added since I had a few ideas on fun screen hacks
- you can do with DIVE. One of them is smoothly turning the screen
- upside down and back again. The value obtained here will be highly
- correlated with the Video Bus Bandwidth test.
-
- 2.4.3 Memory to screen copy with DIVE
-
- DIVE has built-in routines for copying a large amount of data from RAM
- or video RAM to the display with the help of a hardware blitter (if
- one is available), or in software. There are three such tests. The first
- just blits an image to the screen; the second performs
- pixel-doubling, effectively doubling the size of the displayed image.
- The third tests arbitrary stretching of the bitmap when displaying it on
- screen. If you have Warp II or OS/2 3.0 you will have seen the ability
- to stretch a running video clip to any size you want. These tests are
- not finished yet.
-
- 2.5 Disk IO tests
-
- These tests were programmed by Kai Uwe Rommel, although I have made a
- lot of changes to his source code. Thanks, Kai Uwe! The tests are
- available as a free-standing package called diskio14.zip at
- ftp.cdrom.com. If there are any errors or strange behaviour in these
- tests then blame me, not Kai Uwe.
-
- The tests can be run on any of the fixed disks in your system. There
- is a menu choice to change which disk to test.
-
- The DiskIO-marks are calculated as a weighted mean average of the
- other values.
-
- 2.5.1 Average seek time
-
- Tests the average seek time of the currently selected disk. I have
- seen that this is often a bit higher than what the disk manufacturers
- promise... This is most likely due to different ways of testing
- things.
-
- 2.5.2 Disk transfer speed
-
- Measures how fast the disk can be read NOT using the cache. When I
- first came across the diskio program by Kai Uwe, my disk performed at
- about 1.0 MB/s. I thought that was not very good, but perhaps
- acceptable. Then I started to muck around with the CMOS parameters and
- by changing the IO block read delay (I think that is what it was
- called) the speed of the disk jumped from 1.0 to 1.5 MB/s! Not bad, I
- thought. But when I upgraded to Warp II the disk performance suddenly
- jumped to 2.2 MB/s. This is probably due to OS/2 using the disk's
- multiple-block transfer mode. Finally, I changed the AT bus speed from
- 8.3 MHz to 11 MHz and the disk transfer speed jumped again, from 2.2 to
- 2.6 MB/s!
-
- The lesson is that a lot can be done about slow IO. Be careful when
- you muck around with the CMOS parameters, though, since there is a
- very high likelihood of making mistakes that can make the machine
- unusable or prone to strange errors. Usually this is not dangerous:
- just reset the value to the old one and your machine should perform
- as before. Sometimes, though, you _can_ destroy your computer by
- changing values incorrectly. Be warned...
-
-
-
- 2.6 Memory speed tests
-
- Memory speed seems to be a forgotten area when talking about the speed
- of a computer. You hear a lot about CPU speed and disk speed and video
- speed and such, but rarely about memory speed. This is wrong IMHO, since
- a lot of the performance of a computer has to do with memory IO. When
- PC Magazine measured memory speed in one of their big tests they
- discovered a lot of difference between the good and bad performers. I
- would like to bring this fact into focus: memory IO speed is a vital
- part of the performance of your computer, even more so with faster and
- faster processors. A really fast RISC processor can execute as many as
- 40 instructions in the time it takes to do one memory read...
-
- Of course, memory speed timing is a complex issue. How fast a memory
- access is depends on:
-
- The pattern of the access     : Random, sequential, local, global?
- Cache                         : Primary and secondary cache size and type.
- Virtual memory                : Paging algorithm, disk IO performance.
- Motherboard memory controller : This is the key component for fast memory IO.
- Speed of SIMMs                : 60, 70 or 100 ns?
-
- etc. etc.
-
- These tests are also limited. They cannot tell the whole truth about
- the speed of your memory IO.
-
- The Mem-marks are calculated as a weighted mean average of the other values.
-
- 2.6.1 Memory copy
-
- This test first allocates a chunk of memory and then reads and writes
- it back and forth a few times to "activate" the memory: this
- initializes the physical pages and pulls the data into the caches. It
- is done to obtain values that are as stable as possible between runs.
- It also has the effect of maximizing the access speed.
-
- Then it proceeds to copy the first half of the memory to the second
- and then the second half to the first. This is to diminish the strange
- effects you get from write-through and copy-back caches. When it says
- 5 kB copy, that means copying 2.5 kB back and forth.
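-
- The copy pattern described above can be sketched as follows. Timing
- is left out, and the function name and the use of memcpy are
- assumptions, not the benchmark's actual code.
-

```c
#include <stddef.h>
#include <string.h>

/* Copy the first half of buf to the second half and back again,
   `rounds` times. Returns the total number of bytes copied, which
   a caller would divide by the elapsed time to get MB/s. */
unsigned long copy_halves(unsigned char *buf, size_t size, unsigned rounds)
{
    size_t half = size / 2;
    unsigned long copied = 0;
    unsigned r;

    for (r = 0; r < rounds; r++) {
        memcpy(buf + half, buf, half);      /* first half -> second half */
        memcpy(buf, buf + half, half);      /* second half -> first half */
        copied += 2UL * (unsigned long)half;
    }
    return copied;
}
```

-
- For the "5 kB copy" case, size would be 5120 bytes, so each memcpy
- moves 2560 bytes.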
-
- You can clearly see the effects of your caches. As long as the access
- is within the cache, it is a lot faster. There is also another factor
- that will make the larger (80-160kB) values jump up and down, and that
- is the effect of virtual memory. The second level cache performs well
- on a sequential memory range, but the virtual memory will chop the
- physical memory into 4kB pages and shuffle them around in physical
- memory. If you are lucky, the physical pages are sequential but they
- don't have to be. When they are not, the pages are scattered around
- and the second level cache (which is almost always a direct-mapped
- cache) will have a larger probability of mapping several physical
- pages to the same area. Set-associative caches (2-way, 4-way)
- should help here, but that is not certain.
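-
- To illustrate the collision effect, here is a sketch that maps a
- physical address to a page-sized slot of a direct-mapped cache. It
- is a simplified model with hypothetical names; a real cache indexes
- by cache line, not by whole page.
-

```c
/* Which page-sized slot of a direct-mapped cache a physical address
   falls into. Returns 0 for nonsensical sizes. */
unsigned long cache_slot(unsigned long phys_addr,
                         unsigned long cache_bytes,
                         unsigned long page_bytes)
{
    unsigned long slots;

    if (page_bytes == 0 || cache_bytes < page_bytes)
        return 0;
    slots = cache_bytes / page_bytes;   /* e.g. 256 kB / 4 kB = 64 */
    return (phys_addr / page_bytes) % slots;
}

/* Non-zero if two physical pages compete for the same cache slot. */
int pages_collide(unsigned long addr_a, unsigned long addr_b,
                  unsigned long cache_bytes, unsigned long page_bytes)
{
    return cache_slot(addr_a, cache_bytes, page_bytes) ==
           cache_slot(addr_b, cache_bytes, page_bytes);
}
```

-
- With a 256 kB cache and 4 kB pages, two pages whose physical
- addresses differ by a multiple of 256 kB land in the same slot and
- evict each other.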
-
- Again, CMOS settings can very much affect the speed of your memory
- access. Be sure to use as low a value as possible on the various wait
- state entries and make sure the whole memory is cached, not just the
- first 16 MB if you have more.
-
- 2.6.2 Memory read
-
- Tested by calculating the checksum over the specified amount of bytes
- over and over again.
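-
- A read loop of this kind might look like the sketch below. Summing
- whole longwords (rather than single bytes) is an assumption, as is
- the function name.
-

```c
#include <stddef.h>

/* Read every longword once and fold it into a checksum. Returning the
   sum keeps the compiler from optimizing the reads away. */
unsigned long checksum_longs(const unsigned long *mem, size_t count)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < count; i++)
        sum += mem[i];
    return sum;
}
```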
-
- 2.6.3 Memory write
-
- Tested by writing a value into all longwords of the specified amount
- of memory.
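-
- The corresponding write loop, again as a hypothetical sketch that
- leaves out the timing:
-

```c
#include <stddef.h>

/* Store `value` into every longword of the block. */
void fill_longs(unsigned long *mem, size_t count, unsigned long value)
{
    size_t i;

    for (i = 0; i < count; i++)
        mem[i] = value;
}
```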
-
-
-
- 3 Copyright notice
-
- There is no warranty. Use this software at your own risk. Due to the
- complexity and variety of today's hardware and software which may be
- used to run this program, I am not responsible for any damage or loss
- of data caused by use of this software. It was tested and is expected
- to work correctly, but nobody can actually guarantee this under all
- circumstances. And because this software is free, you get what you pay
- for...
-
- This program can be used freely for non-commercial purposes.
-
-
- 4 Thanks
-
- Thanks to Kai Uwe Rommel (rommel@ars.muc.de) for supplying the disk IO
- benchmark code and to Al Aburto (aburto@marlin.nosc.mil) for supplying
- the CPU integer and CPU float benchmark code.
-
-
-
-
-
- -- Henrik Harmsen
-
-
- Email: harmsen@eritel.se
-
-
-
-
-
- Appendix A - TODO
-
- 1 Make the CPU integer and CPU float tests adaptive to the speed of the
- computer.
-
- 2 DIVE: Support for bank-switched cards. Better error handling. Finish the
- Memory->Screen bitblit tests.
-
- 3 Graphics test: The Memory to screen bitblit copy is probably not
- correct for 16 and 24 bit displays.
-
-
-
-
-
- Appendix B - Building
-
- You need C Set++ 2.1. Change to the src directory and run nmake. It is
- probably quite easy to port to emx-gcc.
-
- Why are all the source code files named pmb_*? Well, I first wanted
- to call it PMBench, as a play on WinBench, but it turned out that
- PC Magazine already had a PMBench program... So I changed the name
- to SysBench, but I did not have time to change all the 'pmb' to 'sysb'...
-
-
-
-
-
- Appendix C - Example results
-
-
- Example of a result file, when benchmarking my own system, which is:
-
- Software:
- --------------
- OS/2 2.11
- Diamond Viper display drivers 1.02beta running 1024x768x8
-
- Hardware:
- --------------
- CPU : 486DX2-66
- Chipset : UMC
- Cache : 8 kB level 1, 256 kB copy-back level 2.
- Memory : 20 MB 70ns.
- Harddisk: disk 1: Seagate 340 MB. disk 2: Conner CFA540A 540 MB.
- Video : Diamond Viper VLB, 2MB VRAM, 2.02 BIOS.
-
- -------
-
- Sysbench 0.9.0 result file created Sat Oct 22 14:31:27 1994
-
-
- Graphics
- BitBlt S->S cpy : 52.640 Mpixels/s
- BitBlt M->S cpy : 15.581 Mpixels/s
- Filled Rectangle : 356.366 Mpixels/s
- Pattern Fill : 90.477 Mpixels/s
- Vertical Lines : 6.233 Mpixels/s
- Horizontal Lines : 9.656 Mpixels/s
- Diagonal Lines : 7.545 Mpixels/s
- Text Render : 18.553 Mpixels/s
- ------------------------------------------------------------
- Total : 73.835 PM-marks
-
- CPU integer
- Dhrystone : 39.800 VAX 11/780 MIPS
- Hanoi : 27.083 moves/25 usec
- Heapsort : 19.290 MIPS
- Sieve : 37.741 MIPS
- ------------------------------------------------------------
- Total : 32.938 CPUint-marks
-
- CPU float
- Linpack : 2.535 MFLOPS
- Flops : 3.572 MFLOPS
- Fast Fourier Tr. : 4.291 VAX FFT's
- ------------------------------------------------------------
- Total : 3.472 CPUfloat-marks
-
- Direct Interface to video extensions - DIVE
- Video bus bandw. : --.--- MB/s (on Warp II, this was ca. 13 MB/s)
- DIVE fun : --.--- fps
- M->S, DD, 1.00:1 : --.--- fps
- M->S, DD, 2.00:1 : --.--- fps
- M->S, DD, 2.43:1 : --.--- fps
- ------------------------------------------------------------
- Total : --.--- DIVE-marks
-
- Disk I/O - disk 2: 528 MB
- Average seek time : 16.852 ms
- Transfer speed : 1.990 MB/s
- ------------------------------------------------------------
- Total : 1.465 DiskIO-marks
-
- Memory
- 5 kB copy : 61.561 MB/s
- 10 kB copy : 49.211 MB/s
- 20 kB copy : 33.167 MB/s
- 40 kB copy : 25.707 MB/s
- 80 kB copy : 25.571 MB/s
- 160 kB copy : 17.578 MB/s
- 320 kB copy : 15.526 MB/s
- 640 kB copy : 13.385 MB/s
- 1280 kB copy : 11.941 MB/s
- 5 kB read : 70.885 MB/s
- 10 kB read : 42.156 MB/s
- 20 kB read : 42.970 MB/s
- 40 kB read : 32.170 MB/s
- 80 kB read : 31.747 MB/s
- 160 kB read : 21.777 MB/s
- 320 kB read : 19.533 MB/s
- 640 kB read : 17.150 MB/s
- 1280 kB read : 15.710 MB/s
- 5 kB write : 50.263 MB/s
- 10 kB write : 47.512 MB/s
- 20 kB write : 49.802 MB/s
- 40 kB write : 50.763 MB/s
- 80 kB write : 48.561 MB/s
- 160 kB write : 47.028 MB/s
- 320 kB write : 44.140 MB/s
- 640 kB write : 44.034 MB/s
- 1280 kB write : 42.258 MB/s
- ------------------------------------------------------------
- Total : 28.007 Mem-marks
-
-
-
-
-
-