- The DATAMAGE data management system: Comments on execution speed.
-
- Speed of execution is a topic of interest in any program that performs
- significant tasks. Databases can be said to live or die by their ability to do
- MASSIVE processing in literally no time flat. Many of the programs that
- compete with DATAMAGE waste incredible amounts of disk space and include
- methods that render their files and indexes impenetrable to any other program
- in order to accomplish, or to SEEM to accomplish, this impossible goal.
-
- There are ways to speed things up, and it is the purpose of this document to
- explain them. These methods fall into three basic categories: Get a fast
- computer, use the computer's memory instead of the disk drive, and keep your
- files all in one place on the disk drive.
-
- HARDWARE:
-
- Certainly, the raw power of the target computer is a very significant
- ingredient in the equation. If you are using an old PC or PCXT whose
- microprocessor is the original 8088 running at 4.77 MHz, you will be quite
- dissatisfied with the performance of this or any other program. (A MHz, by
- the way, is one million clock cycles per second.) If you cannot afford better
- hardware, one thing that will enhance your current computer is to pull the
- 8088 chip out and replace it with a NEC V-20. This will almost double your
- processing speed, but your computer will still be, in modern terms, VERY SLOW.
-
- The original PCAT, running at 6 MHz, and the later "enhanced" version of same
- running at 8 MHz, were fast machines in their day. Their time has passed. A
- modern AT-class machine should run at a minimum of 12 MHz, and hardware is
- available that runs all the way up to 44 MHz! There is little, if any,
- difference between an 80286, an 80386 or even the latest 80486 microprocessor
- when shuffling datafiles. The CLOCK rules the computer, and dictates its
- speed. No other enhancement can match a faster clock for raw speed.
-
- SUBSTITUTING MEMORY FOR THE DISK DRIVE:
-
- Memory above MS-DOS's 1,048,576 byte (1 MEGABYTE) limit is available ONLY on
- AT-class machines, or with special E.E.M.S. cards on XT-class machines. MS-DOS
- cannot use this memory, but your computer can make use of it via DRIVERS which
- switch the microprocessor into its native mode, make use of the memory, then
- switch it back into 8088 emulation mode and continue to execute the program
- that runs under MS-DOS. That's right - when you are running MS-DOS your
- microprocessor, be it a '286, '386 or '486, is NOT using its full
- instruction set, but is emulating an 8088. In native mode the newer
- microprocessors CANNOT run MS-DOS, nor can MS-DOS access or manage memory over
- 1 megabyte.
-
- There are two types of drivers that can add significant speed to DATAMAGE, or
- any other program that makes heavy use of the disk drive: RAM DISKS and DISK
- CACHING. These drivers allow the computer to use memory above the 1 megabyte
- limit and substitute it for the slower disk drives.
-
- With a RAM DISK, a "fake" disk drive is added to your computer. You can use
- it just as you would any real drive; you can do everything except remove and
- replace the disk, which is also not possible with a hard disk. DATAMAGE will
- prompt you for the disk drive to use for various tasks. If you have sufficient
- memory you can set up a RAM disk and greatly profit in terms of speed.
-
- With the DISK CACHING drivers, the data in the files currently open on your
- computer is read from the disk where it resides and copied into memory above
- the 1 megabyte limit. When you read or write data it is moved from/to this
- memory, which takes quite a bit less time than doing the same operation on the
- disk.
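-
- To make the idea concrete, here is a minimal sketch of a read cache in Python.
- It is NOT how any particular caching driver works: the class BlockCache, the
- stand-in function fake_disk_read and the rule of discarding the least recently
- used block are all assumptions made for illustration.

```python
from collections import OrderedDict


class BlockCache:
    """Tiny read cache: keeps the most recently used disk blocks in memory."""

    def __init__(self, read_block, capacity):
        self.read_block = read_block   # function that performs the slow disk read
        self.capacity = capacity       # number of blocks the cache may hold
        self.blocks = OrderedDict()    # block number -> data, least recent first
        self.hits = 0
        self.misses = 0

    def get(self, number):
        if number in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(number)    # mark as most recently used
            return self.blocks[number]
        self.misses += 1
        data = self.read_block(number)         # slow path: go to the disk
        self.blocks[number] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # discard the least recently used
        return data


def fake_disk_read(number):
    # Hypothetical stand-in for a real disk read, for illustration only.
    return b"block %d" % number


cache = BlockCache(fake_disk_read, capacity=2)
for n in [1, 2, 1, 1, 3, 2]:
    cache.get(n)
print(cache.hits, cache.misses)
```

- In this sample run two of the six reads are answered from memory and never
- touch the "disk" at all. The more often a program returns to the same blocks,
- the larger that share becomes.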
-
- BENCHMARKS:
-
- Before beginning it may be helpful to explain that the standard benchmark of PC
- execution speed has become Peter Norton's SYSINFO program. The version of
- SYSINFO used for these illustrations was 3.0. This program assigns a numeric
- rating to the computer tested, 1.0 being the speed of a standard PC running an
- 8088 at 4.77 MHz. So, 1.0 is DEAD SLOW.
-
- In order to demonstrate just how much speed can be gained by the various
- methods detailed above, here are some operations, done with and without disk
- caching and at different clock speeds. They were all done on the same
- computer, a CHIPSET (NEAT) '286 running at 10 or 20 MHz. This machine is
- certainly not the fastest computer that money can buy, but it ain't too shabby.
-
- This machine consistently attains SYSINFO ratings of 11.5 or 15.5 (depending on
- external or internal bus timing) at its 10 MHz speed and 23.0 at its 20 MHz
- speed. There are utilities that will slow a computer down in order to play a
- video game designed for the older, slower hardware. One such program,
- VARISLOW.EXE, was used to decrease the speed of the computer used for testing
- as much as possible, to a SYSINFO rating of 3.6 or 4.0.
-
- Of the many operations that can be done by DATAMAGE, sorting the records into
- order is the most demanding and time-consuming. The sorts were, therefore,
- used for the benchmark tests. The datafile used for the benchmark tests
- comprises 3,332 records. The size of this file after it was converted to the
- DATAMAGE format (DATAMAGE also found and rejected all the duplicate records it
- contained during the conversion!) is 2,050,512 bytes. It was imported from the
- dBase format, and is distributed with the SHAREWARE MARKETING SYSTEM from Jim
- Hood. Thanks, Jim!
-
- HARDWARE:
-
- For the hardware test the records were first arranged into an order that was
- the same for each test, and completely random with respect to the desired
- order, then sorted on RECORD NUMBERS. The record numbers are held in the
- computer's memory, so the disk drive was not accessed.
-
- SI RATE    1ST 1,000    2ND 1,000    3RD 1,000    TOTAL TIME
- ============================================================
-    3.6        1:15         3:20         5:25         12:11
-   15.5         :23         1:00         2:23          3:09
-   23.0         :14          :36          :58          2:11
-
- As you can see, the few extra dollars spent to procure a modern, fast computer
- PAY OFF! As I learned long ago: There's NO economy in buying junk! And
- nothing demonstrates this axiom more clearly, or more frequently, than
- computer hardware.
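-
- For readers who want to put a number on the pay-off, the short Python sketch
- below turns the total times from the hardware table into speed-up factors
- relative to the slowest setting. The helper names to_seconds and speedup are
- made up for this example; only the times themselves come from the table.

```python
def to_seconds(text):
    """Convert a time written as 'h:mm:ss', 'm:ss' or ':ss' into seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part or 0)   # an empty part counts as 0
    return seconds


def speedup(slow, fast):
    """How many times faster the 'fast' run is than the 'slow' run."""
    return to_seconds(slow) / to_seconds(fast)


# (SYSINFO rating, total time) pairs copied from the hardware table.
totals = [(3.6, "12:11"), (15.5, "3:09"), (23.0, "2:11")]
for rating, total in totals:
    print(rating, total, round(speedup("12:11", total), 2))
```

- By these totals the 23.0-rated setting finishes the sort between five and six
- times faster than the 3.6-rated setting.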
-
- HARDWARE and DISK CACHING:
-
- Even more difficult than the sorting of numbers already in the computer's
- memory is the sorting of string data. This must be fetched from the disk drive
- as the sort progresses, and takes far longer to compare than numeric data.
-
- The same file contains a string field: COMPANY NAME. The file was arranged in
- an order that placed the company names as nearly at random as possible within
- the file, then sorted into alpha order. As I watched the sort progress I
- realized that each and every record was moved.
-
- DBASE AND SORTING ALPHA:
-
- While completing this portion of the documentation I asked a young friend who
- has, like a lot of other people, been FAKED OUT by dBase, how long it would
- take the industry standard to complete this task. I pointed him to the file in
- dBase format, and he got right back to me and, with swollen chest and in a VERY
- loud voice, informed me that dBase did this in less than a minute.
-
- I KNEW BETTER!
-
- So, I went over there and asked him to do it for me. He gave dBase the
- commands and it did SOMETHING in less than a minute. What it did was to create
- a whopping 355K b-tree file containing the data in the company name field. It
- did NOT sort the records nor, apparently, is it able to do so!
-
- As we attempted to USE this index we noticed that, while getting the next
- record in the order, there was a delay. This time was spent by dBase in
- climbing its b-tree in order to FIND said record.
-
- DATAMAGE, on the other hand, produced a MARKER file of only a little over 6K.
- The reading of the next record in the order was INSTANTANEOUS, as the MARKER
- file contained its location in the file. DATAMAGE did ALL of the work
- involved in the sort at one time. You could use the results of this work for
- the next thousand years and not have to wait one additional nanosecond to see
- your records in the desired order.
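-
- The MARKER file format is not described in this document, so the following
- Python sketch is only a guess at the general idea: store the record numbers,
- already in sorted order, as a flat list of small integers. The function names
- and the 2-byte entry size are assumptions, not DATAMAGE internals.

```python
import struct


def build_marker(keys):
    """Pack the record numbers, arranged in sorted-key order, as 16-bit ints."""
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    return struct.pack("<%dH" % len(order), *order)


def nth_record_number(marker, n):
    """Return the n-th entry (0-based): one direct lookup, no tree to climb."""
    (number,) = struct.unpack_from("<H", marker, 2 * n)
    return number


names = ["Zeta Corp", "Acme", "Midway Inc"]   # made-up company names
marker = build_marker(names)
print([names[nth_record_number(marker, n)] for n in range(len(names))])

# Size of such a marker for a file of 3,332 records, at 2 bytes per entry.
print(len(build_marker([""] * 3332)))
```

- All the sorting work happens once, inside build_marker. After that, finding
- the next record in order is a single read at a known position. Under the
- 2-byte assumption a marker for 3,332 records comes to 6,664 bytes, which is in
- the same range as the "little over 6K" quoted above, though the real format
- may well differ.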
-
- In the immortal words of one Laut Su (Chinese prophet circa 1,000 A.D.): "The
- foremost thoughts in a wise man's mind are these: THINGS ARE SELDOM AS THEY
- SEEM!" Maybe there is a program that will do a REAL alpha-sort faster than
- DATAMAGE. Be that as it may, I will NOT lie or cheat to make my program SEEM
- faster than it is. But soon, maybe next year, I will re-write the BASE program
- in C, and thereby honestly increase its speed.
-
- HONEST TIMES TO SORT THE DATA:
-
- With these times the speed of access to data on disk becomes a very important
- factor, and disk caching begins to have great value. The following table
- shows the effect of the computer's speed as well as the presence of caching.
-
- NORTON   BYTES   1ST    RECS  2ND    RECS  3RD    RECS  TOTAL    RECS
- SI RATE  CACHED  1000   /SEC  1000   /SEC  1000   /SEC  TIME     /SEC
- =====================================================================
-     4.0       0  20:14   .78  29:53   .55  33:26   .50  1:37:14   .57
-    11.5       0  11:30  1.45  15:45  1.01  17:15   .97    50:13  1.10
-    11.5    1024   7:35  2.20   9:01  1.85  10:29  1.59    30:40  1.81
-    23.0       0   9:02  1.83  12:20  1.35  13:17  1.41    39:15  1.41
-    23.0    1024   5:08  3.25   5:29  3.03   7:23  2.26    18:07  3.07
-
- You are quite likely more mathematically inclined than I am. Hordes of
- numeric data could be extrapolated from the above - have fun! Just one point:
- If you have a computer that actually rates 4.0 on SYSINFO you will be
- hard-pressed to match the first line. Though the time-wasting utility slowed
- the microprocessor down, this machine STILL communicates with a drive that's 3
- times as fast as the drive that came with the original PC or XT, down a 16-bit
- bus.
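-
- As a starting point for that arithmetic, this Python sketch recomputes the
- overall records-per-second figures from the total times of the string sort and
- the 3,332 records in the test file. The helper names to_seconds and
- records_per_second are invented for the example.

```python
def to_seconds(text):
    """Convert a time written as 'h:mm:ss', 'm:ss' or ':ss' into seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part or 0)   # an empty part counts as 0
    return seconds


def records_per_second(records, elapsed):
    """Average sorting rate for a run that took 'elapsed' (a time string)."""
    return records / to_seconds(elapsed)


TOTAL_RECORDS = 3332

# (SYSINFO rating, cache figure, total time) copied from the string-sort table.
runs = [
    (4.0, 0, "1:37:14"),
    (11.5, 0, "50:13"),
    (11.5, 1024, "30:40"),
    (23.0, 0, "39:15"),
    (23.0, 1024, "18:07"),
]
for rating, cache, total in runs:
    rate = records_per_second(TOTAL_RECORDS, total)
    print(rating, cache, total, round(rate, 2))
```

- The results agree with the last column of the table to within a hundredth of
- a record per second. Dividing the uncached total by the cached total at the
- same rating gives the gain from the cache alone: roughly 1.6 times at 11.5 and
- roughly 2.2 times at 23.0.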
-
- The point should be made, here, that DATAMAGE can, and usually will, sort your
- datafile MUCH faster than this. Look at the times above as sort of a
- worst-case scenario. See the file MAIN.DOC, under the heading F-10, MARKER
- FILE and the sub-heading RESTORE REMAINING RECORDS to discover how to get your
- work done fast. You will only need to sort the raw order ONCE; after that
- you'll make much better time via the use of a MARKER FILE.
-
- KEEPING THE FILES TOGETHER - FRAGMENTATION:
-
- As DATAMAGE files grow your disk drive is bound to get FRAGMENTED. The
- datafiles will grow piece by piece. Say you have a DATAMAGE file that is 100K
- in length; you enter several documents with your word processor, download a
- couple hundred Kbytes with your modem, then add some records to your DATAMAGE
- file. All the things that happened in between will BE in between on your disk
- drive, resulting in DATAMAGE having to hunt all over the drive to read the
- file.
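-
- One way to picture this is to count how many separate pieces a file occupies.
- The Python sketch below does so for a list of cluster numbers. The function
- count_fragments and the cluster numbers are invented for illustration; a real
- defragmenter reads this information from the disk's allocation table.

```python
def count_fragments(clusters):
    """Count the runs of consecutive cluster numbers that make up a file."""
    if not clusters:
        return 0
    fragments = 1
    for previous, current in zip(clusters, clusters[1:]):
        if current != previous + 1:
            fragments += 1   # the drive must seek to a new place on the disk
    return fragments


contiguous = [10, 11, 12, 13, 14, 15]
scattered = [10, 11, 40, 41, 95, 96]   # other files were written in between
print(count_fragments(contiguous), count_fragments(scattered))
```

- Both files hold six clusters, but the first sits in one piece and the second
- in three. Every extra piece costs a seek, and that is the time a defragmenter
- wins back.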
-
- There are utilities galore to defragment hard disks. Two GOOD ones that are
- available as SHAREWARE from your disk vendor or on a local BBS are DOG
- (DiskOrGanizer) and SST. DOG is really safe, but also kinda slow. SST is
- really fast but, should your power fail during the session, bye bye disk!
-
- Commercial programs such as NORTON UTILITIES also include defragmentation in
- their bag of tricks. Whatever your choice of vehicle you should defragment
- your hard disk AT LEAST once a month, and for heavy usage once a week. Doing
- so will have you working more and waiting less.