home *** CD-ROM | disk | FTP | other *** search
Text File | 1988-07-31 | 76.0 KB | 1,453 lines |
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- TBTREE13
-
- ( Turbo BTree version 1.3 )
-
-
-
- Yet another implementation of BTrees for Turbo Pascal 4.0
-
-
- Copyright (c) 1988 Dean H. Farwell II
-
-
-
- All source code and documentation are considered to be shareware and
- are copyrighted. The following rules of engagement apply to all
- documentation and code which make up TBTREE13:
-
- All users can and are encouraged to distribute this software freely.
- This includes the uploading of this software onto applicable Bulletin
- Boards etc. The entire TBTREE13 product must be distributed as a
- complete product. Pieces can not be distributed individually. The
- documentation must be distributed with the source code. No copyright
- statements may be modified.
-
- All users can and are encouraged to use these utilities to develop
- and/or enhance their private, shareware, or commercial software
- applications.
-
- All users are encouraged to suggest enhancements, make suggestions and
- generally lead to the development of a better product to the benefit
- of all. Users may change the code and documentation as desired for
- their own use but may not then distribute those changes.
-
- Users may distribute these routines as part of another SHAREWARE
- package. For instance, a user could develop a DBMS using this as a
- basis and can freely distribute the source code provided that:
-
- 1. None of this documentation or code is changed
- 2. ALL code and documentation is distributed (not just parts)
- 3. The user is not selling or distributing this for profit
- 4. The user be registered.
-
- The restriction that the code and documentation not be changed is to
- ensure that I can maintain some control over version releases. I do
-
- 1
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- not want several versions running around that I had nothing to do
- with.
-
- The bottom line is that you can not sell this or distribute it for
- profit (except for a small fee not to exceed $10 to cover cost of
- mailing, duplication and handling). A user developing a commercial
- application can not distribute the SOURCE code or documentation but a
- user can use these routines to compile a commercial application.
-
- Users are encouraged to register their copy of this software. The
- registration fee is a very nominal $15. The registration is not going
- to make me rich but will let me know if anybody out there is at all
- interested in this. I intend to support this product and add many
- capabilities. I am looking for registered owners to provide feedback
- which will help me prioritize enhancements. If few people register
- then I will assume that there is not wide spread interest in this
- product.
-
- Anyone using this software to develop a commercial application must
- register and state their intentions to use it in a commercial
- application. I will not charge royalties (except for the registration
- fee). I just want to know what people are doing with it. Also, if
- someone is distributing it as part of a shareware product, the same
- restrictions apply.
-
- Most of all I want any feedback that you have the time and desire to
- submit. Suggestions, complaints, etc are welcome. I especially want
- feedback on errors. I do not know of any bugs (or I'd fix them) but I
- probably missed an item or two. Send me notice immediately so that I
- can distribute a fix if required. Since you have the source code, do
- your best to isolate the problem. I'll try to take it from there.
-
- To reach me use Compuserve ID - 73240,3335 or drop me a line at:
-
- 1834 Windsor Downs Ct
- Montgomery AL 36117
-
- My phone number is:
-
- 205-279-9552
-
- For hopefully obvious reasons, I will not accept collect calls.
- Otherwise, feel free to call.
-
- As a final motivation to register, all registered users will receive
- one upgrade free (via mail - no postage due or anything!) when it
- becomes available. I will mail these well before I put the version
- out on CompuServe or anywhere else. The registered users will, in
- effect, become beta testers to make sure that all is well before I
- release to the world. After the first upgrade, the user can request
- to receive the next upgrade for a nominal $5 to cover postage , etc.
- This fee will ensure that you always have the latest version and is
- probably less than the cost to download off of a bulletin board. I
- will do my best to not release trivial changes but only significant
- upgrades.
-
- Liability:
-
-
- 2
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- This software is not warranted. There are no warranties expressed or
- implied as to the quality, correctness, or applicability of this
- software. The author is not responsible for any loss or damages due
- to the use of any software or documentation distributed as TBTREE13.
-
- Product Description
-
- TBTREE13 is version 1.1 of a shareware btree and database product for
- use with Turbo Pascal 4.0. It is intended to give a user the
- capabilities to develop pascal applications for the creation and
- manipulation of databases. This product is extremely powerful and
- FAST!!! However, it is not the easiest thing to get a grasp on. If
- you understand the concept of btrees or at least indexing then this
- will not be complicated to use. I guarantee that it is the best
- product of its type for the money and maybe the best period. If you
- have any feedback on how I can take the mystery out of using this
- please let me know.
-
- I am presently working on a second product which will be called TBASE
- and will use these routines as the database engine but will have a
- much easier to understand interface for code developers. It will use
- SQL like calls instead of those used in this product. All registered
- users of TBTREE13 will be shipped a beta version of TBASE as soon as
- one becomes available and will get the first whack at using it. This
- in itself is worth the $15.
-
- Although this product provides many of the same general database
- capabilities as Borland's Turbo Pascal Database Toolbox it is not
- intended to be compatible. There are several reasons for this. First,
- I do not want to create any impression of any type of copyright
- infringement. Secondly, I needed to implement things differently than
- they were implemented in Borland's product to be able to do the
- enhancements I am planning. Thirdly, I wanted to implement things
- that Borland does not. For instance, I support data types in my
- indexes. Other differences will be more apparent as you read through
- this documentation. Lastly, I did not have a copy of Borland's
- product when I was doing this. This really started out as more of an
- academic exercise than something that I was going to release for
- others. This turned out to be a more extensive job than I thought
- (big surprise there, huh?). I figured that I might as well make it
- available for others.
-
- This product has been greatly enhanced since version 1.0 was released.
- As it presently stands, it has the following features:
-
- 1. The ability to create data files and indexes for those files.
-
- 2. Any number of indexes can be used for a single data file.
-
- 3. Any number of data files can be used.
-
- 4. All data files and index files use a virtual memory scheme in which
- the most recently used pages are kept in memory using the heap. This
- is different than many other implementations which use a buffer or
- stack only for index pages. The buffer is shared by all files. In
- this way separate space does not need to be set aside for different
- files. Space utilization is maximized and performance is enhanced.
-
-
- 3
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- 5. All management of pages in memory is automatically controlled
- although the user can exercise limited control if desired.
-
- 6. Routines are provided to dump the page buffer to the printer to
- facilitate viewing pages if a user is curious or masochistic.
-
- 7. The size of the buffer is determined by the user and can be changed
- dynamically as the program runs. For example a 128K buffer could be
- used initially. If a user had a large heap requirement the buffer
- size could be reduced. The buffer size could later be increased if
- desired. This can all be done while the application is running.
-
- 8. Problems related to keeping track of numbers of open files are
- alleviated. The user initially specifies the maximum number of files
- allowed to be open at once and never has to address it again.
- Routines are provided to manage the opening and closing of files
- dynamically. This increases performance because files do not have to
- be continually opened and closed unnecessarily.
-
- 9. All data types supported by Turbo Pascal are explicitly supported.
- In other words, values do not have to be converted to strings. This
- is a great improvement over previous systems (MY UNBIASED OPINION).
- For example an index can be set up for type Byte, another for type
- ShortInt, another of type String and even for type Real (but wait you
- say .. Reals in an index?). See next note!
-
- 10. Indexes can be searched for equality, inequality, greater than,
- greater than or equal to, less than, less than or equal to, and
- existence. In other words you can ask for a list of all records for
- which some field of type real is LE (less than or equal to) XXX where
- XXX is a var of type real and XXX := 3.141592 etc. An index can also
- searched on a range of values. For instance an index of type byte can
- be checked for a range >= 10 and < 20 etc. Indexes containing strings
- can be searched for partial string matches. Partial string matches
- can be done three ways: find a string which is at the beginning of a
- string, anywhere in the string, or at the end of a string.
-
- 11. As physical records are deleted space is reused automatically on
- future inserts. Files should never have to be reorganized to recover
- space (unless a large number of deletes were performed without any
- inserts and even in this case there is not really a good reason to
- reorganize unless your disk is filling up).
-
- 12. Included with the package is a very crude but effective source
- code printing routine which can be used to dump the code in a more
- usable format. Specifically, it handles page breaks and prints a
- header of useful information.
-
- 13. The user does not need to perform any calculations or make any
- estimates prior to data file or index creation.
-
- 14. Code was designed using an Object Oriented Design approach to a
- large extent (Turbo 4.0 lends itself to this approach much more than
- Turbo Pascal 3.0 did). This is one reason that there are so many
- units. Much of the code is reusable for other applications.
-
- 15. Added to the product is a sorting capability. Sorts can be
- accomplished using any number of fields in any order of precedence as
-
- 4
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- determined by you. A Quicksort algorithm is used to give high
- performance. Sort fields can be any type supported (Byte, Real,
- Integer, String etc).
-
- 16. Added the following set operations: Union, Intersection and
- Difference. These allow easy queries on multiple fields. For
- example, retrieving all personnel who have a last name starting with
- 'N' and who's job is 'MANAGER', etc is greatly simplified. This
- enhancement is a definite plus and not available in many other
- products.
-
- List Of Files
-
- The following files are provided as part of the TBTREE13 product:
-
- Units - Source Code For All Units
-
- BTREE.PAS - This unit contains the interface and
- implementation for the btrees themselves. Although the
- implementation is fairly complex, only a few routines are
- visible to the user and give the user all the functionality
- required. The actual working of the btree unit itself is
- not critical to the user. The user is provided with
- routines to add an entry to an index, delete one entry from
- an index, delete all entries with a matching key value from
- an index and also retrieve record numbers which match some
- selection criterion. The ability to create an index is also
- provided. To delete an index just delete the file using a
- routine provided in FILES.PAS. The BTREE unit is different
- than any other unit in the product in that it consists of
- not one but three source files. See entry for BTREE1.INC
- and BTREE2.INC.
-
- BTREE1.INC and BTREE2.INC - These are additional source
- files needed for the BTREE unit. I split up the source code
- for the unit because the source code ended up greater than
- 64K bytes. The two additional files are include files and
- will automatically be compiled when BTREE.PAS is compiled.
-
- COMPARE.PAS - This unit contains no inherent object although
- a couple of types are declared and put in the interface. It
- is primarily made of one routine which will give the
- capability of comparing two values of the same type using
- one routine regardless of that type. The address of the two
- values and the type of the values are passed in at run time.
- This routine alleviates the need for the user to develop a
- routine which can be used for comparisons. All Turbo Pascal
- 4.0 scaler types are handled. Strings are the only
- structured (composite) type handled. Several other routines
- were added to facilitate partial string searches.
-
- FILEBUFF.PAS - This unit contains the interface and
- implementation for a files open list. This list will
- contain information on all files opened (using this unit).
- A user can specify how many files can be on the list at once
- (how many the unit can allow to be open). The unit takes it
- from there. From then on all requests to use a file must be
- preceded by a call to the unit. The unit will determine if
-
- 5
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- the file is open and will open it if it is not. The routine
- is exceptionally useful for alleviating the need to open and
- close files for every operation. Continually opening and
- closing files causes unacceptable performance problems. The
- file types handled are only Untyped files and Text files.
- Typed files are not currently handled. This is due to the
- fact that strong type checking precludes the use of typed
- files (or at least I couldn't break the code and figure out
- how to make it work. If anyone can figure out how to make
- it work, it would make this even more useful).
-
- FILEDECS.PAS - This unit is nothing more than a collection
- of declarations required to support several of the other
- units. There is no code in the implementation section.
-
- FILES.PAS - This unit contains the interface and
- implementation for the manipulation of files. This fulfills
- the role of file manager (mid level management of files as
- opposed to the low level management found in FILEBUFF.PAS).
- It is the place where the different file types required in
- TBTREE13 are created, deleted, and names are adjusted
- (extensions are attached). Additionally, there are several
- routines to set records to used and to check the status of
- records and to determine what the last record in use is.
- This is the only place where Bitmap files are manipulated
- (created, referenced, updated and deleted).
-
- HEX.PAS - This unit contains the interface and
- implementation for dealing with a hex array which is a 2
- byte array used for storing a hex character (such as 7F). A
- couple of routines are provided for converting bytes and
- bits to hex array representations.
-
- LOGICAL.PAS - This unit contains the interface and
- implementation for dealing with the conversion of logical
- data records to physical records. In TBTREE13 all physical
- records (records stored on disk) are the same size (512
- bytes). This is how all files can share the same page
- buffer (See PAGE.PAS). However, the logical records (data
- records in the data files) can be any size from one byte up
- to 65520 (although they will not be that large in practice).
- This unit takes the users data records stored in a variable
- and puts them in the number of physical records required.
- It also takes the appropriate data from physical record(s)
- and puts it in the variable of the users choice. A great
- deal of flexibility is available here. One caution however!
- TBTREE13 uses FIXED LENGTH LOGICAL RECORDS. All data
- records (within one data file) must be of the same size
- (obviously each data file can have a different logical
- record size). If variable length data is required then the
- size of the logical record is the largest number of bytes
- required as determined by the user. A good example of this
- is a record containing a string declared as String[20]. The
- logical record size is 21 bytes (one byte for the length).
- 21 bytes should be stored and retrieved even though a string
- such as 'howdy' is stored (howdy uses 6 bytes). Most
- products of this nature work in the same way.
-
-
- 6
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- LRECLIST.PAS - This unit contains the interface and
- implemetation to handle logical record lists. A logical
- record list is a concept which is different than you will
- find in most other implementations of btrees. When a query
- is done on an index, most btrees set a cursor to the node
- where the first matching value is found (equality). This is
- very efficient but does not lead to the flexibility of my
- system. In TBTREE13 I build a list of logical record numbers
- corresponding to the logical records which meet the criteria
- of the query. I can handle all boolean operations such as
- equality, inequality, greater than, greater than or equal
- to, etc. Therefore, I could get a list of all logical
- records which meet some criteria and deal with that list
- independently of the btree. Once I have the list I can
- compare it to other lists and do intersections, unions,
- joins, etc. This gives the user some of the power of a real
- database management system. A user can relate a file to
- itself or to other files. This system is the basis of my
- future work to bring true relational capabilities to
- programs written in Turbo Pascal. It gives the capability to
- do set at a time processing. Unfortunately, nothing is
- free. A query on a database of 10000 logical records on
- existence will give a list of 10000 logical records (40000
- bytes required to store the list). Lists like this take
- time to build. A fact of life is that it takes time to
- process a lot of data in a large database. The user is
- cautioned to structure queries in such a way that lists are
- not huge. Queries on equality are not normally a problem
- (unless the index provides little selectivity). Queries on
- inequality will be faster if a range is used. No matter
- what type of query is used, the page buffer alleviates a
- great deal of performance problems.
-
- MATH.PAS - This unit contains the interface and
- implementation to manipulate bytes at the bit level. Only
- three routines are visible as that is all that I required in
- TBTREE13. This unit has been redone in assembly language to
- enhance performance. See the source listing for further
- details.
-
- MYPRINT.PAS - This routine is a collection of printer codes
- and routines to print the codes. This is sufficient for my
- purposes and is included only because I needed a couple of
- routines for TP4PRINT.PAS and PAGE.PAS. This unit is not
- very elegant and will need to be modified if it does not
- support your printer.
-
- NUMBERS.PAS - This unit contains no code in the
- implementation. It is simply a repository for often used
- types which are not part of the original Turbo Pascal 4.0
- product.
-
- PAGE.PAS - This unit contains the interface and
- implementation to manipulate the page buffer. The page
- buffer is stored internally (heap) in an area set aside to
- hold physical records. This buffer greatly enhances
- performance of the overall product. To function the buffer
- can be set to as small as one page (512 bytes). However,
-
- 7
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- performance is going to be dismal. The amount of buffer
- space is dependent on many factors and the size needs to be
- played with to see what works. One rule stands though ..
- bigger is better! One interesting note - routines are
- provided to change the size (both increase and decrease)
- dynamically as often as desired. There are other routines
- available as well. The user has the option to force a page
- to be written to disk immediately when it is changed (stored
- in the buffer) as opposed to waiting until it gets swapped
- out or all pages are written to disk. Immediately writing
- to disk ensures that nothing gets lost on a system or
- program crash but it has severe performance implications.
- Use of this is not normally required, but the option is
- there and usable.
-
- SETOPS.PAS - This unit contains routines that support set
- operations. They facilitate queries using multiple fields
- from one data file. This greatly reduces the workload on
- the programmer.
-
- SORT.PAS - This unit contains three procedures for sorting.
- This unit was was created in version 1.2 and adds greatly to
- the overall product. It uses a quicksort algorithm to
- provide good performance. Just as importantly, it supports
- all data types supported by TBTREE13 and is extremely easy
- to use.
-
- STRINGS.PAS - This unit contains routines to manipulate
- strings. Only two routines are provided. There is great
- room for expansion of this unit if required.
-
- TIME.PAS - This unit contains the implementation and
- interface to manipulate time arrays. A time array stores a
- four byte time value retrieved directly from RAM. It is
- probably not the best way to do this particular function,
- but it works and is fast. It is essentially only used in
- conjunction with least recently used routines (in PAGE.PAS
- and FILEBUFF.PAS). If anyone experiences compatibility
- problems let me know.
-
- Programs - Source Code For All Programs
-
- TP4PRINT.PAS - This program is provided merely to permit my
- source code to be printed out in a better format than
- provided by the DOS Print command. It looks for page
- control codes imbedded in my source (control codes are
- comments to the compiler) and controls paging. It puts a
- header which includes file name, page and print time. It
- provides top and bottom margins as well. Give it a try by
- printing out one of the source code files to see if you want
- to use it or another source code printer. Syntax is :
-
- TP4PRINT filename where filename includes path if
- source is not in the current
- directory
-
- BUILDTPU.PAS - When this program is compiled, all units
- associated with TBTREE13 will also be compiled. This simply
-
- 8
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- provides a convenient way to build all the units at once
- without compiling them individually.
-
- EXAM1.PAS - This example program demonstrates how to create
- a data file and a couple of indexes. It then demonstrates
- how to insert data into the data file and the indexes.
- Several other routines are demonstrated as well. Important
- to note is that this routine has been changed from version
- 1.2 and prior. It now uses StoreNewLogicalRecord to store
- the record as opposed to using StoreALocicalRecord. Look at
- the source and this may make sense.
-
- EXAM2.PAS - This program shows how to perform retrievals.
- The files created in EXAM1.PAS are used.
-
- EXAM3.PAS - This program shows how to do deletes and
- updates. The files created in EXAM1.PAS are used.
-
- EXAM4.PAS - This program shows how to use routines in
- FILEBUFF.PAS with text files.
-
- EXAM5.PAS - This program shows how to use the retrieval on
- range routine. The files created in EXAM1.PAS are used.
-
- EXAM6.PAS - This program shows how to use the retrieval on
- partial string match routine. The files created in
- EXAM1.PAS are used.
-
- EXAM7.PAS - This program shows how to build/rebuild indexes
- from an existing data file. The files created in EXAM1.PAS
- are used.
-
- EXAM8.PAS - This program shows use of routine to delete
- entries from a logical records list. The files created in
- EXAM1.PAS are used.
-
- EXAM9.PAS - This program shows an example of sorting using
- the SORT unit. The files created in EXAM1.PAS are used.
-
- EXAM10.PAS - This program shows an example of using one of
- the set operations found in the SETOPS unit. The files
- created in EXAM1.PAS are used.
-
- Instructions For Use of TBTREE13
-
- The following is a short set of instructions for performing the more
- common database type operations using TBree11.
-
- Uses Statements - All of the units were compiled with the
- appropriate uses statement included. Therefore, the user does
- not need to be concerned about order when specifying units in the
- uses statements in the user's programs. As you can see in my
- examples, I put the units in alphabetical order. The user only
- needs to put the units which have either type statements or
- routine calls or both which must be visible within the user's
- program. My example programs demonstrate which ones are probably
- required. One method which actually works is to not include any
- initially and let the compiler find the syntax errors. When an
-
- 9
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- error is found because a type or routine is not found then see
- which unit it's in and put that unit in the uses statement. It's
- not a great way to do it, but it works.
-
- Setting Initial Parameters - Several parameters should be set by
- the user although they have defaults if a value is not set by the
- user. All of the parameters are only accessible by calling
- routines supplied to check and set values. The variables
- themselves are not visible to the user. The following parameters
- can be set by the user:
-
- Open Files Allowed - Contained in FILEBUFF.PAS. Determines
- number of open files allowed. See FILEBUFF.PAS for
- details. To set the parameter to a value of 5 do the
- following:
-
- SetMaxOpenFiles(5);
-
- Maximum Buffer Pages Allowed - Contained in PAGE.PAS.
- Determines the maximum number of 512 byte pages to keep in
- memory at one time. Set this to something that makes sense
- for your machine (memory size available). Since this is set
- at run time, you can set this after you determine how much
- memory is available. To set this parameter to 50 pages (25K
- bytes) use the following:
-
- SetMaxBufferPages(50);
-
- One note is that the buffer is initialized to 128K bytes
- which may be too large for your application. You should
- reset the default to whatever you want.
-
-
- Immediate Disk Write - This parameter if set to TRUE will
- force a write of a page to disk immediately after it is
- stored in the page buffer. Otherwise, the page will only be
- written when it gets swapped out or is explicitly written by
- the user (all pages for a file or all pages). To set this
- value to TRUE use:
-
- SetImmediateDiskWrite(TRUE);
-
- Creating Data Files - One of the first things we would want to do
- is create data files to hold our data. This is extremely simple.
- We only need to make one call for each data file. The following
- statements would create a data file:
-
- var
- xxx : FnString; (* holds name of data file *)
-
- begin
- xxx := 'MYFILE'; (* name of file WITHOUT extension *)
- Createfile(xxx,DATA); (* create the file and append the
- extension ('.dat') for a DATA
- file *)
- end;
-
- Notice that the xxx is not a file type but an 80 character string
-
- 10
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- type which holds the name of the file. The file name assigned
- does not initially have an extension. The extension is appended
- automatically when CreateFile is called. Notice that the type of
- file created is a DATA file. This is absolutely the only type of
- file that the user should ever create using the CreateFile
- routine. INDEX files are created a different way. BITMAP files
- and LLIST files are only created by routines internally. These
- two are also deleted internally. Do not fool with these last two
- file types explicitly.
-
- Creating Index Files - The next step is to create one or more
- index files for each data file. You will not normally have a
- data file with no indexes (although for small files, you may want
- to). To create an index file the user must know the type and
- size of the key (item to be stored in the index). The following
- statements will create an index file for the data file created
- above. The index will hold Byte values (values from 0 - 255).
-
- var
- yyy : FnString;
-
- begin
- yyy := 'Index1'; (* notice no extension *)
- CreateIndex(yyy,SizeOf(Byte),xxx,BYTEVALUE);
-
- Notice that the data file name is also passed in. This is stored
- in the header of the index file. I do not presently use it for
- anything, but I think I'm going to need it later. I did not want
- to have to change the parameters needed in later releases. Also
- notice use of the SizeOf routine to pass in the size of the item
- to be stored in the index. This is usually more accurate than
- passing in a constant value. This will especially help avoid
- errors when using strings. Remember, strings use one extra byte
- for the length.
-
- Inserting Data - Once files and indexes exist we can insert data.
- Inserting data consists of two separate steps. The first step is
- to insert the data into the data file. The second step is to
- insert the applicable data into each index which is associated
- with that data file. To insert data we first need to have the
- data record stored in memory somewhere. The most logical place
- to have the data record is in a variable which is of type record
- of some kind. For example, the following is an example of a
- variable declaration which would work:
-
- type
- MyRecord = record
- field1 : Byte; (* for this example this
- is the only field
- indexed (Index1) although
- you could index all fields
- if desired. *)
- field2 : Word;
- field3 : String[10];
- end;
-
- var
- thisRec : MyRecord;
-
- 11
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- lrNum : LrNumber;
-
- We could use the following sequence to insert a record in the
- data file and the index associated with this data file:
-
- thisRec.field1 := 20;
- thisRec.field2 := 60000;
- thisRec.field3 := 'Hi Mom';
-
- lrNum := StoreNewLogicalRecord(xxx,thisRec,SizeOf(thisRec));
-
- InsertValueInBTree(yyy,lrNum,thisRec.field1);
-
- Notice that StoreNewLogicalRecord was used because it will find
- an unused logical record number and assign it to this new record.
- This logical record number is the one inserted (along with the
- associated value) into the BTree. This is a change form versions
- prior to version 1.3. This simplifies things and makes stroring
- much more straightforward.
-
- Retrieving Data - Retrieving data consists of making a query to
- an index. The query is a procedure call which will build and
- return a list of logical records which match the selection
- criteria. The query conditions are defined in NUMBERS.PAS and
- are of type Condition. Once the list is retrieved, the user can
- step through the list starting at either end and going in either
- direction. The logical record numbers are stored in the index in
- such a way that their corresponding key values (values in the
- index) are in ascending order. A logical record list created
- using an index will reflect this order. If there are multiple
- occurences of the same key then the last one added is in the
- front. As previously noted, the list can be traversed front to
- back (ascending) or back to front (descending). A cursor is kept
- internally in the list to keep track of where the user is in the
- list. The user can not get to the cursor but can set it to the
- front or back and move it forward or backward (by retrieving
- values from the list). Normally, to traverse the list, set the
- cursor to the front of the list and then set up a loop which will
- traverse the list and terminate when a logical record number of 0
- is returned. The list is not disturbed as you move through it.
- It will exist until you destroy it or until the program ends.
- If the program ends before the list is destroyed explicitly by
- the user then there may be a file on disk with an extension of
- '.lrl'. If you terminate and find any of this type of file
- laying around then you did not destroy it prior to ending the
- program. You should fix this as leaving a list hanging around
- just takes up space and does nothing productive. There is a
- limit of 99 lists which can coexist. This is such an
- exceptionally large number that if you ever exceed it there is an
- error somewhere. Therefore, I did not leave provisions to
- terminate gracefully if this is exceeded. An example of a call
- to retrieve data follows:
-
- var
- key : Byte;
- lrLst : LrList;
- lrNum : LRNumber;
-
-
- 12
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- begin
- key := 20;
- GetValuesFromBTree(yyy,key,LE,lrLst);
- (* lrLst is the newly created list *)
- lrNum := GetFirstLr(lrLst); (* set cursor to front *)
- while lrNum <> 0 do (* traversal loop *)
- begin
- GetALogicalRecord(xxx,lrNum,thisRec,SizeOf(thisRec));
- (* we can not do whatever with the record *)
- .
- .
- .
- lrNum := GetNextLr(lrLst);
- end; (* this will continue until end of list *)
-
- Versions 1.1 and later have the added capability to retrieve
- records for a range of values and retrieve records on a partial
- string match. An example of each follows:
-
- var
- key1,
- key2 : Byte;
- lrLst : LrList;
- lrNum : LRNumber;
-
- begin
- key1 := 10;
- key2 := 75;
- GetRangeFromBTree(yyy,key1,GE,key2,LE,lrLst);
- (* build list of logical
- record numbers which
- are between 10 and 75 *)
-
- (* list is in lrLst *)
- lrNum := GetFirstLr(lrLst);
- while lrNum <> 0 do
- begin
- GetALogicalRecord(xxx,lrNum,thisRec,SizeOf(thisRec));
- (* we can not do whatever with the record *)
- .
- .
- .
- lrNum := GetNextLr(lrLst);
- end; (* this will continue until end of list *)
-
- ------------------------------
-
- var
- keyString : String[10];
- lrLst : LrList;
- lrNum : LRNumber;
-
- begin
- keyString := 'C';
- (* assume zzz is index holding values of type STRINGVALUE *)
- GetSubstringFromBTree(zzz,keyString,ST,lrLst);
- (* build list of logical
- record which begin with
-
- 13
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- 'C' *)
- (* list is in lrLst *)
- lrNum := GetFirstLr(lrLst);
- while lrNum <> 0 do
- begin
- GetALogicalRecord(xxx,lrNum,thisRec,SizeOf(thisRec));
- (* we can not do whatever with the record *)
- .
- .
- .
- lrNum := GetNextLr(lrLst);
- end; (* this will continue until end of list *)
-
-
- Deleting Data - Deleting data is the inverse of inserting data.
- There are two steps involved. The first step is to retrieve the
- record. This is only required if we do not know the values for
- all fields which have a corresponding index. Once we have the
- record we can delete the entries from the index(es). The second
- step is to delete the logical record from the data file. For the
- following example assume that the same definitions used above
- apply but that field2 also has an index with the file name
- residing in a variable zzz. Also assume that we know that we
- want to delete logical record number 24:
-
- GetALogicalrecord(xxx,24,thisRec,SizeOf(thisRec));
-
- DeleteValueFromBTree(yyy,24,thisRec.field1);
- DeleteValueFromBTree(zzz,24,thisRec.field2);
-
- DeleteRecord(xxx,24); (* delete value from data record *)
-
- Updating Data - There are no separate commands to update data.
- Updating is again a two step process. The first step is to
- fetch the appropriate data record(s). Once this is accomplished,
- the appropriate fields can be updated and the data file can be
- stored. A third step may be required. If any indexed fields
- (fields which have an index associated with them) are changed
- then the old value will have to be deleted and then the new value
- will have to be inserted into the index. Rather than show an
- example here, I refer you to EXAMPLE3.PAS. I noticed that in
- that example, I store the updated data record after updating the
- index rather than before. The order of these two operations is
- not important.
-
- Deleting Files - The operation of deleting a file is not going to
- be done very often but will be necessary from time to time. It
- is very important that the routine provided be used as opposed to
- going to DOS and doing a delete. The provided routine does some
- necessary cleanup work. Specifically, it gets rid of the
- appropriate BITMAP. The following is an example:
-
- DeleteFile(yyy);
-
- Notice that this routine is used by the user to delete both index
- and data files. Deleting a data file will not automatically
- delete the associated index files.
-
-
- 14
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- Sorting - Sorting can be done on any number of fields as you
- desire. Sorting requires two steps. The first is to build a
- sort field list, a list containing information on each of the
- fields to sort on. The second step is to use this sort field
- list to sort a logical records list. After sorting is completed,
- the user should destroy the sort field list using the provided
- routine unless the list will be required later. An example
- follows:
-
- type
- MyRecord = record
- field1 : Byte; (* for this example this
- is the only field
- indexed (Index1) although
- you could index all fields
- if desired. *)
- field2 : Word;
- field3 : String[10];
- end;
-
- var
- lrLst : LrList;
- sPtr : SortFieldList;
-
- begin
- GetValidLogicalRecords(xxx,lrLst); (* list of all records *)
-
- sPtr := NIL; (* EXTREMELY CRITICAL !!! - you must pass in a
- pointer variable whose value is NIL. If
- you forget this little step all kinds of
- bad things will probably happen!!! *)
-
- AddFieldToSortList(sPtr,4,STRINGVALUE);
- AddFiledToSortList(sPtr,2,WORDVALUE);
- SortList(lrLst,xxx,sPtr,SizeOf(MyRecord));
- DestroySortList(sPtr); (* retrieve the heap space *)
- end;
-
- Notice that the second parameter is the BYTE position of the sort
- field within the data record. It is not the field number. This
- is extremely important!! See SORT.PAS for details.
-
- Building/Rebuilding Indexes - Once in awhile you may manage to
- wipe out an index or in some other way cause it to be out of
- whack with your data file. Or, you may want to add an index to a
- field which previously did not have an index. This can easily be
- accomplished using a routine added in version 1.1. Rather than
- repeating the example here, please refer to EXAMPLE7.PAS which
- will show how to build a list of existing records and rebuild an
- index from that list. The added routine will also allow you to
- build a list of all logical records associated with a data file
- without using an index.
-
- Viewing Status Information - I have included several routines for
- checking on the values of certain parameters and for allowing the
- user to delve into the depths if desired. The first routine to
- discuss is one which allows you to find out how many entries are
- in a logical record list once it has been created. To accomplish
-
- 15
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- this the following call is used:
-
- cnt := GetCountLr(lrLst); (* lrLst is your variable where
- the list resides *)
-
- A second valuable utility is a routine which will return the
- number of files which are presently open using the FILEBUFF unit.
- In other words this returns the number of files which are on the
- files open list. This is not necessarily the number of files
- which DOS thinks is open. Only files opened using the create
- file or open file routines provided in FILEBUFF.PAS will be
- included in this number. Not included will be other files opened
- by the user and the files such as standard input and output etc.
- An example of a call is below:
-
- cnt := GetNumberOpenFiles;
-
- Several routines are provided in FILES.PAS which have not been
- discussed yet. The first is RecordUsed. This function will
- check to see if a given record is used. It is probably only
- useful to a user for checking to see if a logical record is in
- use (data record). Using it will not cause any problems, but see
- FILES.PAS to be sure that you understand what this routine is
- telling you. The other routine of interest in FILES.PAS is
- FixFileName. This is very important for appending the correct
- file extension to the file name. Remember, when a file is
- created, the correct extension is appended to the name
- automatically. If a program does not create a file but uses one
- that already exists then the file name residing in the users
- variable (of type FnString) must have the correct extension.
- Instead of leaving that to you I have included FixFileName. When
- you start up a program, call FixFileName for each data file and
- index file you will use. This will ensure that there will not be
- problems with the wrong extension. An example is below:
-
- var
- xxx : FnString;
-
- begin
- xxx := 'MyFile';
- FixFileName(xxx,DATA);
-
- The largest number of utilities lies in PAGE.PAS. The first
- routines of interest are the ones for writing part or all of the
- buffer to disk. The demand paging used in the PAGE.PAS unit is
- critical to the performance of TBTREE13. Since a large number
- of pages are in the memory buffer, the need to constantly go out
- to disk is reduced. Disk seeks (head movement) is what slows
- down most database operations. The secret is to do everything
- you can to alleviate the need to go out to disk. Memory is many
- times faster than external storage. However, if changes are made
- to pages in the buffer we want to ensure that they do not get
- lost. We need to write them to disk at some point. Normally, a
- page will stay in memory until another page needs the space. The
- unit uses a least recently used algorithm which in nothing earth
- shattering (but it works). When a page needs to be swapped out,
- a check is made to see if it is dirty (changes have been made
- since it was read in or last written to disk). If it is clean no
-
- 16
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- problem. If it is dirty it is written to disk prior to allowing
- the page to be reused. The user can force pages to be written
- out at any time. One method is to set a parameter which forces a
- write to disk anytime a page is written to in memory.
- Essentially, a page will never be dirty this way. To accomplish
- this the SetImmediateDiskWrite routine is used. The user can set
- this parameter to TRUE or FALSE. FALSE is the default, but the
- user should probably make an explicit call anyway on
- initialization so there is no doubt or confusion. If FALSE is
- set, the user may still want to periodically write pages to disk.
- The user can write all pages for a given file or all pages in the
- buffer. For Example:
-
- WriteEntireBufferToDisk;
-
- or
-
- WriteBufferToDisk(xxx);
-
- The user should always write all pages to disk prior to
- termination. Otherwise, changes made to pages still in memory
- will be lost. A handy way to do this is to call it from an exit
- procedure. See EXAMPLE2.PAS for an example.
-
- A second set of routines are provided to get rid of (release)
- pages in the buffer. The pages should be written to disk prior
- to release. These routines can be dangerous. I need them
- internally for operations such as changing the size of the buffer
- dynamically and also for deleting a file. They are visible and
- free to use, but understand what you're doing first. If you're
- looking for a convenient way to wreck havoc with your data these
- two are perfect candidates. The only reason I can forsee why you
- would want to attempt to use these is to release some pages for
- files you won't need for a while to leave room for an operation
- which will need a lot of pages. Before releasing write the
- affected pages to disk!! There is not really a need to micro
- manage the resources like this and they can lead to problems if
- misused. Accidentally, losing a node (page) in the middle of an
- index will leave it in an unrepairable state and will ruin your
- whole day.
-
- A third set of routines are much friendlier and I can stay off my
- soap box. These routines manipulate and check the size of the
- buffer. All values are in terms of pages. A page is 512 bytes.
- To digress slightly, 512 worked well because a smaller number
- caused inefficiency in the least recently used page algorithm and
- also meant that index keys had to be smaller. A larger number
- means that less pages can be in memory at once. Do not try to
- change this constant as a lot of things are affected. Just
- changing the constant will not work properly. This is due to the
- fact that a constant declaration can not be declared using a
- simple expression. For example:
-
- const
- xxx = 10;
- yyy = xxx/2; (* illegal *)
-
- Therefore, other constants that depend on xxx have to be set
-
- 17
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- explicitly. In the above, yyy would have to be set to 5.
- Anyway, back to the matters at hand. To set the size of the
- buffer the following call should be made:
-
- SetMaxBufferPages(50); (* 25K bytes *)
-
- If a user does not set the page limit explicitly, the value will
- default to one page. Everything still works, but doing this will
- pretty well define slow. A minimum of 10 pages is required to
- have this work well at all. One hundred or so pages is
- preferred. A great deal depends on your application as well! A
- user can check this value at any time by doing the following:
-
- cnt := CheckMaxBufferPages;
-
- The user can check the number actually in use by doing the
- following:
-
- cnt := CheckPagesInUse;
-
- One routine not mentioned earlier is CheckImmediateDiskWrite which
- returns TRUE if dirty pages will be immediately written to disk,
- FALSE otherwise.
-
- One interesting routine is PrintPagebuffer. This routine is one
- that I developed for my own debugging purposes and is included to
- allow you to see what the page buffer looks like. One warning
- though, if the page buffer is large, the output will be quite
- lengthy.
-
- The last routine is more for curiosity's sake and for tuning the
- size of the buffer. A call to PrintBufferStats will print the
- statistics for buffer use. The number of pages in use, the
- maximum number of pages allowed, the number of requests for a
- page (fetches), the number of fetches in which the page was in
- the buffer (hits) and the ratio of hits to fetches will be
- output. A hit ratio of much below 95% could signal a requirement
- to increase the buffer size. You really need to keep this number
- in perspective though. If you are walzing through a file in
- which none of the records are in the buffer, the hit ratio will
- be poor. Other operations will have higher hit ratios. At least
- it's there to use as desired.
-
- A recently added routine contained in the BTREE unit is
- NumberOfBTreeLevels which returns the number of levels currently
- in an index. If that number gets above 10 you are doing
- something rather strange and may run into stack problems. The
- cause of such an occurence is that you have a huge number of
- index entries (many thousands) or your index entries are large.
- The only entries over 4 bytes are strings. These types of
- indexes should use strings as small as possible. Don't use first
- name, last name, and middle initial as all one indexed filed
- unless there is a really good reason to (doubtful). Any string
- index entry over about 20 bytes is suspect. Remember what an
- index is for .. to store something to allow quick retrievals.
- The index is not designed to be the data file. A deep index will
- simply not be as fast as a shallow index. The smaller the index
- entry size, the shallower the index.
-
- 18
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
-
- Other Routines - I discussed above the routines which are
- directly applicable to a user who wants to use the indexing and
- data manipulation capabilities of TBTREE13. There are some lower
- level general routines such as those found in MATH.PAS and
- STRINGS.PAS which are used internally. Since they are fairly
- general in nature, they may have applicability for your other
- applications. I won't go into details on their inner workings
- here, but you have the source code, so you can look in the units
- to see if there is anything else useful for whatever you happen
- to be doing.
-
-
- Current Product Limitations
-
-
- The following limitations apply to TBTREE13. Only limitations
- inherent to TBTREE13 are discussed. DOS limits and practical limits
- like disk drive capacity are not discussed. These limitations may
- change (become less restrictive) in future additions.
-
-
- Number of Data Files - No Limit
-
- Number of Index Files Per Data File - No Limit
-
- Size of Data File - MAXLONGINT (2,147,483,647) Logical or
- Physical Records (whichever comes first)
- or disk space (the real limit!!)
-
- Size of Index File - MAXLONGINT Physical Records. The real
- restriction is going to be stack size. I
- use a recursive routine to traverse the
- BTrees. Recursion is very efficient for
- this but has the problem that each level
- keeps a couple of thousand bytes on the
- stack. It is very important to use a
- large stack size if indexes are going to
- be deep. Most indexes will not be more
- than 4 or 5 levels deep unless a HUGE
- number of records are involved or the
- keys in the indexes are large. Large is
- relative, but you need to try to keep
- entries below 20 bytes or so. This
- really only affects strings presently. I
- would like to know if anyone is running
- into problems with this as I can handle
- this iteratively rather than recursively
- which will make the problem disappear at
- the expense of some added coding
- complexity. I recently did a test to see
- to see how critical this is. I did a
- test using an index entry size of 100
- bytes (100 byte strings). I entered over
- 27000 records before running out of disk
- space (took over one day on my archaic
- 4.77 MHz computer!). I had somewhere in
- the area of 15 levels which is more than
-
- 19
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- you should reasonably have. I also used
- a 32K stack (64K being the maximum
- allowed). Anyway, I couldn't blow it up
- so I don't think you will either!
-
- Size of Page Buffer - Limited by heap space. I have not used
- this with any type of heap extender or
- other program which uses other memory
- (extended, expanded) for the heap. I
- would enjoy seeing feedback if anyone
- tries it. If I need to do something to
- facilitate use of a heap extender let me
- know and I'll try to accomodate.
-
- Number of Files Open At Once - Actual number set by user, but
- user does not have to worry
- about keeping track of files
- used by TBTREE13 since it opens
- and closes files as required.
-
- Size of Key (Index Entry) - 494 bytes. The present
- restriction is really 256 bytes
- since the only composite data type
- allowed in the indexes is the
- string type. Realistically, you
- do not want to come close to this
- number of bytes per entry or your
- index will be much deeper than
- desired.
-
- Size of Data Record - 65520 bytes since this is the limit
- for structured (composite) types in
- Turbo Pascal 4.0.
-
- Max File Name Length - 80 characters including drive
- designator, path, file name and
- extension.
-
- Logical List Limit - Presently, 99 lists can be exist at one
- time. This is probably about 95 more than ever required.
-
- Number Of Sort Fields - Unlimited
-
- Future Enhancements / Directions
-
- One of the advantages of using the shareware concept to distribute
- this software is that I can be very flexible and can tailor this
- product to the needs of users. I am soliciting as much feedback as
- possible. I need to know if there is interest in a product such as
- this and also need to know what is really important to users. I hope
- to be able to accomplish the following in the future:
-
- I may change the BTREE.PAS unit to include a binary search routine
- internally to speed the finding of key matches. Presently I use a
- linear search within each node (index page). This will increase
- performance somewhat. It will be totally transparent to users.
- So far, it is faster than everyone imagined, so I have gotten no
- negative feedback concering performance.
-
- 20
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
-
- I may change the delete routines in the BTREE.PAS unit to force
- combining of nodes when each is half full or less. Presently, a
- node is not deleted until the last entry is deleted. This will add
- a little bit of performance on queries and will reduce the size of
- the indexes if deletes are done often, but will otherwise be
- transparent.
-
- I may change the BTREE.PAS unit so that I do not use recursion to
- walk the tree. As mentioned earlier, I have done a test to see if
- anyone will have problems, and I really don't foresee any.
-
- One area which may prove slow if a data file becomes very large is
- that of finding an unused record. Since I want to reuse holes in
- the file prior to expanding the file I do a linear search to find
- holes. I walk through the bitmap looking for a logical record
- marked unused. I do the same for index files looking for unused
- physical records. On a large file this takes a little time. I
- may add the capability to make reuse optional. In other rwords
- the user can turn it on and off. Feedback in this area is
- appreciated. So far, I have not received any negative feedback on
- this.
-
- The biggest change that I will probably implement in the future is
- a unit which traps errors and sets an error flag.
-
- Version Release Notes
-
- Version 1.1
-
- 1. Added capability to do a search of an index on a range of values.
-
- 2. Added capability to do partial string match searches (strings
- only). Added one routine to the BTREE unit and several routines to
- the COMPARE unit and one type declaration to the NUMBERS unit to
- facilitate this change.
-
- 3. Corrected bug in BTREE unit. Documentation stated that duplicate
- entries for a given logical record number and field value were
- prevented from being entered. This was not previously the case
- although that is now true.
-
- 4. Added a routine (LOGICAL unit) to build a list of all current
- records for a file (all valid ie not deleted records). This is
- essential for rebuilding indexes or accessing data without an index.
- In actuality, a user could have built this routine, but it was not
- intuitively obvious.
-
- 5. Added the capability to delete an entry from an existing logical
- record list.
-
- 6. Divided BTREE.PAS into BTREE.PAS and two include files :
- BTREE1.INC and BTREE2.INC.
-
- 7. Added several more examples.
-
- Version 1.2
-
-
- 21
-
-
-
- TBTREE13 Copyright (c) 1988 Dean H. Farwell II
-
-
- 1. Added Sorting capabilities.
-
- 2. Added several routines to LRECLIST.PAS to facilitate sorting. I
- also made a couple of internal cosmetic changes to LRECLIST.PAS.
-
- 3. Added one example.
-
- 4. Removed CalculateRecordSize and several unused data types from
- LOGICAL.PAS. These were for future projects and now are now targeted
- for TBASE.
-
- Version 1.3
-
- 1. Added SETOPS.PAS which provides Intersection, Union and
- Difference.
-
- 2. Added one example.
-
- 3. MATH.PAS redone in assemble language. See listing for details and
- credit to the author.
-
- 4. Added one routine (FindLrInList) to LRECLIST.PAS.
-
- 5. In LOGICAL.PAS changed MAXDATASIZE to 65520 since this is the
- maximum size for a structured type in Turbo Pascal 4.0.
-
- 6. TP4PRINT.PAS now prints the file creation date as part of the
- header.
-
- 7. Added NumberOfBTreeLevels routine to the BTREE unit.
-
- 8. Added StoreNewLogicalRecord routine to LOGICAL.PAS. This makes it
- easier to store new records. You no longer need to find the next
- available logical record number using the FirstUnUsedRecord
- routine located in FILES.PAS. StoreNewLogicalRecord does this for
- you and returns the associated logical record number.
-
-
-
- Turbo Pascal is a registered trademard of Borland International
- CompuServe is a registered trademark of CompuServe Incorporated
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- 22
-
-
-