home *** CD-ROM | disk | FTP | other *** search
- (* TBTree13 Copyright (c) 1988 Dean H. Farwell II *)
-
- unit Sort;
-
- (****************************************************************************)
- (* *)
- (* S O R T R O U T I N E S *)
- (* *)
- (****************************************************************************)
-
-
- (* This unit contains the data types and routines required to sort logical
- record lists. A logical record list can be sorted on any number of
- fields. This unit technically violates the object oriented design
- techniques used throughout TBTREE because it directly manipulates an
- object ( LrList) external to this unit. This is done to optimize
- performance. To alleviate this, I could combine this unit with the
- LRECLIST unit. I opted against this since sorting is considered by
- most people to be a separate type of operation than the operations
- performed by the LRECLIST unit. The decision really doesn't affect
- anything as far as the user is concerned.
-
- Sorting using this unit is generally straighforward. To sort a logical
- record list takes two steps. The first is to build a sort field list, a
- list containing information on each of the fields to sort on. The second
- step is to call SortList with the correct parameters. Once the sort is
- completed you should call DestroySortList which will return the space
- used to store the sort field list. This last step does not need done if
- you intend to sort on the same fields at a later time.
-
- Step 1 - Call AddFieldToSortList for each field to sort on. In other words
- if you are sorting on three fields, you must call AddFieldToSortList three
- times. When calling the routine you must supply a sort field variable (a
- pointer) which is equal to NIL for the first call to the routine. You
- must also supply the byte position of the field within the logical record
- and the ValueType of the field. Again, for the first call when building a
- list sPtr must be NIL. The routine must be called once for each field.
- The field passed in the first call has the highest sort precedence. The
- field passed in as the next field will have the next highest precedence etc.
- The only tricky part is determining the byte position of a field within a
- logical record. Probably the only way to do it is manually determine
- how much room is taken be the preceding fields. An easy way to store it
- is to have a record which contains the position for each field. For
- example:
-
- MyRecord = record
- name : String[20];
- age : Byte;
- jobTitle : JobType; { assume its an enumerated type }
- end;
-
- MyRecordPos = record
- namePos : Byte;
- agePos : Byte;
- jobTitlePos : Byte;
- end;
-
- var myPos : MyRecordPos;
-
- procedure SetPosMyRecord;
-
- begin
- myPos.namePos := 1;
- myPos.agePos := 22; { note its not 21 since the previous field
- is 21 bytes (1 for the length) }
- myPos.jobTitlePos := 23;
- end;
-
- var sPtr : SortFieldList;
-
- procedure BuildMyList;
-
- begin { sort on age then name
- sPtr := NIL; { very important !! }
- AddFieldToSortList(sPtr,myPos.agePos,BYTEVALUE);
- AddFieldToSortList(sPtr,myPos.namePos,STRINGVALUE):
- end;
-
-
- This method method will make it easier if you ever change you logical
- record definitions.
-
- Step 2 - Now that you have built the list simply call SortList. The
- call is simple. The list to sort, the file name of the data
- file, the sort field list and the size of the data record are all passed
- in. The name of the data file and the logical record size is very
- important.
-
- Once the call is made the list will be sorted and returned. One thing
- is important to realize. Sorting takes time. Although the calls are
- simple to the user, there is a great deal of work going on behind the
- scenes. Sort times will very from seconds to hours depending on the
- size of the list and to a lesser extent the size of the logical
- records and the type of fields. I have included an example program
- which can be used as desired.
-
-
- note - I used the QuickSort and BubbleSort algorithms found in
- the book Data Structures and Algorithms by Aho, Hopcroft and
- Ullman. I took the liberty of making a few changes to speed
- them up for this particular application. Some of the coding
- is not as straight forward as desired, but this was done for
- the sake of efficiency. By doing some tuning I managed to
- decrease sorting times by as much as 30%. This is especially
- true for lists which are partially sorted (already sorted
- on the first sort field). The routines work which is
- the goal anyway. *)
-
- (* Version Information
-
- Version 1.1 - Not Applicable. This unit was introduced in version 1.2
-
- Version 1.2 - Added this entire unit.
-
- Version 1.3 - No Changes *)
-
- (*\*)
- (*////////////////////////// I N T E R F A C E //////////////////////////////*)
-
- interface
-
- uses
- Compare,
- FileDecs,
- Files,
- Logical,
- LRecList,
- Page;
-
- type
- SortFieldPosition = DataSizeRange; (* position of sort field (bytes
- within the data record *)
-
- SortFieldList = ^SortField; (* used to form a linked list containing info
- about sort fields *)
-
- SortField = record (* used to keep info about one
- sort field *)
-
- pos : SortFieldPosition; (* byte pos of sort field relative
- to the beginning of the data
- record *)
- vType : ValueType; (* type of the field *)
- next : SortFieldList;
- end;
-
- (*\*)
- (* This routine will take a previously used list and delete all entries. sPtr
- will end up being NIL after the call. It is important to call this routine
- rather than just setting sPtr to NIL in order to return the heap space used
- by the list *)
-
- procedure DestroySortList(var sPtr : SortFieldList);
-
-
- (* This routine will add one sort field to the end of a sort field list. The
- byte position of the field within the data record and the type of the field
- must be supplied. The pointer to the sort field list must also be supplied.
- The list is kept in the order of entry. Therefore, the first entry should
- be the first field to sort on, the second field should be the next field to
- sort on if the first fields in two or more records contain the same value
- etc. *)
-
- procedure AddFieldToSortList(var sPtr : SortFieldList;
- pos : SortFieldPosition;
- vType : ValueType);
-
-
- (* This routine sorts a logical record list in ascending order using the sort
- field list passed in a a parameter. The file name for the data file and the
- size ot the logical records (data records) must also be passed in. lrlst is
- passed in unsorted and is returned sorted in ascending order. If the caller
- wants to traverse the list in descending order, call this routine and
- traverse the list in reverse order.
-
- note - prior to sorting, this routine checks the heap to see if there is
- sufficient heap space to perform the sort. The routine requires 2 blocks
- of heap memory each equal to the logical record size of the data records
- (passed in as size). If there is not enough memory, the list is returned
- unsorted. *)
-
- procedure SortList(var lrLst : LrList; (* list to sort *)
- fName : FnString; (* data file name *)
- sPtr : SortFieldList; (* list holding sort fields *)
- size : DataSizeRange); (* size of data record
- (logical record) *)
-
- (*\*)
- (*///////////////////// I M P L E M E N T A T I O N /////////////////////////*)
-
- implementation
-
- const
- BUBBLELIMIT = 7; (* once a list has less than BUBBLELIMIT items
- it will be sorted using a Bubble Sort *)
-
- (* This routine will take a previously used list and delete all entries. sPtr
- will end up being NIL after the call. It is important to call this routine
- rather than just setting sPtr to NIL in order to return the heap space used
- by the list *)
-
- procedure DestroySortList(var sPtr : SortFieldList);
-
- var
- tempPtr : SortFieldList;
-
- begin
- while sPtr <> NIL do
- begin
- tempPtr := sPtr^.next;
- Dispose(sPtr);
- sPtr := tempPtr;
- end;
- end; (* end of DestroySortList routine *)
-
- (*\*)
- (* This routine will add one sort field to the end of a sort field list. The
- byte position of the field within the data record and the type of the field
- must be supplied. The pointer to the sort field list must also be supplied.
- The list is kept in the order of entry. Therefore, the first entry should
- be the first field to sort on, the second field should be the next field to
- sort on if the first fields in two or more records contain the same value
- etc. *)
-
- procedure AddFieldToSortList(var sPtr : SortFieldList;
- pos : SortFieldPosition;
- vType : ValueType);
-
- var
- tempPtr : SortFieldList;
-
- begin
- if sPtr = NIL then
- begin
- New(sPtr);
- tempPtr := sPtr;
- end
- else
- begin
- tempPtr := sPtr;
- while tempPtr^.next <> NIL do
- begin
- tempPtr := tempPtr^.next;
- end;
- New(tempPtr^.next);
- tempPtr := tempPtr^.next;
- end;
- tempPtr^.pos := pos;
- tempPtr^.vType := vType;
- tempPtr^.next := NIL;
- end; (* end of AddFieldToSortList routine *)
-
-
- (*\*)
- (* This routine sorts a logical record list in ascending order using the sort
- field list passed in a a parameter. The file name for the data file and the
- size ot the logical records (data records) must also be passed in. lrlst is
- passed in unsorted and is returned sorted in ascending order. If the caller
- wants to traverse the list in descending order, call this routine and
- traverse the list in reverse order.
-
- note - prior to sorting, this routine checks the heap to see if there is
- sufficient heap space to perform the sort. The routine requires 2 blocks
- of heap memory each equal to the logical record size of the data records
- (passed in as size). If there is not enough memory, the list is returned
- unsorted. *)
-
- procedure SortList(var lrLst : LrList; (* list to sort *)
- fName : FnString; (* data file name *)
- sPtr : SortFieldList; (* list holding sort fields *)
- size : DataSizeRange); (* size of data record
- (logical record) *)
- type
- DataArrayPtr = ^DataArray;
- DataArray = Array[DataSizeRange] of Byte;
-
- var
- datPtr1,
- datPtr2 : DataArrayPtr;
- cnt : LrNumber;
-
- (*\*)
- (* This routine will swap the position of two entries in a logical record
- list. To function properly, the lrPos1 must be the relative position of
- entry one and lrPos2 must be the relative position of entry two.
- Relative position means position in the list. If a list has 100 entries,
- the first entry is number 1 and the last entry is number 100. If the
- second and third entries were to be swapped you should pass in 2 for
- lrPos1 and 3 for lrPos2. *)
-
- procedure SwapLr(lrPos1 : LrNumber;
- lrPos2 : LrNumber);
-
- var
- pgPos1,
- pgPos2 : LRArrayRange;
- pr1,
- pr2 : PrNumber;
- tempLr1,
- tempLr2 : LrNumber;
-
- begin
- pr1 := DesiredPage(lrPos1);
- pgPos1 := DesiredPosition(lrPos1);
- pr2 := DesiredPage(lrPos2);
- pgPos2 := DesiredPosition(lrPos2);
- if lrLst.currPage <> pr1 then
- begin
- FetchPage(lrLst.fName,pr1,lrLst.page);
- lrLst.currPage := pr1;
- end;
- tempLr1 := lrLst.lrArray[pgPos1];
- if lrLst.currPage <> pr2 then
- begin
- FetchPage(lrLst.fName,pr2,lrLst.page);
- lrLst.currPage := pr2;
- end;
- tempLr2 := lrLst.lrArray[pgPos2];
- lrLst.lrArray[pgPos2] := tempLr1;
- if lrLst.currPage <> pr1 then
- begin
- StorePage(lrLst.fName,lrLst.currPage,lrLst.page);
- FetchPage(lrLst.fName,pr1,lrLst.page);
- lrLst.currPage := pr1;
- end;
- lrLst.lrArray[pgPos1] := tempLr2;
- if (lrLst.count > LRARRAYSIZE) then
- begin
- StorePage(lrLst.fName,lrLst.currPage,lrLst.page);
- end;
- end; (* end of SwapLr routine *)
-
- (*\*)
- (* This routine will return the logical record number corresponding to the
- position (lrPos) in the logical record list. This routine allows
- retrieval using the linked list as a psuedo array. *)
-
- function GetLrUsingPosition(lrPos : LrNumber) : LrNumber;
-
- var
- pgPos : LRArrayRange;
- pr : PrNumber;
-
- begin
- pr := DesiredPage(lrPos);
- pgPos := DesiredPosition(lrPos);
- if lrLst.currPage <> pr then
- begin
- FetchPage(lrLst.fName,pr,lrLst.page);
- lrLst.currPage := pr;
- end;
- GetLrUsingPosition := lrLst.lrArray[pgPos];
- end; (* end of GetLrUsingPosition routine *)
-
- (*\*)
- (* This routine will compare logical record lr1 with logical record lr2.
- The routine will LESSTHAN if lr1 is less than lr2 for the sort fields in
- sPtr. It will return EQUALTO if the records are equal for those sort
- fields. It will return GREATERTHAN if lr1 is greater than lr2 for those
- sort fields. *)
-
- function CompareRecords(lr1 : LrNumber;
- lr2 : LrNumber) : Comparison;
-
- var
- tempPtr : SortFieldList;
- done : Boolean;
- result : Comparison;
-
- begin
- GetALogicalRecord(fName,lr1,datPtr1^,size);
- GetALogicalRecord(fName,lr2,datPtr2^,size);
- tempPtr := sPtr;
- done := FALSE;
- while not done do
- begin
- result := CompareValues(datPtr1^[tempPtr^.pos],
- datPtr2^[tempPtr^.pos],
- tempPtr^.vType);
- case result of
- LESSTHAN :
- begin
- done := TRUE;
- CompareRecords := LESSTHAN;
- end;
- GREATERTHAN :
- begin
- done := TRUE;
- CompareRecords := GREATERTHAN;
- end;
- EQUALTO :
- begin
- tempPtr := tempPtr^.next;
- if tempPtr = NIL then
- begin
- done := TRUE;
- CompareRecords := EQUALTO;
- end;
- end;
- end; (* end of case statement *)
- end;
- end; (* end of CompareRecords routine *)
-
- (*\*)
- (* This routine will compare logical record lr with pivot record. The
- pivot record must be loaded into the location pointed to by datPtr2
- prior to calling this routine. Aside from the above, this routine
- works much like the CompareRecords routine above. *)
-
- function CompareRecordToPivot(lr : LrNumber) : Comparison;
-
- var
- tempPtr : SortFieldList;
- done : Boolean;
- result : Comparison;
-
- begin
- GetALogicalRecord(fName,lr,datPtr1^,size);
- tempPtr := sPtr;
- done := FALSE;
- while not done do
- begin
- result := CompareValues(datPtr1^[tempPtr^.pos],
- datPtr2^[tempPtr^.pos],
- tempPtr^.vType);
- case result of
- LESSTHAN :
- begin
- done := TRUE;
- CompareRecordToPivot := LESSTHAN;
- end;
- GREATERTHAN :
- begin
- done := TRUE;
- CompareRecordToPivot := GREATERTHAN;
- end;
- EQUALTO :
- begin
- tempPtr := tempPtr^.next;
- if tempPtr = NIL then
- begin
- done := TRUE;
- CompareRecordToPivot:= EQUALTO;
- end;
- end;
- end; (* end of case statement *)
- end;
- end; (* end of CompareRecordToPivot routine *)
-
- (*\*)
- (* This routine performs a standard bubble sort on elements of a list
- ranging from i to j where i is the location of the first element
- and j is the location of the last element *)
-
- procedure BubbleSort(i : LrNumber;
- j : LrNumber);
-
- var
- tempLr1,
- tempLr2,
- ii,
- jj : LrNumber;
-
- begin
- for ii := i to j - 1 do
- begin
- for jj := j downto ii + 1 do
- begin
- tempLr1 := GetLrUsingPosition(jj);
- tempLr2 := GetLrUsingPosition(jj-1);
- if CompareRecords(tempLr1,tempLr2) = LESSTHAN then
- begin
- SwapLr(jj,jj-1);
- end;
- end;
- end;
- end; (* end of BubbleSort routine *)
-
- (*\*)
- (* This routine will the logical record number for the record which is the
- pivot record. The algorithm has been altered to enhance performance.
- Once it is determined that not all records are equal and therefore do
- not need sorted, the first and middle records are compared rather than
- the first and first one that differs. This will enhance performance. *)
-
- function FindPivot(i : LrNumber;
- j : LrNumber) : LrNumber;
-
- var
- k,
- mid,
- tempLrMid,
- tempLr1,
- tempLr2 : LrNumber;
-
- begin
- tempLr1 := GetLrUsingPosition(i);
- for k := i + 1 to j do
- begin (* try to find record not equal to the first one *)
- tempLr2 := GetLrUsingPosition(k);
- if CompareRecords(tempLr1,tempLr2) <> EQUALTO then
- begin
- mid := (i + j) div 2; (* middle record *)
- tempLrMid := GetLrUsingPosition(mid);
- case CompareRecords(tempLr1,tempLrMid) of
- LESSTHAN : FindPivot := tempLrMid;
- EQUALTO : begin
- case CompareRecords(tempLr1,tempLr2) of
- LESSTHAN : FindPivot := tempLr2;
- GREATERTHAN : FindPivot := tempLr1;
- end;
- end;
- GREATERTHAN : FindPivot := tempLr1;
- end;
- Exit;
- end;
- end;
- FindPivot := 0; (* different keys not found *)
- end; (* end of FindPivot routine *)
-
- (*\*)
- function Partition(i : LrNumber;
- j : LrNumber;
- pivotLr : LrNumber) : LrNumber;
-
- var
- l,
- r,
- tempLr1 : LrNumber;
-
- begin
- l := i;
- r := j;
- GetALogicalRecord(fName,pivotLr,datPtr2^,size);
- while TRUE do
- begin
- tempLr1 := GetLrUsingPosition(l);
- while CompareRecordToPivot(tempLr1) = LESSTHAN do
- begin
- l := l + 1;
- tempLr1 := GetLrUsingPosition(l);
- end;
- tempLr1 := GetLrUsingPosition(r);
- while CompareRecordToPivot(tempLr1) <> LESSTHAN do
- begin
- r := r - 1;
- tempLr1 := GetLrUsingPosition(r);
- end;
- if l > r then
- begin
- Partition := l;
- Exit;
- end;
- SwapLr(l,r);
- end;
- Partition := l;
- end; (* end of Partition routine *)
-
- (*\*)
- (* This routine will recursively sort a list of logical record numbers
- using a combination of a quicksort and a bubble sort algorithm. The
- list to be is passed in as lrLst. fName is the name of the data file,
- sPtr is the pointer to the sort list, size is the size of the data file
- logical records and i and j are positions in the logical record to
- sort. *)
-
- procedure QuickSort(i : LrNumber;
- j : lrNumber);
-
- var
- k,
- pivotLr : LrNumber;
-
- begin
- pivotLr := FindPivot(i,j);
- if pivotLr <> 0 then
- begin
- k := Partition(i,j,pivotLr);
- if (k - i) < BUBBLELIMIT then
- begin
- BubbleSort(i,k-1);
- end
- else
- begin
- QuickSort(i,k-1);
- end;
- if (j - k) < BUBBLELIMIT then
- begin
- BubbleSort(k,j);
- end
- else
- begin
- QuickSort(k,j);
- end;
- end;
- end; (* end of QuickSort routine *)
-
-
- begin
- cnt := GetCountLr(lrLst);
- if cnt > 1 then
- begin
- if MaxAvail >= size then
- begin
- GetMem(datPtr1,size);
- if MaxAvail >= size then
- begin
- GetMem(datPtr2,size);
- QuickSort(1,cnt);
- FreeMem(datPtr1,size);
- end;
- FreeMem(datPtr2,size);
- end;
- end;
- end; (* end of SortList routine *)
-
-
- end. (* end of Sort Unit *)