-
-
- September 17, 1991 RPSORT Reference Page i
-
-
- Table of Contents
- -----------------
-
- Introduction 1
-
- How to Display RPSORT Built-in Syntax Screens 1
-
- Syntax Conventions 1
-
- How To Exit Quickly From RPSORT 2
- Quick Exit From RPSORT When Output Goes To The Standard Output 2
-
- General Description of How RPSORT Does A Sort 3
-
- Summary Of RPSORT Syntax 4
-
- Details Of RPSORT Syntax 5
-
- Options For Specifying Input And Output Files 5
- Using RPSORT As A Filter 5
- Specifying Input And Output Files Directly 5
-
- Specifying Lines Or Fixed Length Records 6
- Lines Are The Default 6
- /Fnnnn - Specifying Fixed Length Records 6
-
- Detailed Description Of Sort Key Types Supported By RPSORT 7
- Sort Keys That Are Character Strings 7
- Default Case Insensitive Character Strings 7
- ASCII (Case Sensitive) Character Strings 7
- C Language Style Character Strings 7
- Turbo Pascal Style Character Strings 8
- Sort Keys That Are Binary Numbers 8
- Signed Binary Integers 8
- Unsigned Binary Integers 9
- BASICA And GWBASIC Floating Point Numbers 9
- Turbo Pascal Real Numbers 9
- Math Co-Processor Floating Point Numbers 10
-
- Defining The Desired Sort Sequence To RPSORT 11
- Standard Defaults For Sort Keys 11
- Switches Which Set Defaults For Sort Keys 11
- /A - Sort all Text Keys in ASCII (Case Sensitive) Sequence 11
- /C - Make All Text Keys Be C Language Strings 11
- /P - Make All Text Keys Be Turbo Pascal Strings 11
- /R - Sort All Keys In Reverse (Descending Order) 11
-
-
-
- September 17, 1991 RPSORT Reference Page ii
-
-
- Table of Contents (continued)
- -----------------------------
-
- Defining Sort Keys 12
- Sort Key Definition Syntax 12
- Col - The Start Column For A Key 12
- Len - The Length Of A Key 12
- R - Sorting The Key In Reverse (Descending) Order 12
- A - Sorting The Key In ASCII (case sensitive) Order 12
- C - Specifying A C Language Type String 12
- P - Specifying A Turbo Pascal Type String 12
- I - Specifying A Signed Binary Integer 13
- U - Specifying An Unsigned Binary Integer 13
- F - Specifying A Math Co-processor Type Floating Point Number 13
- M - Specifying A BASIC Interpreter Type Floating Point Number 13
- T - Specifying A Turbo Pascal Type Real Number 13
- List Of Various Compiler And Interpreter Numeric Data Types 14
-
- Miscellaneous Switches 15
- /Q - Suppressing Copyright And Completion Messages 15
- /Eerrfile - Directing Error Messages To A File 15
- /B - Ignoring Control Breaks Entered From The Keyboard 15
- /D - Delete Records Whose Sortkeys Duplicate Previous Record 15
- /N - Delete Null Lines 16
- /Td - Designate Drive To Be Used For Temporary Files. 16
- /Z - Ignore Ctrl-Z In Text File. Use Entire Physical File. 16
-
- Efficiency Considerations 17
- Do ASCII Sort If Text Keys Are All Upper Case Or All Lower Case 17
- How Memory Size Affects RPSORT Speed And Need For Temp Disk Space 18
- Using CHKDSK Or MEM To Determine Free Memory 18
- Sorts Requiring No Merge Phase And No Temporary Files 19
- Sorts Requiring One Merge Phase And One Temporary File 20
- Sorts Requiring Two Or More Merge Phases And Two Temporary Files 21
- Deciding What Drives To Put Temporary And Output Files On 21
- Buffers Command In Your Config.Sys 22
- Using Disk Cache Programs 22
-
- Special Situations 23
- Sorting Files That Contain Tabs 23
- Writing The Output To The File That Contained The Input 24
-
- Two Incompatibilities With The DOS SORT 24
-
- Error Messages 25
- Error Numbers And Return Codes 25
- Syntax Error Messages 26-30
- DOS Version Before 2.0 Message 31
- Insufficient Memory Messages 31
- Line Or String Too Long Messages 31
- Input/Output Error Messages 32-33
- Never Should Happen Error Messages 33
-
-
-
- September 17, 1991 RPSORT Reference Page 1
-
-
- Introduction
-
- RPSORT is a sort utility that greatly improves upon the features and the
- performance of the sort utility distributed with Microsoft DOS. First,
- RPSORT does everything that the DOS SORT does. Virtually any command
- that works with DOS SORT works with RPSORT and produces the same result.
-
- But RPSORT does much more. It can sort very large files and supports
- multiple sort keys. It is extremely fast. I do not know of another sort
- utility that can outspeed it.
-
- RPSORT sorts text files. These consist of lines each ended by CRLF (i.e.
- a carriage return and line feed). RPSORT also sorts files of fixed length
- records such as those produced by many BASIC, Pascal and C programs.
-
- RPSORT supports numerous sort key types including regular text keys, C
- language strings, Turbo Pascal strings, signed and unsigned binary
- integers of any length and several types of binary floating point numbers.
-
- RPSORT can delete null lines (consisting only of a CRLF). It can also
- delete records/lines whose sort keys duplicate those in a previous
- record/line.
-
- A summary of RPSORT syntax appears on page 4 of this document.
- A comprehensive list of RPSORT examples can be found in the file
- EXAMPLES.DOC.
-
-
- How to Display RPSORT Built-in Syntax Screens
-
- Enter the RPSORT command with no parameters, to see RPSORT's built-in
- syntax screens. Use the Page Down and Page Up keys to negotiate the
- screens. Press the Esc key when you are finished viewing the syntax
- screens.
-
-
- Syntax Conventions
-
- . Items in square brackets ([]) are optional. Type the information inside
- the brackets but not the brackets themselves.
-
- . An item followed by an ellipsis (...) may be repeated several times.
-
- . Capital letters (A thru Z) and special characters (/ and ? and +) should
- be entered as they appear in the syntax except that you may enter lower
- case letters in place of the capital letters.
-
- . Words spelled out in lower case letters describe an item you are to enter.
- For example, where you see the word "inputfile" in the syntax, enter the
- path (if necessary) and the name of an input file. File names and other
- parameters may be entered in lower or upper case as you choose.
-
-
-
- September 17, 1991 RPSORT Reference Page 2
-
-
- How To Exit Quickly From RPSORT
-
- RPSORT is very fast and can sort files containing hundreds of kilobytes
- and thousands of records in just a few seconds (I am assuming a 286 CPU
- and a hard disk). However, if you are sorting a really large file (say
- 20 megabytes) then the execution time could be several minutes.
- If you start such a sort and then realize that you specified the wrong
- sort key(s), you can terminate the sort immediately as follows:
-
- . Enter a Ctrl-Break (i.e. hold down the Ctrl key and press the Break key).
-
- . Within a very few seconds, RPSORT will respond with the message:
-
- Do you wish to quit RPSORT? Press Esc to quit, any other key to
- continue.
-
- . If you do indeed wish to terminate the sort, press the Esc key. RPSORT
- will clean up properly by deleting any temporary files as well as any
- partial output file and then it will terminate.
-
- . If you decide you don't want to terminate the sort after all, press any
- key but the Esc key and the sort will continue.
-
- After terminating the sort, as above, you can then re-enter the RPSORT
- command with the correct parameters.
-
- There might be other reasons to terminate the sort. Perhaps you need the
- computer for some other purpose and can't wait for the sort to finish.
- In such cases, be aware that any work done by RPSORT will be lost. If
- you do the sort later on you will have to start it from the beginning.
-
- If you want RPSORT to ignore any control break, use the /B switch. See
- "Miscellaneous Switches" on page 15.
-
-
- Quick Exit From RPSORT When Output Goes To The Standard Output
-
- If RPSORT is writing its output to the standard output, as in:
-
- RPSORT <inputfile >outputfile
-
- then the termination proceeds a little differently:
-
- . As above you enter Ctrl-Break.
-
- . RPSORT simply terminates the sort within a few seconds without giving
- you a chance to change your mind. As above it deletes any temporary
- files but it does not delete the output file. It can't delete the
- latter because it doesn't know the name of a redirected output file.
-
-
-
- September 17, 1991 RPSORT Reference Page 3
-
-
- General Description of How RPSORT Does A Sort
-
- When you execute RPSORT you specify in the command line:
-
- . The source of the input data (one or more files) and the destination
- for the sorted output (either a file or the screen).
-
- . Whether the input is lines terminated by CRLF or fixed length records.
-
- . Optionally, you define one or more sort keys. These indicate:
-
- . The location of the key in the line or record.
- . The length of the key.
- . The type of key. Any of several string or numeric types.
-
- If there are no sort key definitions, RPSORT assumes a default
- character string key consisting of the entire line or record.
-
- The sort process involves the following:
-
- . RPSORT compares two records/lines at a time to determine which comes
- first and swaps them, if necessary, to put them in the right sequence.
- The comparisons continue until the entire input has been sequenced.
-
- . RPSORT uses the quicksort algorithm (devised by C. A. R. Hoare and
- published in 1961) to determine which records/lines to compare. This
- algorithm is very good at doing the sort with few comparisons on average.
-
- . In comparing two records/lines, RPSORT compares the sort keys in the
- same sequence as their appearance in the command line until it finds an
- unequal compare or runs out of sort keys.
-
- . If all the sort keys are equal for two records, RPSORT breaks the tie
- by comparing the locations of the two records in the input. This
- maintains any inherent order in the file (i.e. if two or more records
- have identical sort keys then their order among themselves in the
- sorted output will be the same as it was in the input).
-
- . For files consisting of lines, some of the lines may be:
-
- . Too short to contain any part of a given sort key. Then, the sort
- key is taken to be a null string and sorts lower than anything else.
-
- . Or too short to contain the whole sort key. Then, the key comparison
- is done for the length of the shorter key. If the keys are equal for
- that length, the shorter key sorts low.
-
- . If the input file(s) are small enough to fit in the available memory
- space the sort is done in one pass in memory.
-
- . If the input is too big to fit into memory, it is read in chunks and
- each chunk is sorted and written to a temp file. Then RPSORT uses one
- or more merge phases to combine the chunks into the sorted output file.
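
The steps above can be sketched in Python. This is only an illustration of
the general technique (sort chunks, then merge them), not RPSORT's own code.
The names make_key, sorted_runs and external_sort are hypothetical, case
folding with upper() only approximates the default sequence, and the sorted
chunks are kept in memory here rather than written to a temp file.

```python
import heapq
from itertools import islice

def make_key(line, keys):
    """Build a comparison key from (col, length) pairs; col is 1-based."""
    parts = []
    for col, length in keys:
        # Slicing past the end of a short line yields a shorter (or empty)
        # string, which Python orders before any longer string that starts
        # with the same characters.
        parts.append(line[col - 1:col - 1 + length].upper())
    return tuple(parts)

def sorted_runs(lines, keys, chunk_size):
    """Yield sorted chunks of at most chunk_size lines each."""
    numbered = enumerate(lines)  # the input position breaks ties
    while True:
        chunk = list(islice(numbered, chunk_size))
        if not chunk:
            return
        yield sorted((make_key(text, keys), pos, text) for pos, text in chunk)

def external_sort(lines, keys, chunk_size=1000):
    """Sort lines chunk by chunk, then merge; equal keys keep input order."""
    runs = list(sorted_runs(lines, keys, chunk_size))
    return [text for _key, _pos, text in heapq.merge(*runs)]
```

Because every entry carries its input position as the second tuple element,
two lines with identical keys come out in the order they went in, whichever
chunks they happened to fall into.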
-
-
-
- September 17, 1991 RPSORT Reference Page 4
-
- Usage: RPSORT [/Q] [/Eerrfile] [/]? [inputfile[+inputfile]]
- [outputfile] [/A] [/B] [/C] [/D] [/Fnnnn] [/N] [/P]
- [/R] [/Td] [/Z] [sort key defin. . .]
-
- Sort key defin syntax: /+ [col] [:len] [A] [C] [F] [I] [M] [P] [R] [T] [U]
-
-
- Summary Of RPSORT Syntax
-
- Input is one or more filespecs (including path if required) separated by
- plus signs. Output is a single filespec. Input filespec(s) must precede
- output filespec. Wildcard characters are not allowed. Input file(s) are
- sorted together into the single output file. For example:
-
- RPSORT IPFILE1.DAT+IPFILE2.TXT+C:\MYDIR\IPFILE3.DAT OPFILE.DAT
-
- RPSORT can also be used as a filter. For example:
-
- RPSORT <IPFILE >OPFILE
-
- By default, RPSORT assumes a text file with the entire line as a case
- insensitive sort key. This can be changed by some of the parameters below.
-
- /Q         suppresses copyright and success messages. Must be first parameter.
- /Eerrfile  specifies file to which error messages will go instead of the
-            screen. Should precede any parameter except /Q.
- /? or ?    displays built-in syntax screens.
- /A         does an ASCII sort. Case sensitive (lower case not equal upper case).
- /B         tells RPSORT to ignore any control break entered from the keyboard.
- /C         specifies C language style text keys (terminated by a binary zero).
- /D         deletes any record whose sortkeys duplicate those in a previous record.
- /Fnnnn     says that the input consists of fixed length records of nnnn bytes.
- /N         deletes any null lines (those consisting only of a CRLF sequence).
- /P         specifies Pascal style text keys (first byte is length of string).
- /R         specifies a reverse (descending order) sort.
- /Td        designates drive to be used for temp files instead of default drive.
- /Z         tells RPSORT to ignore Ctrl-Z in text file and use the entire file.
-
- The /R switch applies to all sort keys. The /A, /C and /P switches apply to
- all text sort keys. They can't be over-ridden for an individual sort key.
-
- A sort key definition starts with /+ and may include the following
- attributes. No spaces are allowed between the attributes:
-
- col   is starting column of this key. Col 1 is the first col in the record.
- :len  is the length of this key.
- A     does an ASCII (case sensitive) sort for the key.
- C     sorts this key as C language text key (terminated by a binary zero).
- F     sorts this key as an 80x87 floating point number. Len is 4, 8 or 10.
- I     sorts this key as a signed binary integer. This may be any length.
- M     sorts this key as a BASICA floating point number. Len is 4 or 8.
- P     sorts this key as Pascal text key (first byte is length of string).
- R     does a reverse (descending) sort for this key.
- T     sorts this key as a Turbo Pascal type "real" number. Len must be 6.
- U     sorts this key as an unsigned binary integer. This may be any length.
-
- Attributes F, I, M, P, T and U are only allowed for fixed length records.
-
-
-
- September 17, 1991 RPSORT Reference Page 5
-
-
- Details Of RPSORT Syntax
-
- There are three types of parameters:
-
- . Those that specify files (i.e. inputfile and outputfile).
- . Switches which consist of a slash and a letter plus possibly a file
- name, number or drive letter (e.g. /Q, /Eerrfile, /Fnnnn, /Td).
- . Sort key definitions each of which defines a single sort key.
-
- The parameters can be entered in any sequence except that:
-
- . The inputfile(s) must always precede the outputfile.
- . The /Q switch (see /Q - Suppressing Copyright And Completion Messages)
- must precede any other parameter.
- . The /Eerrfile switch (see /Eerrfile - Directing Error Messages To A
- File) should precede everything but /Q.
-
- Options For Specifying Input And Output Files
-
- Using RPSORT As A Filter
-
- RPSORT can be used as a filter which reads the standard input and
- writes to the standard output. For example:
-
- RPSORT <ipfile >opfile
-
- The standard output need not be redirected and can go to the screen.
- The standard input must be redirected to a file or piped from the
- output of another program. RPSORT will not accept an input file
- directly from the keyboard. If you take the input from the standard
- input then the output MUST go to the standard output.
-
- Specifying Input And Output Files Directly
-
- You can specify the input and output files directly. Input is one or
- more files separated by plus signs but output must be a single file.
- The filespecs may include a path. Wildcard characters are not
- allowed in any file name. All input files are combined and sorted
- together into the single output file. For example:
-
- RPSORT IPFILE1.DAT+IPFILE2.DAT+C:\MYDIR\IPFILE3.DAT OPFILE.DAT
-
- If the path and filename for the output filespec are the same as that
- for an existing file, the latter will be replaced by the output from
- RPSORT. If this is what you want, fine but if you don't want to lose
- the existing file then use a different name for the output.
-
-
-
- September 17, 1991 RPSORT Reference Page 6
-
-
-
- Specifying Lines Or Fixed Length Records
-
- Lines Are The Default
-
- By default, the file is assumed to consist of lines. A line is a
- sequence of characters terminated by CRLF. RPSORT also accepts the
- LFCR sequence as a line terminator. The lines may vary in length
- from null lines up to a maximum length of 32750. RPSORT will reject
- a file that contains a line longer than this.
-
- If the last record in an input file does not terminate with CRLF or
- LFCR, RPSORT will append these two characters and display a message
- informing you of its action.
-
- If the input is two or more files, RPSORT will, if necessary, append
- a CRLF to terminate the last line in each of the files. RPSORT never
- assumes that a line starting in one file continues in the next.
-
- Only character string sort keys are allowed in a file of lines.
- Binary numeric sort keys are not allowed.
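
As a rough illustration of the line rules above, the following Python sketch
splits raw file bytes into lines ended by either CRLF or LFCR and treats an
unterminated last line as if a terminator had been appended. The function
name split_lines is hypothetical. The text does not say whether the 32750
limit counts the terminator; this sketch assumes it counts only the text of
the line.

```python
MAX_LINE = 32750  # maximum line length stated in the text

def split_lines(data):
    """Split raw bytes into lines terminated by CRLF or LFCR."""
    lines = []
    start = 0
    i = 0
    while i < len(data) - 1:
        if data[i:i + 2] in (b"\r\n", b"\n\r"):
            lines.append(data[start:i])   # line text without its terminator
            i += 2
            start = i
        else:
            i += 1
    if start < len(data):
        # The last line has no terminator; accept it as a complete line.
        lines.append(data[start:])
    for line in lines:
        if len(line) > MAX_LINE:
            raise ValueError("line longer than 32750 bytes")
    return lines
```

A null line (a terminator with nothing before it) comes back as an empty
bytes object, so it still takes part in the sort unless it is filtered out.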
-
-
- /Fnnnn - Specifying Fixed Length Records
-
- A file of fixed length records contains records all of the same
- length. The /Fnnnn switch tells RPSORT that the records are fixed
- length and the value you enter for nnnn specifies the length. For
- example, /F65 tells RPSORT that the file consists of 65 byte records.
-
- Fixed length records need not end with a CRLF but if they do, those
- two bytes must be included in the length given by the /Fnnnn switch.
-
- The maximum length you may specify is 32750. RPSORT would reject
- /F32751.
-
- If the last record in the input is shorter than the length given in
- the /Fnnnn switch (i.e. the file length is not an exact multiple of
- nnnn), RPSORT ignores the last record and does not include it in the
- sorted output. RPSORT displays a message to inform you of its action.
-
- If the input consists of two or more files, RPSORT will skip a short
- last record in each of the input files. RPSORT never assumes
- that a record starting in one file continues in the next.
-
- All key types supported by RPSORT are allowed in a file of fixed
- length records.
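
The handling of a short last record can be pictured with a small Python
sketch. It is an illustration only; split_records is a hypothetical name, and
the lower bound of 1 on the record length is an assumption (the text states
only the upper bound).

```python
def split_records(data, reclen):
    """Return (records, leftover) for fixed length records of reclen bytes."""
    if not 1 <= reclen <= 32750:
        raise ValueError("record length must be between 1 and 32750")
    whole = len(data) // reclen
    records = [data[i * reclen:(i + 1) * reclen] for i in range(whole)]
    leftover = len(data) - whole * reclen  # bytes in a short last record
    return records, leftover
```

A non-zero leftover corresponds to the case where the file length is not an
exact multiple of nnnn and the trailing bytes are left out of the output.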
-
-
-
- September 17, 1991 RPSORT Reference Page 7
-
-
-
- Detailed Description Of Sort Key Types Supported By RPSORT
-
- Sort Keys That Are Character Strings
-
- Default Case Insensitive Character Strings
-
- This is the only sequence supported by the DOS SORT. The digits 0
- through 9 come before the letters. Lower case letters sort equal
- to upper case letters. Foreign letters, punctuation and currency
- symbols sort equal to their American English equivalents.
-
- ASCII (Case Sensitive) Character Strings
-
- The sequence is according to the ASCII value assigned to each
- character. This puts the digits 0 through 9 before any letters and
- puts all of the upper case letters before any of the lower case
- letters. Foreign letters, punctuation and currency symbols sort
- higher than any of the above.
-
- The ASCII value for each character is the code used internally by
- the computer to represent that character. An ASCII sort is the
- fastest possible sort because it requires no pre-processing of the
- characters.
-
- You can specify this type of sort key by using either the /A switch
- (see page 11) or the A attribute (see page 12).
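
The difference between the two sequences can be seen in a short Python
sketch. Folding to upper case stands in for the default case insensitive
sequence here; the treatment of foreign letters is not modelled, and the
sample words are made up for the illustration.

```python
words = ["pear", "Zebra", "apple", "42", "Mango"]

# Case insensitive comparison, approximated by folding to upper case.
folded = sorted(words, key=str.upper)

# ASCII (case sensitive) comparison: plain character code order.
ascii_order = sorted(words)

print(folded)       # digits first, then letters regardless of case
print(ascii_order)  # digits, then all upper case, then all lower case
```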
-
- C Language Style Character Strings
-
- C language strings are allocated some maximum length in your C
- program. This should be the length in the sort key definition.
-
- For example, if you define "char mystr[8]" in your C program then
- the compiler allocates 8 bytes and therefore the length specified
- to RPSORT should also be 8.
-
- The actual character string, however, may be shorter. C language
- strings are terminated by a binary zero if they do not fill the
- allocated space. Therefore, RPSORT takes the length of a C style
- string to be the lesser of:
-
- . The length attribute (or if absent the default length).
- . The length up to but not including the first binary zero.
-
- You can specify this type of sort key by using either the /C switch
- (see page 11) or the C attribute (see page 12).
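
The "lesser of" rule can be sketched in Python as follows. The function name
c_string_key and the sample records are hypothetical; col is 1-based as in
the sort key definition.

```python
def c_string_key(record, col, length):
    """Return the bytes of a C style string key inside a record."""
    field = record[col - 1:col - 1 + length]
    end = field.find(b"\x00")  # position of the first binary zero, if any
    return field if end == -1 else field[:end]
```

For a field declared as "char mystr[8]" starting in column 3, a stored value
of "cat" followed by a binary zero yields the key "cat", and whatever bytes
follow the zero are ignored.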
-
-
-
- September 17, 1991 RPSORT Reference Page 8
-
-
-
- Detailed Description Of Sort Key Types Supported By RPSORT (continued)
- Sort Keys That Are Character Strings (continued)
-
- Turbo Pascal Style Character Strings
-
- Turbo Pascal strings are allocated some maximum length in your
- Pascal program. This should be the length given in the sort key
- definition.
-
- For example, if you define string[8] in your Pascal program then
- the compiler allocates 9 bytes to the string and therefore the
- length specified to RPSORT should also be 9.
-
- The first byte in a Pascal string is a length byte. This contains
- a binary number which is the actual length of the string. The
- remaining bytes allow enough room for the longest possible string.
-
- The length must be between 2 and 256 inclusive. These limits
- correspond to string[1] and string[255] respectively.
-
- If RPSORT finds a length byte value, in the file, that is too large
- (i.e. greater than or equal to the specified length) it aborts.
- This would only occur if the sort key was incorrectly defined.
-
- You can specify this type of sort key by using either the /P switch
- (see page 11) or the P attribute (see page 12). This type of sort
- key is only allowed for fixed length records.
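
A minimal Python sketch of the rules above, assuming a 1-based col and a
length that includes the length byte. The name pascal_string_key is
hypothetical, and raising an exception stands in for the abort.

```python
def pascal_string_key(record, col, length):
    """Return the text bytes of a Turbo Pascal style string key."""
    if not 2 <= length <= 256:
        raise ValueError("length must be between 2 and 256")
    field = record[col - 1:col - 1 + length]
    actual = field[0]  # the length byte holds the actual string length
    if actual >= length:
        raise ValueError("length byte too large for the specified length")
    return field[1:1 + actual]
```

For a string[8] field (9 bytes) holding "cat", the length byte is 3 and only
the three bytes after it form the key; the remaining five bytes are unused
space.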
-
-
- Sort Keys That Are Binary Numbers
-
- Signed Binary Integers
-
- A signed binary integer is a two's complement binary integer that
- is stored low byte first, high byte last.
-
- This is the natural way for an 80X86 CPU to store binary integers.
- As far as I know, all language compilers and interpreters for IBM
- PCs and clones store them this way.
-
- RPSORT allows signed binary integer sort keys to be any length from
- 1 up to the length of the record.
-
- You can specify this type of sort key by using the I attribute (see
- page 13). This type of sort key is only allowed for fixed length
- records.
-
-
-
- September 17, 1991 RPSORT Reference Page 9
-
-
-
- Detailed Description Of Sort Key Types Supported By RPSORT (continued)
- Sort Keys That Are Binary Numbers (continued)
-
- Unsigned Binary Integers
-
- Unsigned binary integers, just like signed binary integers, are
- stored low byte first, high byte last.
-
- RPSORT allows unsigned binary integer sort keys to be any length
- from 1 up to the length of the record.
-
- You can specify this type of sort key by using the U attribute (see
- page 13). This type of sort key is only allowed for fixed length
- records.
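
Both integer key types can be illustrated with one Python helper, since they
differ only in how the high bit is read. The name int_key is hypothetical;
Python's int.from_bytes accepts a field of any length, which matches the
"any length" rule.

```python
def int_key(record, col, length, signed):
    """Decode a low-byte-first binary integer key from a record."""
    field = record[col - 1:col - 1 + length]
    return int.from_bytes(field, byteorder="little", signed=signed)

two_bytes = b"\xff\xff"
print(int_key(two_bytes, 1, 2, signed=True))   # two's complement reading
print(int_key(two_bytes, 1, 2, signed=False))  # unsigned reading
```

The same two bytes sort as the lowest-but-one negative neighbour of zero
under the I attribute and as the largest 16-bit value under the U attribute,
which is why the key type has to match how the program wrote the data.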
-
- BASICA And GWBASIC Floating Point Numbers
-
- RPSORT supports binary floating point numbers as defined by the
- BASIC interpreter (prior to MS-DOS v5.0) and older versions of
- Microsoft QuickBASIC (prior to QB v4.0). The lengths that RPSORT
- will accept for these numbers are:
-
- Length = 4 for single precision numbers.
- Length = 8 for double precision numbers.
-
- You can specify this type of sort key by using the M attribute (see
- page 13) and one of the lengths listed above. This type of sort
- key is only allowed for fixed length records.
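
For readers who want to inspect such a key, here is a Python sketch that
decodes the single precision form. The byte layout is not described in this
document; the sketch assumes the commonly documented Microsoft Binary Format
layout (three mantissa bytes stored low byte first, sign in the top bit of
the third byte, exponent byte last with a bias of 129 on the implied-one
form). The name mbf_single is hypothetical.

```python
def mbf_single(field):
    """Decode a 4-byte Microsoft Binary Format single (assumed layout)."""
    if len(field) != 4:
        raise ValueError("expected 4 bytes")
    exponent = field[3]
    if exponent == 0:
        return 0.0  # a zero exponent byte means the value is zero
    sign = -1.0 if field[2] & 0x80 else 1.0
    mantissa = ((field[2] & 0x7F) << 16) | (field[1] << 8) | field[0]
    return sign * (1 + mantissa / 2**23) * 2.0 ** (exponent - 129)
```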
-
- Turbo Pascal Real Numbers
-
- RPSORT supports Turbo Pascal numbers of type "real". The length
- need not be specified and is always 6.
-
- You can specify this type of sort key by using the T attribute (see
- page 13). This type of sort key is only allowed for fixed length
- records.
-
- This was the "real" type in the original version of Turbo Pascal
- and is still supported in version 6.0. To see how to sort the new
- 80x87 formats in Turbo Pascal (single, double, extended and comp)
- refer to the table on page 14. Also see the next section on "Math
- Co-Processor Floating Point Numbers".
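
The 6-byte "real" type can be decoded with a short Python sketch. The layout
is not given in this document; the sketch assumes the commonly documented
one (exponent in the first byte with a bias of 129, a 39-bit mantissa, and
the sign in the top bit of the last byte). The name tp_real is hypothetical.

```python
def tp_real(field):
    """Decode a 6-byte Turbo Pascal "real" (assumed layout)."""
    if len(field) != 6:
        raise ValueError("expected 6 bytes")
    exponent = field[0]
    if exponent == 0:
        return 0.0  # a zero exponent byte means the value is zero
    sign = -1.0 if field[5] & 0x80 else 1.0
    mantissa = int.from_bytes(field[1:6], "little") & ((1 << 39) - 1)
    return sign * (1 + mantissa / 2**39) * 2.0 ** (exponent - 129)
```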
-
-
-
- September 17, 1991 RPSORT Reference Page 10
-
-
-
- Detailed Description Of Sort Key Types Supported By RPSORT (continued)
- Sort Keys That Are Binary Numbers (continued)
-
- Math Co-Processor Floating Point Numbers
-
- RPSORT supports three types of math co-processor (i.e. 80x87)
- floating point numbers. The table below gives the lengths and
- names assigned to them by Intel and by three popular compilers.
-
-     Length  Intel       QuickBasic  Turbo Pascal  Turbo C
-     ------  ----------  ----------  ------------  -----------
-        4    short real  single      single        float
-        8    long real   double      double        double
-       10    temp real   N/A         extended      long double
-
- You can specify this type of sort key by using the F attribute (see
- page 13) and one of the lengths listed above. This type of sort
- key is only allowed for fixed length records.
-
- RPSORT does not require a math co-processor to sort numbers of this
- type and does not use the 80x87 even if it is present.
-
- Zero values returned by an 80x87 are marked as either a +0 or a -0.
- Some zero values arise from underflow. This occurs if a result is
- too small (i.e. has too negative an exponent) for the given numeric
- format (short real, long real or temp real). The 80x87 returns a
- zero result but keeps the sign of the small number.
-
- RPSORT sorts minus zeros as less than plus zeros. I could call
- this a deliberate feature in that it reflects as best as possible
- the true sequence of very small results but actually it's a natural
- consequence of the way I do the sort.
-
- A result can be too large for the given numeric format. This is
- called overflow. Most compilers generate an error and do not store
- a result but an 80x87 can return special values denoting
- plus and minus infinity. RPSORT sorts plus infinity higher than
- any other value and minus infinity lower than any other value.
-
- The 80x87 also generates special values for error conditions (e.g.
- taking the square root of a negative number). Any compiler would
- generate an error rather than store such values. Still, RPSORT
- must do something if it finds them. I sort them the same as plus
- or minus infinity depending on their sign.
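
One well-known way to order such numbers without a co-processor is to turn
the bit pattern into an unsigned integer that compares correctly. The Python
sketch below shows this for the 8-byte format only. It is not claimed to be
RPSORT's method, and the name double_sort_key is hypothetical. It reproduces
the ordering of minus zero below plus zero and of the infinities at the
extremes, but error values (NaNs) would sort beyond the infinities here
rather than equal to them.

```python
import struct

def double_sort_key(value_bytes):
    """Map an 8-byte low-byte-first double to an int that sorts numerically."""
    (bits,) = struct.unpack("<Q", value_bytes)
    if bits & (1 << 63):
        return bits ^ 0xFFFFFFFFFFFFFFFF  # negative: invert every bit
    return bits | (1 << 63)               # positive: set the sign bit

values = [2.0, float("-inf"), 0.0, -0.0, -1.5, float("inf")]
ordered = sorted(values, key=lambda v: double_sort_key(struct.pack("<d", v)))
print(ordered)
```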
-
-
-
- September 17, 1991 RPSORT Reference Page 11
-
-
-
- Defining The Desired Sort Sequence To RPSORT
-
- Standard Defaults For Sort Keys
-
- The following defaults are used by RPSORT unless you specify other
- defaults (see "Switches Which Set Defaults For Sort Keys") or specify
- different attributes in the sort key definition for a sort key (see
- "Defining Sort Keys").
-
- . The sort key consists of the entire record/line.
-
- . The sort key is a character string to be sorted per the same case
- insensitive sequence used by the DOS SORT. Digits 0 through 9
- precede the letters. Lower case letters sort equal to upper case.
- Foreign letters, punctuation and currency symbols sort equal to
- their American English equivalents.
-
- . The sort will be in ascending (low to high) sequence.
-
-
- Switches Which Set Defaults For Sort Keys
-
- These switches change some of the defaults for sort keys. They can't
- be over-ridden by individual sort key definitions. Use them only if
- you want all your sort keys to have the same attributes. The /C and
- /P switches may be of particular interest to computer programmers.
-
- The /A, /C and /P switches apply to all character string sort keys
- (i.e. they apply to any sort key that is not defined as being a
- binary numeric type). /C and /P are mutually exclusive but either
- may be used in conjunction with /A.
-
- /A makes the ASCII (case sensitive) sequence the default. Digits 0
- through 9 precede the letters and all upper case letters precede
- any lower case letters. The sequence is per the ASCII code for
- each character.
-
- /C says that all string keys are C language character strings. See
- page 7 for a description of C style strings.
-
- /P says that all string keys are Turbo Pascal type strings. See
- page 8 for a description of Pascal style strings. The /P switch
- is only allowed for fixed length records.
-
- /R specifies a reverse sort. The sort will be in descending (high
- to low) sequence. /R applies to all the sort keys you define.
-
-
-
- September 17, 1991 RPSORT Reference Page 12
-
-
-
- Defining Sort Keys
-
- Sort Key Definition Syntax
-
- If no sort key definitions are given, RPSORT assumes a single
- default sort key (see "Standard Defaults For Sort Keys" and
- "Switches Which Set Defaults For Sort Keys").
-
- You may specify as many sort key definitions as you like provided
- that they fit within the command line (maximum of 127 bytes). Sort
- key definitions consist of a /+ followed by a list of attributes
- with no spaces between them. You may, however, use spaces to
- separate one sort key definition from another or from a switch.
-
- All attributes are optional. A sort key definition may be just /+,
- which gets you the same default sort key as when no sort key
- definitions are specified. The following describes each of the
- attributes:
-
- col is the starting column for the key. This must be at least 1
- but no more than 32750. For fixed length records, the maximum
- is the largest column such that there is enough room in the
- remainder of the record to hold the minimum legitimate key
- length for the given key type.
-
- :len is the length for this key. The legitimate values for len
- depend on the type of the sort key.
-
- R specifies a reverse sort for this key. The sort will be
- in descending (high to low) sequence.
-
- The next three attributes are used for character string keys. C
- and P are mutually exclusive but either may be used with A. C and
- P may be of interest to computer programmers.
-
- A does an ASCII (case sensitive) sort for the key. The digits 0
- through 9 precede the letters and all upper case letters
- precede any of the lower case letters.
-
- C says that this key is a C language character string. See page
- 7 for a description of C style strings.
-
- P says that this key is a Turbo Pascal character string. See
- page 8 for a description of Pascal style character strings.
- The P attribute is only allowed for fixed length records.
-
-
-
- September 17, 1991 RPSORT Reference Page 13
-
-
-
- Defining Sort Keys (continued)
- Sort Key Definition Syntax (continued)
-
- The next five attributes define binary numeric type keys which are
- only allowed for fixed length records. They are mutually exclusive.
- These attributes may be of interest to computer programmers.
-
- The table on page 14 lists some programming language compilers and
- interpreters and indicates the appropriate type and length
- attributes to be used for each of their binary numeric data types.
-
- I sorts this key as a signed binary integer. These may be any
- length. See page 8 for additional details.
-
- U sorts this key as an unsigned binary integer. These may be
- any length. See page 9 for additional details.
-
- F sorts this key as a binary floating point number of the type
- produced by a math co-processor (i.e. an 80x87). RPSORT
- supports three precisions for 80x87 floating point numbers.
- The table below gives the lengths for each precision and the
- names assigned to them by Intel and three popular compilers.
-
- Length   Intel        QuickBasic   Turbo Pascal   Turbo C
- ------   ----------   ----------   ------------   -----------
-      4   short real   single       single         float
-      8   long real    double       double         double
-     10   temp real    N/A          extended       long double
-
- RPSORT does not require a math co-processor to sort numbers of
- this type and does not use the 80x87 even if it is present.
-
- See page 10 for additional details concerning math
- co-processor floating point numbers.
-
- M sorts this key as a binary floating point number as defined by
- the BASIC interpreter (prior to MS-DOS v5.0) and older
- versions of Microsoft QuickBASIC (prior to QB v4.0). The len
- attribute can be 4 or 8.
-
- Use len = 4 for single precision numbers.
- Use len = 8 for double precision numbers.
-
- T sorts this key as a Turbo Pascal number of type "real". The
- len parameter need not be specified and is 6 by default. See
- page 9 for additional details.
-
-
-
- September 17, 1991 RPSORT Reference Page 14
-
-
- The following table lists the type and length attributes for the binary
- numeric types available in a few programming language compilers and
- interpreters.
-
- If you are using a compiler that is not in this table, you should review
- the previous pages along with the programmers guide for your compiler to
- see if any of the binary numeric types supported by RPSORT match those
- available with your compiler.
-
- Compiler Or Interpreter       Number Type    Type Attribute  Length Attribute
- -----------------------       -----------    --------------  ----------------
- Microsoft QuickBASIC          Integer        I               2
- v4.0 and later &              Long           I               4
- Microsoft QBASIC              Single         F               4
-                               Double         F               8
-
- Microsoft QuickBASIC          Integer        I               2
- v3.0, 8087                    Single         F               4
-                               Double         F               8
-
- IBM BASICA &                  Integer        I               2
- Microsoft GWBASIC &           Single         M               4
- Microsoft QuickBASIC          Double         M               8
- v1.0, v2.0 and v3.0 non-8087
-
- Turbo Pascal                  Shortint       I               1
- v4.0 and later                Integer        I               2
-                               Longint        I               4
-                               Byte           U               1
-                               Word           U               2
-                               Real           T               6
-                               Single         F               4
-                               Double         F               8
-                               Extended       F               10
-                               Comp           I               8
-
- Turbo Pascal                  Integer        I               2
- v3.0 8087                     Byte           U               1
-                               Real           F               8
-
- Turbo Pascal                  Integer        I               2
- v1.0, v2.0 and v3.0 non-8087  Byte           U               1
-                               Real           T               6
-
- Borland/Turbo C               signed char    I               1
-                               unsigned int   U               2
-                               short int      I               2
-                               int            I               2
-                               unsigned long  U               4
-                               long           I               4
-                               float          F               4
-                               double         F               8
-                               long double    F               10
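-
- To illustrate what the I and U attributes mean, the following Python
- sketch sorts fixed length records on a 2 byte binary integer key, much
- as a command like RPSORT /F6 /+1:2I would be asked to do. It assumes
- Intel byte order (low byte first). The function name and the sample
- records are hypothetical, and this is not RPSORT's own code.
-
```python
import struct

def sort_fixed_records(data: bytes, reclen: int, col: int, length: int,
                       signed: bool = True) -> bytes:
    """Sort fixed length records on one binary integer key.

    col is 1-based, like an RPSORT start column. Little-endian (Intel)
    byte order is an assumption of this sketch.
    """
    if len(data) % reclen != 0:
        raise ValueError("data is not a whole number of records")
    records = [data[i:i + reclen] for i in range(0, len(data), reclen)]

    def key(rec: bytes) -> int:
        field = rec[col - 1:col - 1 + length]
        return int.from_bytes(field, "little", signed=signed)

    return b"".join(sorted(records, key=key))

# Three hypothetical 6 byte records: a 2 byte integer plus 4 bytes of text.
sample = b"".join(
    struct.pack("<h4s", n, tag)
    for n, tag in [(300, b"cccc"), (-2, b"aaaa"), (7, b"bbbb")]
)
as_signed = sort_fixed_records(sample, 6, 1, 2, signed=True)     # like attribute I
as_unsigned = sort_fixed_records(sample, 6, 1, 2, signed=False)  # like attribute U
```
-
- The same bytes sort differently under the two attributes: read as an
- unsigned number, the bit pattern for -2 is 65534 and moves to the end.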
-
-
-
- September 17, 1991 RPSORT Reference Page 15
-
-
-
-
- Miscellaneous Switches
-
- /Q - Suppressing Copyright And Completion Messages.
-
- The /Q switch, if it is the first parameter, suppresses display of:
-
- . The Copyright message when the sort starts.
- . The "Sort successfully completed." message after successful sort.
-
- Error messages, if any, will still be displayed.
-
- /Eerrfile - Directing Error Messages To A File.
-
- This switch directs error and successful completion messages to the
- file designated by errfile instead of the screen. For example:
-
- /Ec:\mydir\myerrors
-
- Specify /Enul to send error messages to the DOS NUL file which means
- nowhere. Only the /Q switch, if any, should precede the /E switch.
-
- /B - Ignoring Control Breaks Entered From The Keyboard.
-
- Tells RPSORT to ignore Ctrl-Break from the keyboard. This would be
- useful if you set up a batch file which includes RPSORT and you don't
- want the users of the batch file to be able to interrupt RPSORT.
-
- /D - Delete Records Whose Sortkeys Duplicate Those In A Previous Record
-
- Tells RPSORT to delete any records/lines whose sort keys duplicate
- those in a previous one. This deletes records/lines even if they
- are not identical to a previous one since all that is required is
- that the sort keys be the same.
-
- To only delete identical records/lines, tack on /+a as the last sort
- key. This produces an equal compare only for identical
- records/lines. For example:
-
- RPSORT /D /+1:2
-
- deletes any lines whose first two bytes equal those on a previous
- line, while
-
- RPSORT /D /+1:2 /+a
-
- deletes only lines that are identical to a previous line.
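-
- The difference between the two commands can be sketched in Python.
- This is an illustration only: the function name is hypothetical, case
- folding is left out, and keeping the first of several duplicates is an
- assumption, since the text above does not say which one survives.
-
```python
def sort_and_dedupe(lines, key):
    """Sort lines by key, then drop any line whose key equals the
    key of the line kept just before it."""
    result = []
    previous = object()  # sentinel that never equals a real key
    for line in sorted(lines, key=key):
        k = key(line)
        if k != previous:
            result.append(line)
            previous = k
    return result

lines = ["ab one", "ab two", "cd one", "ab one"]

# Like /D /+1:2 : only the first two characters form the key.
by_prefix = sort_and_dedupe(lines, key=lambda s: s[:2])

# Like /D /+1:2 /+a : the whole line acts as a final key, so only
# lines that are identical throughout compare equal.
by_whole_line = sort_and_dedupe(lines, key=lambda s: (s[:2], s))
```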
-
-
-
- September 17, 1991 RPSORT Reference Page 16
-
-
-
-
- Miscellaneous Switches (continued)
-
- /N - Delete Null Lines
-
- This switch deletes all null lines (i.e. lines consisting only of a
- CRLF). Lines that are all spaces and thus look like null lines when
- you list them will not be deleted. This switch is not allowed for
- fixed length records for which it would be meaningless.
-
-
- /Td - Designate Drive To Be Used For Temporary Files.
-
- Given enough memory, RPSORT loads the entire input into memory,
- sorts it and writes the sorted data to the output file. In such
- cases, RPSORT does not need to create any temporary files.
-
- If the input is larger than the available memory, RPSORT reads the
- file a chunk at a time, sorts each chunk and writes the sorted
- chunks to a temporary file. RPSORT then does one or more merge
- phases to combine the chunks into a single sorted output file.
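-
- The chunk and merge approach described above is the classic external
- merge sort. A minimal Python sketch follows. It is not RPSORT's code:
- it writes one temporary file per chunk and merges them all in a single
- phase, the chunk size is counted in lines rather than bytes, and the
- names external_sort, chunk_size and temp_dir are hypothetical. The
- temp_dir parameter stands in for the choice of where temporary files go.
-
```python
import heapq
import os
import tempfile

def external_sort(lines, chunk_size, temp_dir=None):
    """Sort newline-terminated lines using sorted runs kept on disk.

    chunk_size is the number of lines held in memory at one time.
    """
    run_paths = []
    chunk = []

    def flush():
        # Sort the lines currently in memory and write them out as one run.
        if chunk:
            fd, path = tempfile.mkstemp(dir=temp_dir, text=True)
            with os.fdopen(fd, "w") as f:
                f.writelines(sorted(chunk))
            run_paths.append(path)
            chunk.clear()

    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            flush()
    flush()

    # Merge phase: read all runs in parallel, always taking the lowest line.
    files = [open(p) for p in run_paths]
    try:
        return list(heapq.merge(*files))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
```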
-
- RPSORT normally puts temporary files on the default drive. The /T
- switch lets you specify an alternate drive. For example:
-
- /TC
-
- puts the temporary file(s) on your C drive. See the section on
- "Efficiency Considerations" for more details.
-
-
- /Z - Ignore Ctrl-Z In Text File. Use Entire Physical File.
-
- RPSORT (just like MS-DOS) treats Ctrl-Z as the end of a text file.
- This is usually the correct thing to do since Ctrl-Z, if present,
- normally follows the last byte of actual data.
-
- Sometimes, however, one or more Ctrl-Zs occur in the middle of a
- text file. Files downloaded from bulletin boards may contain
- garbage characters (such as Ctrl-Z) due to a noisy line. If you
- sort such a file, the sorted output is shorter than the original
- file because RPSORT uses only part of the input.
-
- The /Z switch tells RPSORT to ignore Ctrl-Zs and to use the entire
- input. RPSORT deletes any Ctrl-Zs except for one at the end of the
- file. /Z is not applicable to fixed length records where Ctrl-Z has
- no special meaning and is just taken as another data byte.
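-
- The two ways of treating Ctrl-Z can be sketched in Python as follows.
- The function names are hypothetical, and the second function is one
- reading of the rule that any Ctrl-Zs are deleted "except for one at the
- end of the file".
-
```python
CTRL_Z = b"\x1a"

def text_up_to_ctrl_z(data: bytes) -> bytes:
    """Default behaviour: the first Ctrl-Z ends the text."""
    end = data.find(CTRL_Z)
    return data if end == -1 else data[:end]

def text_ignoring_ctrl_z(data: bytes) -> bytes:
    """/Z behaviour as described: use the whole file, dropping every
    Ctrl-Z except one at the very end of the file (if there is one)."""
    keep_final = data.endswith(CTRL_Z)
    cleaned = data.replace(CTRL_Z, b"")
    return cleaned + CTRL_Z if keep_final else cleaned

# A hypothetical file with a stray Ctrl-Z in the middle of the second line.
noisy = b"delta\r\nal\x1apha\r\ncharlie\r\n\x1a"
```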
-
-
-
- September 17, 1991 RPSORT Reference Page 17
-
-
- Efficiency Considerations
-
- Do ASCII Sort If Text Keys Are All Upper Case Or All Lower Case
-
- An ASCII sort puts text keys in order according to the ASCII code
- assigned each of the characters. This is the fastest possible sort
- because RPSORT can sequence records by directly comparing the sort keys
- without having to pre-process them in any way.
-
- If a file contains both upper and lower case letters and you want all
- the keys starting with a lower case "a" to be together with the keys
- starting with an upper case "A" and so on, then you can't do an ASCII
- sort and must do a case insensitive sort.
-
- However, if your file contains only upper case letters (or if it
- contains only lower case letters) then an ASCII sort will achieve the
- same result as a case insensitive sort but will be faster. You
- specify an ASCII sort either by using the A attribute in each sort key:
-
- RPSORT /+1:5A /+12:7A INPUT.DAT OUTPUT.DAT
-
- or by using the /A switch:
-
- RPSORT /A /+1:5 /+12:7 INPUT.DAT OUTPUT.DAT
-
- If your files contain foreign letters, punctuation or currency symbols
- and you want these to sort the same as their American English
- equivalents then you must do a case insensitive sort.
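-
- The difference between the two orderings can be seen in a short Python
- sketch. Here str.lower is only a stand-in for RPSORT's case insensitive
- comparison, which also deals with foreign letters and symbols, and the
- word lists are made up for the illustration.
-
```python
words = ["banana", "Apple", "apple", "Cherry", "Banana", "42"]

# ASCII (case sensitive) order: compare the character codes directly.
ascii_order = sorted(words)

# Case insensitive order: each key is folded to one case before it is
# compared, which is the pre-processing step an ASCII sort avoids.
folded_order = sorted(words, key=str.lower)

# When every key is already in one case the two orders coincide.
upper_only = ["PEAR", "APPLE", "FIG"]
same_result = sorted(upper_only) == sorted(upper_only, key=str.lower)
```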
-
-
-
- September 17, 1991 RPSORT Reference Page 18
-
-
- Efficiency Considerations (continued)
-
- How Memory Size Affects RPSORT Speed And Need For Temp Disk Space
-
- The amount of memory (I mean conventional memory not Expanded or
- Extended memory) affects RPSORT's speed in the following ways:
-
- . If memory is big enough or conversely the file is small enough to do
- the sort in memory, in one pass, then the sort will be optimally fast.
-
- . Otherwise the input must be sorted a chunk at a time with the chunks
- being written to a temp file. Then one or more merge phases will be
- required to combine the chunks. If memory is very small and many
- merge phases are required, RPSORT would slow down dramatically.
-
- The following pages contain a lot of nitty gritty detail about the
- conditions which force RPSORT to use temp disk space and how much temp
- disk space it might need. You can ignore these details if your
- situation meets either of the following conditions:
-
- . No temp files are needed if the free memory (see "Using CHKDSK Or MEM
- To Determine Free Memory" below) exceeds the input size plus twice the
- line/record count plus 70,000. A 10,000 line 400,000 byte file needs
- more than 400,000 plus (2 * 10,000) plus 70,000 or a total of 490,000
- bytes of free memory to sort the input without using temp disk space.
-
- . If the drive assigned to hold temp files (either the default drive or
- the drive specified in the /T switch) has twice as much space as the
- size of the input file, this will always be sufficient.
-
-
- Using CHKDSK Or MEM To Determine Free Memory
-
- To determine the amount of free memory in your system, use the CHKDSK
- command which gives you a display something like:
-
- 362496 bytes total disk space
- 53248 bytes in 2 hidden files
- 303104 bytes in 36 user files
- 6144 bytes available on disk
-
- 655360 bytes total memory
- 581168 bytes free
-
- The free memory is on the last line (581168 in this example). If you
- own MS-DOS 5.0 you can use the MEM command and get something like:
-
- 655360 bytes total conventional memory
- 655360 bytes available to MS-DOS
- 564288 largest executable program size
-
- Here the free memory appears on the "largest executable program size"
- line (564288 in this case).
-
-
-
- September 17, 1991 RPSORT Reference Page 19
-
-
- Efficiency Considerations (continued)
- How Memory Size Affects RPSORT Speed And Need For Temp Disk Space (cont.)
-
- Sorts Requiring No Merge Phase And No Temporary Files
-
- If possible, RPSORT will do a sort in a single pass without requiring
- any temporary files. Use the following steps to determine whether a
- given file can be sorted in a single pass:
-
- . First determine the amount of free memory (called FREEMEM below).
- See "Using CHKDSK Or MEM To Determine Free Memory" above.
-
- . Then the memory space required, by RPSORT, for the input equals:
-
- File Size + Twice The Number Of Records/Lines In The File
-
- This sum is called FILESPACE below. For example, if the file size
- were 453,868 bytes and it consisted of 8,323 lines then FILESPACE
- would equal 453,868 + 8,323 + 8,323 or 470,514 bytes.
-
- . RPSORT also requires some memory for itself and for buffers and
- tables. This depends on the size of FREEMEM:
-
- . If FREEMEM exceeds 170,000 bytes RPSORT reserves 70,000 bytes.
- In this case, a file can be sorted in one pass if:
-
- FILESPACE is less than FREEMEM - 70,000
-
- . If FREEMEM is less than 170,000 bytes then RPSORT reserves 18,000
- bytes plus one-third of the remainder of FREEMEM for itself.
- This means that a file can be sorted in a single pass if:
-
- FILESPACE is less than 2 * (FREEMEM - 18,000) / 3
-
- . If FREEMEM is less than approximately 30,000 bytes, then RPSORT
- will be unable to do the sort at all.
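-
- The steps above can be written as a small Python calculation. The
- function names are hypothetical. Treating a FREEMEM of exactly 170,000
- like the smaller case, and using 30,000 as a hard lower limit, are
- assumptions made for this sketch.
-
```python
def filespace(file_size: int, line_count: int) -> int:
    """FILESPACE: the file size plus twice the number of records/lines."""
    return file_size + 2 * line_count

def fits_in_one_pass(freemem: int, file_size: int, line_count: int) -> bool:
    """True if the rules above predict a sort with no temporary files."""
    if freemem < 30_000:           # too little memory to sort at all
        return False
    if freemem > 170_000:
        usable = freemem - 70_000  # RPSORT reserves 70,000 bytes
    else:
        usable = 2 * (freemem - 18_000) / 3  # 18,000 plus one-third reserved
    return filespace(file_size, line_count) < usable
```
-
- For the 453,868 byte, 8,323 line file used above, FILESPACE is 470,514,
- so the file fits in one pass with 581,168 bytes free but not with
- 500,000 bytes free.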
-
-
-
- September 17, 1991 RPSORT Reference Page 20
-
-
- Efficiency Considerations (continued)
- How Memory Size Affects RPSORT Speed And Need For Temp Disk Space (cont.)
-
- Sorts Requiring One Merge Phase And One Temporary File
-
- When a single pass sort is not possible, RPSORT breaks up the file
- into "chunks" and sorts each chunk separately. Then it merges these
- chunks to produce the sorted output. Use the following steps to
- check whether a file can be sorted with a single merge phase using
- only a single temporary file the same size as the input file:
-
- . Compute FREEMEM and FILESPACE as described in the previous section.
-
- . Then compute the number of chunks (called #CHUNKS below) as
- follows and round up to the next higher integer:
-
- . If FREEMEM exceeds 170,000 bytes then:
-
- #CHUNKS = FILESPACE / (FREEMEM - 70,000)
-
- . If FREEMEM is less than 170,000 then:
-
- #CHUNKS = (3 * FILESPACE) / (2 * (FREEMEM - 18,000))
-
- . Now compute the maximum number of chunks that RPSORT can merge at
- one time (called MAXMERGE below) as follows and round down to the
- next lower integer:
-
- . If FREEMEM exceeds 315,000 then:
-
- MAXMERGE = (FREEMEM - 50,000) / 16,000
-
- . If FREEMEM exceeds 90,000 but is less than 315,000 then:
-
- MAXMERGE = 16
-
- . If FREEMEM is less than 90,000 then:
-
- MAXMERGE = 8 * (FREEMEM - 18,000) / 36,000
-
- . If #CHUNKS is less than or equal to MAXMERGE, then RPSORT will do a
- single merge phase sort using a single temp file the same size as
- the input file.
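-
- The #CHUNKS and MAXMERGE rules above can be sketched in Python as shown
- here. The function names are hypothetical, and a FREEMEM that falls
- exactly on a boundary (170,000, 315,000 or 90,000) is treated as
- belonging to the smaller case, which is an assumption.
-
```python
def chunk_count(freemem: int, filespace: int) -> int:
    """#CHUNKS, rounded up to the next higher integer."""
    if freemem > 170_000:
        divisor, dividend = freemem - 70_000, filespace
    else:
        divisor, dividend = 2 * (freemem - 18_000), 3 * filespace
    return -(-dividend // divisor)  # integer ceiling division

def max_merge(freemem: int) -> int:
    """MAXMERGE, rounded down to the next lower integer."""
    if freemem > 315_000:
        return (freemem - 50_000) // 16_000
    if freemem > 90_000:
        return 16
    return 8 * (freemem - 18_000) // 36_000

def single_merge_phase(freemem: int, filespace: int) -> bool:
    """True if one merge phase and one temp file should suffice."""
    return chunk_count(freemem, filespace) <= max_merge(freemem)
```
-
- With 581,168 bytes free, a FILESPACE of 5,000,000 gives 10 chunks
- against a MAXMERGE of 33, so one merge phase is enough. With 100,000
- bytes free, a FILESPACE of 2,000,000 gives 37 chunks against a MAXMERGE
- of 16, so it is not.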
-
-
-
- September 17, 1991 RPSORT Reference Page 21
-
-
- Efficiency Considerations (continued)
- How Memory Size Affects RPSORT Speed And Need For Temp Disk Space (cont.)
-
- Sorts Requiring Two Or More Merge Phases And Two Temporary Files
-
- If necessary, RPSORT will do a multiple merge phase sort. This
- requires two temporary files each the size of the input file.
-
- Actually, RPSORT doesn't abruptly go to a full second merge phase if
- it can't do the sort in one merge phase. If #CHUNKS is less than
- twice MAXMERGE it does a one and a fraction merge phase sort. The
- first temp file (TEMP1) will be the same size as the input but the
- second (TEMP2) will be smaller as follows:
-
- Size of TEMP2 = (#CHUNKS - MAXMERGE + 1) / #CHUNKS * Size of input file
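-
- As a worked example of the TEMP2 formula, here is a short Python
- sketch. The function name is hypothetical, and rejecting values outside
- the range MAXMERGE < #CHUNKS < 2 * MAXMERGE is a choice made for the
- sketch, based on the condition stated above.
-
```python
def temp2_size(chunks: int, maxmerge: int, input_size: int) -> float:
    """Size of TEMP2 for a "one and a fraction" merge phase sort."""
    if not (maxmerge < chunks < 2 * maxmerge):
        raise ValueError("formula applies when MAXMERGE < #CHUNKS < 2 * MAXMERGE")
    return (chunks - maxmerge + 1) / chunks * input_size

# 20 chunks with MAXMERGE = 16 and a 10,000,000 byte input file:
# (20 - 16 + 1) / 20 = one quarter of the input size.
example = temp2_size(20, 16, 10_000_000)
```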
-
-
- Deciding What Drives To Put Temporary And Output Files On
-
- Reading one file and writing another file concurrently on the same
- drive is generally inefficient because it requires that the drive head
- assembly constantly move back and forth between the two files. This
- can slow things down significantly.
-
- RPSORT always finishes reading the input file before it starts writing
- the output file. This means there is no loss of efficiency if the
- input and output are on the same drive. Of course there must be enough
- room on this drive to hold the output file.
-
- If a sort requires temporary files they are written at the same time as
- the input file is read. Similarly, temporary files are read at the
- same time as the output file is written. The drive assigned for
- temporary files must have enough space to hold the entire input and in
- some cases twice that much. Temp files go to the default drive but you
- can over-ride this with the /T switch.
-
- If you have a big enough RAM disk, you should consider putting the temp
- files there. This could markedly enhance the performance of RPSORT.
-
- If you don't use a RAM disk, you should assign temp files to a drive
- other than the ones on which the input and output files reside. This
- dictum is not absolute, however, as indicated by the following:
-
- . If you have only one hard drive and both the input and output files
- reside there, you are better off putting the temp files on the same
- hard drive than on a floppy.
-
- . If you are short of disk space, putting the temp files and the output
- file on the same drive could help because the output file might be
- able to reuse part of the space allocated to temp files.
-
-
-
- September 17, 1991 RPSORT Reference Page 22
-
-
- Efficiency Considerations (continued)
-
- Buffers Command In Your Config.Sys
-
- MS-DOS allocates disk buffers in memory to support read and write
- operations. The buffers are usually 512 bytes each. MS-DOS allocates
- 10 or 15 buffers depending on whether your system has less or more than
- 512K of memory. Some applications run faster with a larger number of
- buffers. You specify this in your config.sys file. For example:
-
- BUFFERS=30
-
- On my computer (a 10Mhz 286 with a slow hard disk):
-
- . Sorting modest size files (say up to a megabyte) speeds up little
- if at all when I increase the number of buffers.
-
- . Sorting a large file (say a few megabytes) speeds up by only a few
- percent with BUFFERS=20.
-
- . BUFFERS=30 produces an additional small improvement for very large
- files (upwards of ten megabytes).
-
- To fine tune the performance of RPSORT on your system, sort files of
- the type and size typical for you and test the effect of various
- BUFFERS values. In any case, you will probably use the number of
- buffers that is optimal for your principal applications, not for RPSORT.
-
-
- Using Disk Cache Programs
-
- Disk cache programs (like SMARTDRV.SYS which is distributed as part of
- the MS-DOS 5.0 package) set aside an area of memory called the disk cache.
- Typically the disk cache is allocated in expanded or extended memory
- and may be quite large (i.e. a megabyte or more).
-
- Disk cache programs intercept accesses to disk and retain data from
- the disk in the cache. If the data is required later on, the disk
- cache program can provide the data from memory rather than having to go
- to the disk drive which would be much slower.
-
- If the retained data is needed often enough then the performance of
- your system will improve. Otherwise, your system may slow down due to
- the overhead of the disk cache program.
-
- I can't make any definitive statement as to how disk cache programs
- might improve or degrade the performance of your system.
-
- If you contemplate using a disk cache program, I suggest that you
- perform experiments with caches of different sizes and possibly with
- different cache programs. These experiments should include the entire
- range of activities you perform on your system.
-
-
-
- September 17, 1991 RPSORT Reference Page 23
-
-
- Special Situations
-
- Sorting Files That Contain Tabs
-
- If your input file contains tabs and if the tabs must be expanded to
- the proper number of spaces for your sort keys to line up properly,
- then RPSORT will not sort it correctly because RPSORT does not expand
- the tabs. The solution is to process the file with some program that
- expands the tabs and then to use the output of this program as the
- input to RPSORT.
-
- As a convenience, for those who face this problem, I have included a
- program called RPTAB in this package. The syntax for RPTAB is:
-
- RPTAB input-filespec output-filespec [tabstop...]
-
- The parameters must be given in the indicated order.
-
- The input is a file containing tabs to be expanded. After RPTAB does
- its thing, the contents of the output file will be the same as that of
- the input except that all tabs will have been replaced by the number of
- spaces required to reach the next tab stop.
-
- Specifying tab stops is optional. If you don't specify any, the
- default tab stops are at positions 1, 9, 17, 25, 33... and so on at
- intervals of eight columns. The following command will expand tabs to
- the default tab stops:
-
- RPTAB MYTABS.DAT MYSPACES.DAT
-
- If you specify tab stops they must be a sequence of integers each
- greater than the preceding one. The first tab stop is always at column
- 1 and you need not specify it. RPTAB follows the rule that the
- interval between the last two specified tab stops implies subsequent
- tab stops at the same interval. The command:
-
- RPTAB MYTABS.DAT MYSPACES.DAT 6 15 27
-
- tells RPTAB that the tab stops are at positions 1, 6, 15, 27, 39, 51...
- etc. The interval of 12 between 15 and 27 is propagated to subsequent
- tab stops.
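-
- The tab stop rule just described can be sketched in Python. This is not
- a translation of RPTAB.PAS. The function name is hypothetical, and the
- sketch handles a single line with no line terminator.
-
```python
def expand_tabs(line: str, specified=()) -> str:
    """Replace each tab with spaces up to the next tab stop (1-based columns)."""
    # The first tab stop is always column 1, whether or not it is given.
    stops = [1] + [s for s in specified if s != 1]
    if any(b <= a for a, b in zip(stops, stops[1:])):
        raise ValueError("tab stops must be increasing integers")
    # The last interval is propagated; with no stops given it is 8.
    interval = stops[-1] - stops[-2] if len(stops) > 1 else 8

    def next_stop(col: int) -> int:
        """Smallest tab stop strictly greater than column col."""
        for s in stops:
            if s > col:
                return s
        steps = (col - stops[-1]) // interval + 1
        return stops[-1] + steps * interval

    out = []
    col = 1  # column at which the next character will be written
    for ch in line:
        if ch == "\t":
            target = next_stop(col)
            out.append(" " * (target - col))
            col = target
        else:
            out.append(ch)
            col += 1
    return "".join(out)
```
-
- With no tab stops given, expand_tabs("a\tb") puts "b" in column 9. With
- stops 6, 15 and 27, expand_tabs("a\tb\tc", (6, 15, 27)) puts "b" in
- column 6 and "c" in column 15.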
-
- This package includes the file RPTAB.PAS which is the complete source
- code for RPTAB. RPTAB is written in Turbo Pascal and compiled with
- version 6.0 of that compiler. If you want to improve or modify RPTAB
- in any way, please feel free to do so. Please! Please! do not
- distribute any modified version under my name. If you did it, you
- deserve the credit.
-
- RPTAB.PAS consists of Pascal statements and an assembly language
- sub-routine called ExpandTabs. This latter was written using Turbo
- Pascal's inline assembler (a very useful addition by Borland).
-
-
-
- September 17, 1991 RPSORT Reference Page 24
-
-
- Special Situations (continued)
-
- Writing The Output To The File That Contained The Input
-
- Nothing stops you from specifying the same file as both input and
- output in a RPSORT command. It is dangerous but it can be beneficial
- in some circumstances.
-
- It is possible to do this because RPSORT never starts writing the
- output file until after it has finished reading the input file.
- Therefore it will not destroy the input before it has read it.
-
- The danger is that after RPSORT has started writing the output file but
- before it has finished, your system may go down due possibly to a power
- failure or a software or hardware problem or whatever. In this case
- the input would be destroyed and the output would not yet exist. This
- would mean the loss of your data unless you had backed up your file or
- it could be recreated in some way.
-
- The benefit is realized when you must put the output file on the same
- drive as the input file but there is not enough space, on the drive, to
- hold both. By using the same file for input as for output you would
- re-use the same disk space and thus might be able to do a sort that
- otherwise you could not do. Once again, don't do this unless you have
- backed up your data or you have some relatively easy way to recover it.
-
- None of the above applies if RPSORT is being used as a filter. In that
- case if the output file is the same as the input then the input file
- will be destroyed by DOS before RPSORT even starts executing.
-
- Two Incompatibilities With The DOS SORT
-
- There are two exceptions to the statement that any command that works
- with the DOS SORT will produce the same result with RPSORT:
-
- . RPSORT will not let you type the input file from the keyboard.
-
- . The DOS SORT tacks the CRLF, that ends a line, onto the sort key.
- RPSORT doesn't. Thus, RPSORT sorts null lines to the beginning of a
- file. The DOS SORT precedes them with any line whose sort key starts
- with a character like tab or formfeed whose ASCII value is less than
- that for CR.
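-
- The second incompatibility can be reproduced in a few lines of Python.
- The sample lines are made up, both sorts use plain character code
- order, and appending "\r\n" to the key is only a model of what the DOS
- SORT is described as doing.
-
```python
lines = ["pear", "", "\tapple", "fig"]

# RPSORT as described: the line terminator is not part of the sort key,
# so the null line has the lowest possible key.
rpsort_like = sorted(lines)

# DOS SORT as described: the CRLF is tacked onto the key. Tab (9) is
# lower than CR (13), so the tab line now precedes the null line.
dos_sort_like = sorted(lines, key=lambda s: s + "\r\n")
```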
-
-
-
- September 17, 1991 RPSORT Reference Page 25
-
-
- Error Messages
-
- Error Numbers And Return Codes
-
- Each type of error that RPSORT can detect has been assigned an error
- number which appears in the corresponding error message. For example:
-
- ERROR 049: No room on disk to write sorted output file.
-
- When RPSORT terminates, it sets the "errorlevel" return code as follows:
-
- . If the sort was successful, RPSORT sets the return code to zero.
-
- . If one or more syntax errors are discovered, the relevant error
- messages are displayed and the sort is terminated. The return code
- is set to the error number for the first error detected.
-
- . If an error is discovered while executing the sort (typically some
- kind of input, output or insufficient memory error), the appropriate
- error message is displayed and the return code is set to the error
- number for that error.
-
- The error numbers are broken down into groups as follows:
-
- Error Number Group            Range Of Error Numbers
- ---------------------------   ----------------------
- Syntax Errors                  1 - 34
- DOS Version Before 2.0        37
- Insufficient Memory           40 - 41
- Line/String Too Long Errors   43 - 44
- Input/Output Errors           46 - 54
-
- There are also a number of error messages with error numbers in the
- range 59 through 74 which should never happen. Any of these could
- imply a bug in RPSORT.
-
- If you run RPSORT from a batch file, you can test the return codes
- in statements like:
-
- IF ERRORLEVEL 1 GOTO SORTERR
-
- This would catch any return code greater than or equal to one and
- thus any error at all. Another example:
-
- IF ERRORLEVEL 40 GOTO EXECERR
-
- This would catch any return code greater than or equal to 40 and
- thus any sort execution error.
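-
- A program that runs RPSORT and inspects the return code could group
- the codes as in the following Python sketch. The function names and the
- group labels are hypothetical. The ranges come from the table above.
-
```python
def classify_return_code(code: int) -> str:
    """Map an RPSORT return code to the error group it belongs to."""
    if code == 0:
        return "success"
    groups = [
        (1, 34, "syntax error"),
        (37, 37, "DOS version before 2.0"),
        (40, 41, "insufficient memory"),
        (43, 44, "line/string too long"),
        (46, 54, "input/output error"),
        (59, 74, "should never happen (possible RPSORT bug)"),
    ]
    for low, high, name in groups:
        if low <= code <= high:
            return name
    return "unassigned error number"

def if_errorlevel(code: int, threshold: int) -> bool:
    """Model of the batch test IF ERRORLEVEL n: true when code >= n."""
    return code >= threshold
```
-
- For example, error 049 is classified as an input/output error and
- satisfies both IF ERRORLEVEL 1 and IF ERRORLEVEL 40, while syntax
- error 019 satisfies only the first.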
-
-
-
- September 17, 1991 RPSORT Reference Page 26
-
-
- Error Messages (continued)
-
- Syntax Error Messages
-
- When RPSORT parses the command line it displays messages for any syntax
- errors it finds. It always parses the complete command line and
- therefore may report several errors. Many error messages display the
- bad parameter at the end of the message. For example:
-
- ERROR 019: Only one key length allowed: "/13:5:7"
-
- In the message listing below, the quoted word "badparm" stands for the
- bad parameter that RPSORT is complaining about.
-
- RPSORT never executes the sort if it finds syntax errors but instead
- terminates immediately after displaying the last error message.
-
- The list of syntax error messages follows:
-
- ERROR 001: Slash (/) must be followed by a parameter.
-
- A slash was followed by a space. Slash must always be followed by
- one of the switch characters or it must start a sort key definition.
-
- ERROR 002: Illegal parameter: "badparm"
-
- This message is displayed when RPSORT finds an illegal parameter but
- can't figure out a more specific error to cite. It lists this
- message and the bad parameter that it objects to.
-
- ERROR 003: Only one /X switch is allowed.
-
- /X in this message will either be /F, /E or /T. Each of these
- switches may only be specified once in an RPSORT command.
-
- ERROR 004: /P and /C are incompatible.
-
- /P and /C are mutually exclusive. /P says that all character string
- sort keys are Pascal style strings while /C says that all character
- string sort keys are C language style strings. There is no way a
- character string can be both of these.
-
- ERROR 005: Record len must be between 1 and 32,750 in: "badparm"
-
- "badparm" is a /Fnnnn switch specifying a record length that is
- either zero or greater than 32750. This is not allowed.
-
- ERROR 006: Pascal string key only allowed in fixed len record: "badparm"
-
- "badparm" is a sort key definition specifying a Pascal style string
- (i.e. including the P attribute). This is only allowed if a /Fnnnn
- switch was specified to tell RPSORT that the file consists of fixed
- length records.
-
-
-
- September 17, 1991 RPSORT Reference Page 27
-
-
- Error Messages (continued)
- Syntax Error Messages (continued)
-
- ERROR 007: /P only allowed for fixed length records.
-
- The /P switch which says that all character string sort keys are
- Pascal style sort keys is only allowed if a /Fnnnn switch was
- specified to tell RPSORT that the file consists of fixed length
- records.
-
- ERROR 008: Binary number key (F,I,M,T or U) only allowed in fixed len
- record: "badparm"
-
- "badparm" is a sort key definition listing one of the binary number
- attributes. These are only allowed if a /Fnnnn switch was specified
- to tell RPSORT that the file consists of fixed length records.
-
- ERROR 009: /N switch not allowed for fixed length records.
-
- The /N switch, which says that null lines are to be deleted, is only
- allowed for a file consisting of lines. It is not allowed if a
- /Fnnnn switch has been specified.
-
- ERROR 010: One and only one temp drive letter may be entered: "badparm"
-
- "badparm" is a /T switch specifying either no drive letters or more
- than one. It should list only a single drive to be used for
- temporary files (e.g. /TC).
-
- ERROR 011: Non-existent drive: "badparm"
-
- "badparm" is a /T switch specifying a drive letter that does not
- exist in your system.
-
- ERROR 012: Invalid character for the drive: "badparm"
-
- "badparm" is a /T switch specifying a non-alphabetic drive. A drive
- can only be specified by a letter.
-
- ERROR 013: Start column must be between 1 and 32,750: "badparm"
-
- "badparm" is a sort key definition specifying a start column that is
- either zero or larger than 32750. This is not allowed.
-
- ERROR 014: Start column must not exceed record len: "badparm"
-
- "badparm" is a sort key definition specifying a start column that is
- larger than the record length in the /Fnnnn switch. This is not
- allowed.
-
-
-
- September 17, 1991 RPSORT Reference Page 28
-
-
- Error Messages (continued)
- Syntax Error Messages (continued)
-
- ERROR 015: Only one start column allowed: "badparm"
-
- "badparm" is a sort key definition that specifies more than one start
- column for the sort key. This is not allowed.
-
- ERROR 016: Error in sort key: "badparm"
-
- "badparm" is an erroneous sort key definition. RPSORT is unable to
- cite a more specific error.
-
- ERROR 017: Key len must be between 1 and 32,750: "badparm"
-
- "badparm" is a sort key definition specifying a key length that is
- either zero or larger than 32750. This is not allowed.
-
- ERROR 018: Key len is too big to fit in record: "badparm"
-
- "badparm" is a sort key definition containing a key length that would
- cause the key to extend beyond the end of the record as specified by
- the /Fnnnn switch.
-
- ERROR 019: Only one key length allowed: "badparm"
-
- "badparm" is a sort key definition that specifies more than one key
- length for the sort key. This is not allowed.
-
- ERROR 020: Length for 80x87 floating point number must be 4, 8 or
- 10: "badparm"
-
- "badparm" is a sortkey definition specifying an 80x87 type floating
- point number (attribute F). The key length (either explicit or the
- implied key length to the end of the record) is not one of the
- legitimate values (4, 8 or 10).
-
- ERROR 021: Length for GWBASIC/BASICA floating point number must be 4 or
- 8: "badparm"
-
- "badparm" is a sortkey definition specifying a GWBASIC/BASICA type
- floating point number (attribute M). Its key length (either explicit
- or the implied key length to the end of the record) is not one of the
- legitimate values (4 or 8).
-
- ERROR 022: Length for Turbo Pascal floating point number must be
- 6: "badparm"
-
- "badparm" is a sortkey definition specifying a Turbo Pascal type
- floating point number (attribute T). It specifies a key length other
- than 6 which is the only legitimate value. It is not necessary to
- specify a key length for Turbo Pascal floating point numbers because
- RPSORT assumes the length 6 by default.
-
-
-
- September 17, 1991 RPSORT Reference Page 29
-
-
- Error Messages (continued)
- Syntax Error Messages (continued)
-
- ERROR 023: Length for Pascal strings must be between 2 and 256: "badparm"
-
- "badparm" is a sortkey definition specifying a Turbo Pascal type
- string (attribute P). It specifies a key length less than 2 or more
- than 256 which are the limits for this type of string and correspond
- to string[1] and string[255] respectively.
-
- ERROR 024: Length for Pascal strings must be between 2 and 256.
-
- The /P switch was specified telling RPSORT that all character string
- type sort keys were Turbo Pascal style strings but at least one of
- the character string sort key definitions gave a key length less than
- 2 or more than 256. Alternatively, one of them had no explicit key
- length but the implied key length to the end of the record was not in
- the required range.
-
- ERROR 025: "P" and "C" attributes are incompatible: "badparm"
-
- "badparm" is a sort key definition that specifies both the P and C
- attributes, thus saying that the sort key is both a Pascal style
- string and a C style string. This is not possible.
-
- ERROR 026: C attribute conflicts with /P: "badparm"
- or
- ERROR 026: P attribute conflicts with /C: "badparm"
-
- "badparm" is a sort key definition that specifies the C or P attribute.
- This conflicts with the opposite /P or /C switch, thus implying that
- the sort key is both a C style string and a Pascal style string. This
- is not possible.
-
- ERROR 027: Sort key cannot be both a binary number and a
- string: "badparm"
-
- "badparm" is a sort key definition specifying one of the attributes
- (A, C or P) appropriate to a character string key and also one of the
- attributes (F, I, M, T, U) appropriate to a binary number key. This
- is not allowed.
-
- ERROR 028: Only one binary key type allowed in a sort key: "badparm"
-
- "badparm" is a sort key definition which includes more than one of
- the binary number attributes (F, I, M, T, U). A sort key can't be
- two different kinds of numbers.
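-
- The attribute conflicts behind errors 025 through 028 can be
- expressed as a few set tests. The Python sketch below is only an
- illustration: the function name is hypothetical, it assumes the
- attribute letters are given in upper case, and the order of the
- checks is an assumption, since this manual does not say which error
- RPSORT reports when several apply at once.
-
```python
STRING_ATTRS = set("ACP")    # attributes for character string keys
BINARY_ATTRS = set("FIMTU")  # attributes for binary number keys


def check_key_attributes(attrs, switch=None):
    """Return the error number for a conflicting attribute set, or None.

    attrs  -- attribute letters from one sort key definition, e.g. "RC"
    switch -- "/C", "/P" or None (the string-type switch, if one was given)
    """
    letters = set(attrs)
    if {"P", "C"} <= letters:
        return 25  # both Pascal and C string attributes in one key
    if ("C" in letters and switch == "/P") or ("P" in letters and switch == "/C"):
        return 26  # key attribute contradicts the /P or /C switch
    if letters & STRING_ATTRS and letters & BINARY_ATTRS:
        return 27  # string attribute mixed with a binary number attribute
    if len(letters & BINARY_ATTRS) > 1:
        return 28  # more than one binary number type
    return None
```
-
- With this sketch, check_key_attributes("AF") returns 27 and
- check_key_attributes("IU") returns 28, while "RA" passes because R
- only reverses the sort order.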
-
- ERROR 029: Only one list of input files and a single output file may be
- given. Found additional file spec: "badparm"
-
- "badparm" is the third filespec or list of filespecs listed in the
- command line. The first filespec or list of filespecs separated by
- plus signs is taken to be the input. Then there should be a single
- filespec for the output. For example:
-
- RPSORT INPUT1.DAT+INPUT2.DAT OUTPUT.DAT
-
- No additional filespec is allowed.
-
- ERROR 030: Multiple files not allowed in output spec: "badparm"
-
- The first filespec or list of filespecs separated by plus signs is
- taken to be the input. The output filespec that follows it on the
- command line must be a single file.
-
- ERROR 031: Misplaced plus sign in input file list: "badparm"
-
- The list of files you specify for the input must be separated by plus
- signs. There must be no spaces around the plus signs and there must
- be no plus sign before the first filespec or after the last filespec
- in the list.
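-
- The filespec rules behind errors 029 through 031 can be sketched in
- Python as follows. This is an illustration under stated assumptions
- and not RPSORT's parser: the function name is hypothetical, and the
- sketch assumes the switches and sort key definitions have already
- been removed so that only filespec tokens remain. It also checks for
- a misplaced plus sign only in the first token; how RPSORT itself
- classifies a plus sign that is set off by spaces is not covered here.
-
```python
def classify_filespecs(filespecs):
    """Split command-line filespecs into (inputs, output, error).

    filespecs -- filespec tokens in command-line order
    error     -- None, or a tuple (error_number, offending_token)
    """
    inputs, output = [], None
    for position, spec in enumerate(filespecs):
        if position == 0:
            names = spec.split("+")
            if "" in names:  # leading, trailing or doubled plus sign
                return inputs, output, (31, spec)
            inputs = names
        elif position == 1:
            if "+" in spec:  # the output must be a single file
                return inputs, output, (30, spec)
            output = spec
        else:  # a third filespec or list of filespecs
            return inputs, output, (29, spec)
    return inputs, output, None
```
-
- For the command line shown under ERROR 029 this yields the inputs
- INPUT1.DAT and INPUT2.DAT, the output OUTPUT.DAT and no error.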
-
- ERROR 032: Input is redirected from the standard input, output must go
- to the standard output. The following file spec is illegal: "badparm"
-
- You redirected the standard input from a file and also named a file
- on the command line. When the input is redirected, the output must
- go to the standard output, which is the screen by default or can
- itself be redirected to a file.
-
- ERROR 033: You must specify an input file.
-
- You did not specify an input file either explicitly or by redirecting
- the standard input to a file. RPSORT insists that its input come
- from a file specified in one of these two ways.
-
- ERROR 034: No name specified for error file: "badparm"
-
- "badparm" is a /E switch that did not include a file name. The /E
- switch must include a file name. For example: /ESORTERRS.TXT
-
-
- DOS Version Before 2.0 Message
-
- ERROR 037: RPSORT requires MS DOS version 2.00 or later.
-
- RPSORT uses MS-DOS functions that were added in version 2.0 and
- therefore cannot run with an earlier version.
-
-
- Insufficient Memory Messages
-
- ERROR 040: Not enough memory. RPSORT requires 30,000 bytes.
-
- RPSORT can run in very small amounts of memory but there is a limit.
-
- ERROR 041: Not enough memory to hold at least two records/lines at a
- time.
-
- If the records or lines in your file are large, you may need more
- available memory than the basic 30K RPSORT usually requires. You
- need room to hold at least two lines or records at a time plus you
- need memory to hold the RPSORT program and a few tables and other
- odds and ends. In the extreme case where your lines or records are
- 32000 bytes each, you might need some 90K of memory to run RPSORT.
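-
- The "some 90K" figure can be reproduced with a rough estimate: the
- basic 30,000 bytes plus room for two records or lines. The Python
- sketch below is only that rough estimate, with hypothetical names;
- it is not the calculation RPSORT itself performs.
-
```python
BASE_BYTES = 30_000  # minimum RPSORT requires, per ERROR 040


def rough_memory_needed(max_record_bytes):
    """Rough estimate: program and tables plus two records or lines."""
    return BASE_BYTES + 2 * max_record_bytes
```
-
- For 32000 byte records this gives 30,000 + 2 x 32,000 = 94,000 bytes.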
-
- Line Or String Too Long Messages
-
- ERROR 043: Line exceeds max length of 32750 bytes.
-
- RPSORT found a line in the file that exceeded the maximum allowed
- length of 32750 bytes.
-
- ERROR 044: Found Pascal string whose length byte exceeds specified key
- length.
-
- The binary number in the first byte of a Pascal string (the length
- byte) must be less than the length attribute specified in the sort
- key definition. Otherwise, the string would extend beyond the end
- of the key.
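-
- The check behind ERROR 044 can be sketched in Python as follows.
- This is an illustration only; the function name is hypothetical and
- the key is assumed to be passed in as the raw bytes of the key field.
-
```python
def pascal_key_text(key_bytes):
    """Return the text of a Pascal string key, or raise for ERROR 044.

    key_bytes -- the bytes of the key field; its length is the key
                 length given in the sort key definition
    """
    key_length = len(key_bytes)
    length_byte = key_bytes[0]  # first byte holds the string length
    if length_byte >= key_length:
        raise ValueError("ERROR 044: length byte exceeds specified key length")
    return key_bytes[1:1 + length_byte]
```
-
- A string[5] field has a key length of 6, so its length byte can be
- at most 5. A length byte of 6 or more would point past the end of
- the key.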
-
-
- Input/Output Error Messages
-
- ERROR 046: No data in input file(s) so nothing to do.
-
- There was no data in the input file(s). Either the size of the input
- file(s) was zero or the first byte of each input file was a Ctrl-Z
- thus terminating the input file(s) at the very beginning.
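-
- The two conditions that lead to ERROR 046 can be written as a small
- Python sketch. The function names are hypothetical and each file is
- assumed to be given as its raw bytes.
-
```python
CTRL_Z = 0x1A  # the Ctrl-Z byte that ends a DOS text file


def has_no_data(contents):
    """True when a file's bytes are empty or begin with Ctrl-Z."""
    return len(contents) == 0 or contents[0] == CTRL_Z


def nothing_to_sort(files):
    """True when every input file has no data in the ERROR 046 sense."""
    return all(has_no_data(contents) for contents in files)
```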
-
- ERROR 047: Input file not found: filename
-
- The input file named by "filename" was not found. Check the spelling
- of the name or add a path if appropriate.
-
- ERROR 048: Error reading input file.
-
- Normally you would not see this error message from RPSORT. Usually
- if there is an uncorrectable error while reading a disk, DOS will
- tell you and then prompt you to specify:
-
- Abort, Retry, Ignore, Fail
-
- Typically you would try R for retry a few times to see if it can get
- past the error. If not, you would probably enter A for abort and
- RPSORT would never know what happened since DOS would terminate it.
-
- If, however, you entered F for fail then DOS would return this info
- to RPSORT and RPSORT would display the ERROR 048 message.
-
- If you were to enter I for ignore then DOS would return to RPSORT
- with no indication that the read had failed. RPSORT would assume
- that data from the file had actually been read into memory. This
- data would be garbage but RPSORT would happily sort the garbage and
- produce a meaningless output file.
-
- ERROR 049: No room on disk to write sorted output file.
-
- The drive you assigned for the output file does not have enough
- space to hold it.
-
- ERROR 050: No room on disk to write temp file.
-
- The drive you assigned for RPSORT temporary files has insufficient
- space to hold them.
-
- ERROR 051: Unable to create temp file.
-
- There was an error attempting to create a temporary file. Probably,
- the disk you assigned to hold temporary files doesn't have enough
- directory entries available in the current directory. RPSORT may
- require up to three directory entries for temporary files.
-
- ERROR 052: Unable to create output file.
-
- There was an error attempting to create the output file. Probably,
- the drive and directory you assigned to the output file do not have
- any directory entries available.
-
- ERROR 053: Unable to create error file: "badparm"
-
- There was an error attempting to create the error message file.
- Probably, the drive and directory you assigned to the error file
- do not have any directory entries available.
-
- ERROR 054: Ran out of space on disk attempting to write error file.
- Redirecting error messages to the screen.
-
- The drive you assigned for the error message file does not have
- enough space to hold it. RPSORT displays the current and any
- subsequent error messages on the screen before terminating.
-
-
- Never Should Happen Error Messages
-
- At a number of points in RPSORT, I check for errors resulting from the
- use of DOS functions under conditions where, in principle, no errors
- could occur. Any such error would imply a possible bug in RPSORT or,
- possibly, a bug in MS-DOS.
-
- If any of these error messages is displayed, please send me a precise
- description of the circumstances. This would include the amount of
- memory available in your system (the amount reported by CHKDSK or MEM,
- not the total amount), the size of the file in bytes, the count of
- records or lines, whether the file consists of fixed length records or
- lines, and the kind of sort key(s) you were using.
-
- ERROR 059: Error allocating memory.
-
- An error occurred using memory allocation functions in MS-DOS.
-
- ERROR 060: Unknown error accessing disk.
- through
- ERROR 074: Unknown error accessing disk.
-
- These messages all have the same text, but the error number would
- tell me where in the program the error occurred. All of these have
- to do with disk I/O.
-
-
-