- DELIMIT
-
- Version 1.0
-
- Copyright 1993 Jefferson P. Carey
- Delimit is a program that scans columnar reports (stored as ascii text
- files) and extracts relevant data, writing data to an ascii file in comma-
- delimited format. A full explanation will follow, but I believe a simple
- example is the best way to show you the capabilities of Delimit. In a
- nutshell, Delimit can take a report like this:
-
- -----------------------------------------------------------------------------
-
-                         XYZ Corporation                               Page: 1
-                         Sales Commission Report
-                         For The Month Beginning 01/01/1993
-
-
- Salesperson ID   Item Number   Qty Sold   Cost Each   Item Sales   Commission
- --------------   -----------   --------   ---------   ----------   ----------
-           8342     981239872          5       74.95       374.75        56.21
-                    987243873         23       14.95       343.85        51.58
-                    989123783          3      274.85       824.55       123.68
-                                                       ----------   ----------
-                                             Totals:      1543.15       231.47
-
-
- Salesperson ID   Item Number   Qty Sold   Cost Each   Item Sales   Commission
- --------------   -----------   --------   ---------   ----------   ----------
-           8573     981239872          4       74.95       299.80        44.97
-                    987243873         27       14.95       403.65        60.55
-                    989123783          6      274.85      1649.10       247.37
-                                                       ----------   ----------
-                                             Totals:      2352.55       352.89
-
-                         XYZ Corporation                               Page: 2
-                         Sales Commission Report
-                         For The Month Beginning 02/01/1993
-
-
- Salesperson ID   Item Number   Qty Sold   Cost Each   Item Sales   Commission
- --------------   -----------   --------   ---------   ----------   ----------
-           8342     981239872          6       74.95       449.70        67.46
-                    987243873         25       14.95       373.75        56.06
-                    989123783          4      274.85      1099.40       164.91
-                                                       ----------   ----------
-                                             Totals:      1922.85       288.43
-
-
- Salesperson ID   Item Number   Qty Sold   Cost Each   Item Sales   Commission
- --------------   -----------   --------   ---------   ----------   ----------
-           8573     981239872          3       74.95       224.85        33.73
-                    987243873         22       14.95       328.90        49.34
-                    989123783          5      274.85      1374.25       206.14
-                                                       ----------   ----------
-                                             Totals:      1928.00       289.21
-
-                                       Grand Totals:      7746.55      1162.00
-
- -----------------------------------------------------------------------------
-
- and create a file containing the data from the report in comma-delimited
- format, like this:
-
- "01/01/1993",8342,981239872,5,74.95,374.75,56.21
- "01/01/1993",8342,987243873,23,14.95,343.85,51.58
- "01/01/1993",8342,989123783,3,274.85,824.55,123.68
- "01/01/1993",8573,981239872,4,74.95,299.80,44.97
- "01/01/1993",8573,987243873,27,14.95,403.65,60.55
- "01/01/1993",8573,989123783,6,274.85,1649.10,247.37
- "02/01/1993",8342,981239872,6,74.95,449.70,67.46
- "02/01/1993",8342,987243873,25,14.95,373.75,56.06
- "02/01/1993",8342,989123783,4,274.85,1099.40,164.91
- "02/01/1993",8573,981239872,3,74.95,224.85,33.73
- "02/01/1993",8573,987243873,22,14.95,328.90,49.34
- "02/01/1993",8573,989123783,5,274.85,1374.25,206.14
-
- What's the point? In many organizations, out-of-date, unfriendly, and
- inflexible computer systems still prevail. Many of these systems are
- capable of outputting a variety of highly informative reports (such as the
- one shown above), but lack the capability to be easily customized to
- provide other types of data output. Essentially, the data exists in the
- system, but you can only see it presented in ways that the system
- designers intended (unchangeable reports). In many cases the data
- presented in such reports could be of even greater value if it could be
- extracted and analyzed using software (databases, spreadsheets, etc.) on
- a PC. Herein lies the value of Delimit.
-
- Any modern data analysis PC software that's worth a dime can import
- data from a comma-delimited ascii file (exactly the kind of file created
- by Delimit). Once the data is available to the PC software, the
- possibilities for analysis (and even new reports) are endless.
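-
- As a rough illustration of that import step, the sketch below reads a
- few lines of the sample output with Python's standard csv module and
- totals the item sales. Python is only an assumption for the example
- (any tool that reads comma-delimited text would do), and the column
- names are hypothetical labels, since Delimit writes no header line.

```python
import csv
import io

# First three lines of the sample output shown earlier in this document.
SAMPLE_OUTPUT = '''"01/01/1993",8342,981239872,5,74.95,374.75,56.21
"01/01/1993",8342,987243873,23,14.95,343.85,51.58
"01/01/1993",8342,989123783,3,274.85,824.55,123.68
'''

# Hypothetical column labels; they are not part of Delimit's output.
NAMES = ["period", "salesperson", "item", "qty", "cost", "sales", "commission"]


def load_rows(text):
    """Parse comma-delimited text into a list of dicts with typed values."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        row = dict(zip(NAMES, record))
        row["qty"] = int(row["qty"])
        for key in ("cost", "sales", "commission"):
            row[key] = float(row[key])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_OUTPUT)
total_sales = round(sum(row["sales"] for row in rows), 2)
print(total_sales)  # item sales for salesperson 8342 in January
```

- The total can be checked by hand against the first "Totals:" line of
- the sample report.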
-
- At this point, if you still don't understand the purpose of Delimit, this
- program probably isn't going to be of use to you. Do me a favor and
- pass it on to a friend who might be interested. You could be doing your
- friend a favor as well.
-
- On the other hand, if you are faced with the same situation I've just
- described, read on. The rest of this document will explain how to use
- Delimit, and includes examples for all of the features.
-
- Just one thing before we get started. This program is being released as
- shareware...with a twist. Individuals using it for personal use, and
- nonprofit organizations using it in their nonprofit ventures, are free to
- use Delimit without paying the registration fee, if they choose. Anyone
- else using Delimit (for-profit businesses), beyond a reasonable trial
- period (use your own judgement here), must pay a registration fee of
- $24.95 to continue to use the program. Registered users will receive a
- disk containing the latest version of Delimit, an upgrade notice when a
- newer version of the program is available, and a discount off the cost of
- registering the newer version. Please note that individuals and
- nonprofit organizations electing to use Delimit without paying the
- registration fee are only entitled to free use of the program, and not to
- these additional benefits of registered users.
-
- Yes, you could easily "cheat" and use Delimit without paying the
- registration fee. But, my sincere hope is that those who use it will
- appreciate its real value (time saved, information gained, money saved,
- etc.) and will realize that their $24.95 is a worthwhile investment.
-
- Delimit required a great deal of personal time and effort to develop. I
- appreciate all of you who support my work through your registration.
- Thank you!
-
-
-
- To register your copy of Delimit, print the file REGISTER.TXT, fill it
- out, enclose payment, and mail it to the address at the bottom of the
- form.
-
-
- Using Delimit
-
- To use Delimit you need to create a configuration file for the report you
- want to process. Don't worry, this configuration file is quite simple to
- make. Below is the configuration file that was used to process the report
- shown earlier in the documentation. I'll explain each line in this
- configuration file next.
-
- -------------------------------------------------------------------------------------
-
- [Settings]
- InputFile=sample.txt
- OutputFile=output.txt
- DiscardFile=discard.txt
- FilterDefault=Exclude
- BlankFieldFill=True
- IncludeOperator=And
- ExcludeOperator=And
- Trimming=True
- StringDelimiter="
- FieldDelimiter=,
-
- [Include]
- 18,11,Numeric
-
- [Exclude]
-
- [Fields]
- 1,14,Numeric
- 18,11,Numeric
- 32,8,Numeric
- 43,9,Numeric
- 55,10,Numeric
- 68,10,Numeric
-
- [Occasionals]
- 25,23,"For The Month Beginning",49,10,Alpha
-
- -------------------------------------------------------------------------------------
-
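- To make the layout of the file concrete before each part is explained,
- here is a minimal Python sketch that splits a configuration like the
- one above into its settings and its rule sections. It is not Delimit's
- own parser; the function name read_config and the returned structure
- are assumptions made for illustration.

```python
SAMPLE_CFG = '''[Settings]
InputFile=sample.txt
OutputFile=output.txt
FilterDefault=Exclude
StringDelimiter="
FieldDelimiter=,

[Include]
18,11,Numeric

[Exclude]

[Fields]
1,14,Numeric
18,11,Numeric
'''


def read_config(text):
    """Return (settings dict, {section name: list of raw rule lines})."""
    settings = {}
    rules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate sections
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            if current != "Settings":
                rules.setdefault(current, [])
        elif current == "Settings":
            # Split on the first '=' only, so a value such as " survives.
            key, _, value = line.partition("=")
            settings[key.strip()] = value.strip()
        elif current is not None:
            # Rule lines are kept raw; a quoted condition may contain commas.
            rules[current].append(line)
    return settings, rules


settings, rules = read_config(SAMPLE_CFG)
print(settings["FilterDefault"], rules["Fields"])
```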
-
- [Settings] section
- In this section you set the values of several parameters that affect the
- operation of Delimit. Each parameter, and its possible values, is
- explained below.
-
- InputFile
- This parameter specifies the name of the report that you want to
- process. You may specify a drive and path if the file is not in the
- current directory. This parameter is required.
-
- OutputFile
- This parameter specifies the name of the file where you want
- Delimit to send its output (the comma delimited data). You may
- specify a drive and path if the file will not be in the current
- directory. This parameter is required.
-
- DiscardFile
- Later in the configuration file you will be able to specify which
- lines Delimit should "throw out" when processing the report. In
- our example report we are only interested in lines with item sales
- figures, and all other lines should be discarded. If you include
- this parameter in your configuration file, the discarded lines will
- be written to the specified file. This feature is useful for checking
- that the proper lines were discarded when you are working on
- creating a correct configuration file. After running Delimit, you
- can look at the contents of the discard file and make sure no good
- lines were discarded. This parameter is optional.
-
- FilterDefault
- This parameter tells Delimit whether to keep or discard lines in
- the report by default. The valid values for this parameter are
- "Include" and "Exclude". In some cases it is easiest to specify which
- lines contain data. Delimit should then discard (exclude) every line
- unless it meets the criteria you have specified for keeping it
- (FilterDefault=Exclude). In other cases it is easiest to specify which
- lines to discard, and every other line is kept
- (FilterDefault=Include). For the example report, we are going to
- specify which lines to keep, so FilterDefault=Exclude.
-
- BlankFieldFill
- This parameter determines whether Delimit will fill a blank field
- with the most recent nonblank value of that field. The valid
- values for this parameter are "True" and "False". In the sample
- report, the salesperson id is shown on the first line for each
- salesperson, but on successive lines this field is blank. In this
- case, BlankFieldFill=True so that salesperson id's will be carried
- down to successive lines in the comma delimited file, until a new
- salesperson id is found. If BlankFieldFill=False, the first few
- lines of the comma delimited file would have looked like this:
-
- "01/01/1993",8342,981239872,5,74.95,374.75,56.21
- "01/01/1993",,987243873,23,14.95,343.85,51.58
- "01/01/1993",,989123783,3,274.85,824.55,123.68
- "01/01/1993",8573,981239872,4,74.95,299.80,44.97
- "01/01/1993",,987243873,27,14.95,403.65,60.55
-
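- The carry-down behavior can be pictured with a short Python sketch. It
- assumes each record is a list of field strings that have already been
- extracted; the function name fill_blank_fields is hypothetical, and how
- Delimit does this internally is not documented here.

```python
def fill_blank_fields(records):
    """Replace each blank field with the most recent nonblank value
    seen in the same field position."""
    last_seen = {}
    filled = []
    for record in records:
        out = []
        for position, value in enumerate(record):
            if value.strip():
                last_seen[position] = value  # remember for later records
                out.append(value)
            else:
                # Stay blank if no earlier value exists for this position.
                out.append(last_seen.get(position, value))
        filled.append(out)
    return filled


records = [
    ["8342", "981239872"],
    ["", "987243873"],
    ["8573", "981239872"],
    ["", "987243873"],
]
for record in fill_blank_fields(records):
    print(",".join(record))
```

-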
- IncludeOperator
- Later in this documentation I will explain how to specify
- conditions that lines must meet in order to be included in
- processing and output to the comma delimited file. At times it
- might be necessary to specify more than one condition that a line
- must meet to be included. This parameter specifies whether those
- conditions should be combined with an AND or an OR. For
- example, you can specify that a line must meet condition x AND
- condition y, or you can specify that a line must meet condition x
- OR condition y. Valid values for this parameter are "And" and
- "Or".
-
- ExcludeOperator
- Later in this documentation I will explain how to specify
- conditions that lines must meet in order to be excluded from
- processing and output to the comma delimited file. At times it
- might be necessary to specify more than one condition that a line
- must meet to be excluded. This parameter specifies whether those
- conditions should be combined with an AND or an OR. For
- example, you can specify that a line must meet condition x AND
- condition y, or you can specify that a line must meet condition x
- OR condition y. Valid values for this parameter are "And" and
- "Or".
-
- Trimming
- This parameter determines if Delimit will trim spaces from the
- beginning and end of fields that are written to the comma
- delimited file. Valid values for this parameter are "True" and
- "False".
-
- StringDelimiter
- This parameter determines which ascii character will be used to
- delimit strings in the output file. The default is the double quote
- character ("). Either specify a single character (such as " or ') or
- specify the ascii value of a character in three digit decimal form
- (such as 047 or 179).
-
- FieldDelimiter
- This parameter determines which ascii character will be used to
- separate fields in the output file. The default is the comma
- character (,). Either specify a single character (such as , or |) or
- specify the ascii value of a character in three digit decimal form
- (such as 047 or 179).
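-
- The two ways of writing a delimiter can be sketched in Python as
- follows. The function name parse_delimiter is hypothetical, and the
- sketch assumes the IBM PC character set (code page 437) for values
- above 127, which the documentation above does not state.

```python
def parse_delimiter(value):
    """Turn a StringDelimiter/FieldDelimiter setting into one character.

    Accepts a single literal character or a three digit decimal code.
    """
    if len(value) == 1:
        return value
    if len(value) == 3 and value.isascii() and value.isdigit():
        code = int(value)
        if code > 255:
            raise ValueError(f"character code out of range: {value}")
        # Assumption: codes are interpreted in DOS code page 437.
        return bytes([code]).decode("cp437")
    raise ValueError(f"not a character or three digit code: {value!r}")


print(parse_delimiter(","))    # a literal character is used as-is
print(parse_delimiter("124"))  # decimal 124 is the vertical bar
```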
-
-
- [Include] section
- In this section, you specify the conditions that each line in the report
- must meet in order to be included in the comma delimited file. The
- format of the lines in this section is "column number, number of
- characters, condition". The column number and number of characters
- specify the characters that must meet the condition. The condition can
- be a string, a set of characters, or one of the words "Alpha", "Numeric",
- "Blank", or "NonBlank". In the example report the line "18,11,Numeric"
- specified that the 11 characters, starting in column 18, must be a
- number for the line to be included. Here are some more examples:
-
- The character in column 11 must be 'A', 'B', or 'C':
- 11,1,{ABC}
-
- The 3 characters starting in column 11 must be "Abc":
- 11,3,"Abc"
-
- The first 5 characters on the line must be blank:
- 1,5,Blank
-
- At least one of the first 5 characters must not be blank:
- 1,5,NonBlank
-
- None of the 10 characters starting in column 35 can be a digit:
- 35,10,Alpha
-
- The 10 characters starting in column 35 must be a number:
- 35,10,Numeric
- Note: " 123 " is a number, while "123 456" is not a valid
- number.
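-
- The condition types above can be approximated in Python as shown
- below. This is only a sketch of the rules as described: the function
- names are hypothetical, the exact number syntax Delimit accepts (signs,
- decimal points) is assumed, and a set such as {ABC} applied to more
- than one character is assumed to require every character to be in the
- set.

```python
import re

# Assumed number syntax: optional sign, digits, optional decimal point,
# with spaces allowed only before and after the number.
NUMBER = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)\s*")


def span(line, column, length):
    """Return `length` characters from 1-based `column`, space padded."""
    start = column - 1
    return line[start:start + length].ljust(length)


def matches(line, column, length, condition):
    """Test one 'column, characters, condition' rule against a line."""
    text = span(line, column, length)
    if condition == "Blank":
        return text.strip() == ""
    if condition == "NonBlank":
        return text.strip() != ""
    if condition == "Alpha":
        return not any(ch.isdigit() for ch in text)
    if condition == "Numeric":
        return NUMBER.fullmatch(text) is not None
    if condition.startswith("{") and condition.endswith("}"):
        allowed = condition[1:-1]
        return all(ch in allowed for ch in text)
    if condition.startswith('"') and condition.endswith('"'):
        return text == condition[1:-1]
    raise ValueError(f"unknown condition: {condition!r}")


print(matches(" 123 ", 1, 5, "Numeric"))    # a number padded with spaces
print(matches("123 456", 1, 7, "Numeric"))  # two numbers, not one
```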
-
- If you put more than one line in this section, the conditions you specify
- on each line will be combined with one of the logical operators AND or
- OR, as determined by the value of the IncludeOperator parameter. For
- example:
-
- The first character must be an 'A' AND the next 10 characters
- must be a number:
- [Settings]
- IncludeOperator=And
- [Include]
- 1,1,{A}
- 2,10,Numeric
-
- The first character must be an 'A' OR the next 10 characters must
- be a number:
- [Settings]
- IncludeOperator=Or
- [Include]
- 1,1,{A}
- 2,10,Numeric
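-
- A minimal Python sketch of how the per-line results might be combined
- is shown below. The function name combine is hypothetical. Treating an
- empty section as "no match" is an assumption, made because the sample
- configuration has an empty [Exclude] section and still keeps its data
- lines.

```python
def combine(results, operator):
    """Combine per-condition results with 'And' or 'Or'."""
    results = list(results)
    if not results:
        return False  # assumed: an empty section never matches
    if operator == "And":
        return all(results)
    if operator == "Or":
        return any(results)
    raise ValueError(f"operator must be 'And' or 'Or': {operator!r}")


def check(line):
    """Evaluate the two example conditions: 1,1,{A} and 2,10,Numeric."""
    first_is_a = line[0:1] == "A"
    next_ten_numeric = line[1:11].strip().isdigit()
    return [first_is_a, next_ten_numeric]


line = "B1234567890"
print(combine(check(line), "And"))  # fails: first character is not 'A'
print(combine(check(line), "Or"))   # passes: the number condition holds
```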
-
-
- [Exclude] section
- The exclude section is identical to the include section, except that it is
- used to specify the lines that should be excluded rather than included.
- Complex conditions can be specified using a combination of
- FilterDefault, IncludeOperator, ExcludeOperator, [Include], and
- [Exclude] settings. Some examples follow.
-
- Include all lines that have a number in the first 10 characters
- AND have a '/' in columns 50 and 53 (a good way to search for
- dates of the form MM/DD/YY) but do not have the word "Deleted"
- beginning in column 17:
- [Settings]
- FilterDefault=Exclude
- IncludeOperator=And
- [Include]
- 1,10,Numeric
- 50,1,{/}
- 53,1,{/}
- [Exclude]
- 17,7,"Deleted"
-
- Exclude any lines in which the first 5 characters are blank OR
- contain the word "Total", unless there is a number in the 6
- characters beginning in column 30:
- [Settings]
- FilterDefault=Include
- ExcludeOperator=Or
- [Exclude]
- 1,5,Blank
- 1,5,"Total"
- [Include]
- 30,6,Numeric
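-
- One reading of the two examples above, written as a Python sketch: with
- FilterDefault=Exclude a line is kept only if it matches [Include] and
- does not match [Exclude]; with FilterDefault=Include a line is dropped
- only if it matches [Exclude] and does not match [Include]. This
- precedence is inferred from the examples rather than stated by the
- documentation, and the function name keep_line is hypothetical.

```python
def keep_line(filter_default, include_hit, exclude_hit):
    """Decide whether a line is kept.

    include_hit / exclude_hit are the combined results of the
    [Include] and [Exclude] sections for this line.
    """
    if filter_default == "Exclude":
        # Discard by default; keep included lines that are not excluded.
        return include_hit and not exclude_hit
    if filter_default == "Include":
        # Keep by default; an include match rescues an excluded line.
        return include_hit or not exclude_hit
    raise ValueError(f"FilterDefault must be Include or Exclude: {filter_default!r}")


# First example: numeric line with a date, but marked "Deleted".
print(keep_line("Exclude", include_hit=True, exclude_hit=True))
# Second example: a "Total" line that has a number in columns 30-35.
print(keep_line("Include", include_hit=True, exclude_hit=True))
```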
-
-
- [Fields] section
- In the Fields section, you specify which columns from the included lines
- contain the data that you want sent to the output file. Each line in the
- Fields section is of the form "column number, number of characters, field
- type". The column number and number of characters specify which
- characters to extract from the line in the report. The field type is one
- of the words "Alpha" or "Numeric". If the field is Alpha, it will be
- enclosed in quotes in the output file. Numeric fields will not be enclosed
- in quotes. For example, in the sample output file line below, the first
- field was specified as Alpha and the second field was specified as
- Numeric.
-
- "John Doe", 25
-
- You may specify an unlimited number of fields. They will be sent to the
- output file in the order specified in the configuration file.
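-
- The extraction step can be sketched in Python as follows, using the
- six field definitions from the sample configuration. The function name
- format_record is hypothetical, and the sketch does nothing special if
- an Alpha field itself contains the string delimiter, since that case is
- not described above.

```python
SAMPLE_FIELDS = [
    (1, 14, "Numeric"),
    (18, 11, "Numeric"),
    (32, 8, "Numeric"),
    (43, 9, "Numeric"),
    (55, 10, "Numeric"),
    (68, 10, "Numeric"),
]


def format_record(line, fields, field_delim=",", string_delim='"', trimming=True):
    """Cut the listed fields out of a report line and join them."""
    out = []
    for column, length, field_type in fields:
        start = column - 1  # columns are numbered from 1
        text = line[start:start + length]
        if trimming:
            text = text.strip()
        if field_type == "Alpha":
            text = string_delim + text + string_delim
        out.append(text)
    return field_delim.join(out)


# A data line laid out on the sample report's column positions.
line = (f"{'8342':>14}   {'981239872':>11}   {'5':>8}   "
        f"{'74.95':>9}   {'374.75':>10}   {'56.21':>10}")
print(format_record(line, SAMPLE_FIELDS))
```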
-
-
- [Occasionals] section
- An occasional is a combination of both an Include and a Field
- specification. In many reports, data is listed only occasionally at the
- beginning of a section or at the top of a page. In the sample report
- shown earlier, the reporting period is only shown at the top of each page
- on the line that contains the text "For The Month Beginning". To
- include this data at the beginning of each line in the output file, an
- occasional was specified in the configuration file. The format of the lines
- in the Occasionals section is "column number, number of characters,
- condition, field column, field characters, field type". The first three
- parameters specify the condition that occasional lines meet, and the last
- three parameters specify the position on the line and type of the data to
- be written to the output file. For example:
-
- Lines containing the text "For The Month Beginning" starting in
- column 25 also contain a 10 character Alpha field starting in
- column 49. This is specified as:
- 25,23,"For The Month Beginning",49,10,Alpha
-
- The fields specified in the Occasionals section will not be output as the
- occasional lines are encountered, but instead, at the beginning of each
- and every line that is included from the report. Take a look at the
- sample output file to see how this works. You may have more than one
- occasional, with each one being output at the beginning of each included
- line.
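-
- The way an occasional value is remembered and attached to later lines
- can be sketched in Python as shown below. The names attach_occasional
- and is_included are hypothetical; is_included stands in for whatever
- Include/Exclude test selects the data lines, and the report lines here
- are simplified placeholders rather than real report rows.

```python
def attach_occasional(lines, occasional, is_included):
    """Yield (occasional value, line) for every included line."""
    column, length, marker, field_column, field_length = occasional
    current = ""  # nothing to attach until the first occasional line
    for line in lines:
        if line[column - 1:column - 1 + length] == marker:
            start = field_column - 1
            current = line[start:start + field_length].strip()
        if is_included(line):
            yield current, line


report = [
    " " * 24 + "For The Month Beginning 01/01/1993",
    "row a",
    " " * 24 + "For The Month Beginning 02/01/1993",
    "row b",
]
rule = (25, 23, "For The Month Beginning", 49, 10)
pairs = list(attach_occasional(report, rule, lambda l: l.startswith("row")))
for value, line in pairs:
    print(f'"{value}",{line}')
```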
-
-
- Running Delimit
- Once you have created a configuration file, just type DELIMIT followed
- by the name of the configuration file, then press the Enter key. A
- sample report and configuration file have been included with Delimit.
- The sample report is named SAMPLE.TXT and the configuration file for
- processing this report is named SAMPLE.CFG. To run Delimit on this
- report, just type DELIMIT SAMPLE.CFG and press the Enter key. The
- results will be written to the files OUTPUT.TXT and DISCARD.TXT.
-
-
- Contacting the Author
- I can be reached on CompuServe. My ID is 70413,1360.
-
- You may also contact me in writing at:
- Jeff Carey
- 3735 Eastmont Avenue
- Bloomington, IN 47403
-
- I'd greatly appreciate any suggestions for improvement, constructive
- criticisms, or even compliments!
-