-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- QTAwk
- Utility Creation Tool
-
-
- For PC/MS-DOS
- Version 4.20 10-10-90
-
-
-
-
-
-
-
-
- Saturday, November 10, 1990
-
-
-
-
-
- (c) Copyright 1989, 1990 Pearl Boldt
-
- Darnestown, MD 20878
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- QTAwk License
- Utility Creation Program
- Version 4.20 10-10-90
- (c) Copyright 1988 - 1990 Pearl Boldt. All Rights Reserved.
-
- Pearl Boldt
- Quik Trim
- 13012 Birdale Lane
- Darnestown, MD 20878
- CompuServe ID: 72040.434
-
- Registration Information
-
- QTAwk is a copyrighted program protected by both U.S. and
- international copyright law. If you obtained QTAwk from a
- shareware disk vendor, an on-line computer service or bulletin
- board, a friend or colleague, or another similar source, you have
- an unregistered (trial) copy. You may use this copy without
- charge for a limited period of time under the terms of the QTAwk
- license agreement (below). After this time is up, you must
- register and pay for QTAwk to continue using it.
-
- This method of distribution is known as shareware. It allows you
- to determine whether QTAwk meets your needs before you pay for
- it.
-
- The registration fee for a single copy of QTAwk is $50. Payment
- of this fee entitles you to:
-
- * A disk with the latest version of QTAwk, registered to you.
-
- * One copy of the printed QTAwk manual.
-
- * An upgrade to the next release of QTAwk.
-
- * Technical support via electronic mail or telephone.
-
- If you prefer, you may register for $35 and receive only the
- disk and notices of future upgrades. Network, site, and corporate
- licenses are also available; contact the copyright holder for
- more information.
-
- Upgrade Information
-
- If you purchased QTAwk version 4.02 or later at the $50 rate, or
- a site license for version 4.02 or later, you are entitled to a
- free upgrade to version 4.20. If you are not entitled to a free
- upgrade, or you wish to order a version 4.20 manual, use the order
- form following the License Agreement.
-
- QTAwk License Agreement
-
- 1. Copyright: The QTAwk program and all other programs and
- documentation distributed or shipped with it are Copyright
- 1988 - 1990 Pearl Boldt and are protected by U.S. and
- International Copyright law. In the rest of this document,
- this collection of programs is referred to simply as "QTAwk".
- You are granted a license to use your copy of QTAwk only
- under the terms and conditions specified in this license
- agreement.
-
- 2. Definitions: QTAwk is distributed in two forms. A
- "registered" copy of QTAwk is a copy distributed on diskette,
- purchased from the copyright holder. A "shareware" copy of
- QTAwk is a copy distributed on diskette or via an electronic
- bulletin board, on-line service, or other electronic means,
- obtained from a shareware disk vendor, or obtained from
- another individual.
-
- 3. Shareware Copies: Shareware copies of QTAwk are distributed
- to allow you to try the program before you pay for it. They
- are Copyright 1988 - 1990, Pearl Boldt and do not constitute
- "free" or "public domain" software. You may use a shareware
- copy of QTAwk at no charge for a trial period of up to 21
- days. If you wish to continue using QTAwk after that period,
- you must purchase a registered copy. If you choose not to
- purchase a registered copy, you must stop using QTAwk, though
- you may keep copies and pass them along to others. You may
- give QTAwk to others for noncommercial use IF:
-
- => All Files And Documentation Accompany The Programs.
- => The Files Are Not Modified In Any Way.
-
- 4. Registered Copies: Registered copies of QTAwk are
- distributed to those who have purchased them from the
- copyright holder.
-
- 5. Use of One Copy on Two Computers: If you have a registered
- copy of QTAwk which is licensed for use on a single computer,
- you may install it on two computers used at two different
- locations (for example, at work and at home), provided there
- is no possibility that the two computers will be in use at
- the same time, and provided that you yourself have purchased
- QTAwk, or if QTAwk was purchased by your employer, that you
- have your employer's explicit permission to install QTAwk on
- two systems as described in this paragraph. The right to
- install one copy of QTAwk on two computers is limited to
- copies originally licensed for use on a single computer, and
- may not be used to expand the number of systems covered under
- a multi-system license.
-
- 6. Use of QTAwk on Networks or Multiple Systems: You may
- install your registered copy of QTAwk on a computer attached
- to a network, or remove it from one computer and install it
- on a different one, provided there is no possibility that
- your copy will be used by more users than it is licensed for.
- A "user" is defined as one keyboard which is connected to a
- computer on which QTAwk is installed or used, regardless of
- whether or not the user of the keyboard is aware of the
- installation or use of QTAwk in the system.
-
- 7. Making Copies: You may copy any version of QTAwk for normal
- backup purposes, and you may give copies of the shareware
- version to other individuals subject to paragraph (3) above.
- You may not give copies of the registered version to any
- other person for any purpose, without explicit written
- permission from the copyright holder.
-
- 8. Distribution Restrictions: You may NOT distribute QTAwk
- other than through individual copies of the shareware version
- passed to friends and associates for their individual,
- non-commercial use. Specifically, you may not place QTAwk or
- any part of the QTAwk package in any user group or commercial
- library, or distribute it with any other product or as an
- incentive to purchase any other product, without express
- written permission from the copyright holder and you may not
- distribute for a fee, or in any way sell copies of QTAwk or
- any part of the QTAwk package. If you are a shareware disk
- vendor approved by the Association of Shareware Professionals
- (ASP), you may place QTAwk in your library without prior
- written permission, provided you notify the copyright holder
- within 15 days of doing so and provided your application has
- been fully approved in writing by the ASP, and is not simply
- submitted or awaiting review.
-
- 9. Use of QTAwk: QTAwk is a powerful program. While we have
- attempted to build in reasonable safeguards, if you do not
- use QTAwk properly you may destroy files or cause other
- damage to your computer software and data. You assume full
- responsibility for the selection and use of QTAwk to achieve
- your intended results. As stated below, the warranty on QTAwk
- is limited to replacement of a defective program diskette or
- manual.
-
- 10. LIMITED WARRANTY: All warranties as to this software,
- whether express or implied, are disclaimed, including without
- limitation any implied warranties of merchantability, fitness
- for a particular purpose, functionality or data integrity or
- protection.
-
- 11. Satisfaction Guarantee: If you are dissatisfied with a
- registered copy of QTAwk for any reason (whether or not you
- find a software error or defect), you may return the entire
- package at any time up to 90 days after purchase for a full
- refund of your original registration fee.
-
- Questions may be sent to:
-
- Pearl Boldt
- Quik Trim
- 13012 Birdale Lane
- Darnestown, MD 20878
- CompuServe ID: 72040.434
-
- QTAwk 4.20 Order Form
- Utility Creation Program
- Version 4.20 10-10-90
- (c) Copyright 1988 - 1990 Pearl Boldt. All Rights Reserved.
-
- Return to:
- Pearl Boldt
- Quik Trim
- 13012 Birdale Lane
- Darnestown, MD 20878
-
- Make all Checks Payable to: Pearl Boldt
-
- Name:
- Company:
- Address:
-
- Phone:
-
- Register QTAwk to: Company (___) or Individual (___)
- Send information on: Site Licenses (___), Reseller Pricing (___)
-
- I have read and agree to abide by the QTAwk license agreement,
-
- Signature:
-
- Where did you hear about QTAwk?
-
-
- Quantity Price
-
- Disk, manual, next update ($50/copy): ________ $ ________.____
- Disk only, no update ($35/copy): ________ $ ________.____
-
- Disk size: ___ 5.25" acceptable ___ 3.5" required
-
- Subtotal $ ________.____
-
- Shipping charges, per copy: $ ________.____
-
- Disk, manual, next update: │ Disk only:
- US standard - included │ US standard - included
- US 2-day - $8.00 (US) │ US 2-day - $8.00 (US)
- Canada (air) - $5.00 (US) │ Canada (air) - $3.00 (US)
- All Others (air) - $10.00 (US) │ All Others (air) - $5.00 (US)
-
- Total enclosed: $ ________.____
-
- ===> Please read the following before ordering! <===
-
- Order Information
-
- International Orders:
-
- Orders from outside the U.S. must be paid by a check or money
- order in U.S. funds and drawn on a U.S. bank; or by an
- international postal money order in U.S. dollars. Checks which
- are not in U.S. funds and drawn on a U.S. bank will be returned
- due to extremely high charges imposed by U.S. banks to collect
- the funds. Purchase orders (minimum $200) can be accepted from
- outside the U.S., but you must contact us before ordering.
-
- Company Purchase Orders:
-
- Purchase orders for amounts of $100 and over are accepted from
- established U.S. companies; orders under $100 are accepted but
- must be prepaid. Have your purchasing agent contact Pearl Boldt
- for terms. Credit references will be required for new customers.
-
- Multi-System Licenses:
-
- Multi-system licensing arrangements are available for network,
- site, and corporate use of QTAwk. Check the line on the order
- form or contact us for more information. A sample schedule of
- license fees is below; contact us for pricing on the exact number
- of systems you wish to license. The fee includes a master
- diskette and one manual; additional manuals are $10 each (less
- for over 100 copies).
-
-
- Systems Price Systems Price Systems Price
- 2 85.00 15 425.00 50 1,150.00
- 3 120.00 20 550.00 60 1,160.00
- 4 155.00 25 675.00 70 1,320.00
- 5 190.00 30 750.00 80 1,480.00
- 10 330.00 40 950.00 100 1,800.00
-
- Return to:
- Pearl Boldt
- Quik Trim
- 13012 Birdale Lane
- Darnestown, MD 20878
-
- Make all Checks Payable to: Pearl Boldt
-
- QTAwk Update History
-
- ==> QTAwk Version 4.02. This version contains two additions
- from the previous versions:
-
- 1. The command line argument, double hyphen, "--", stops
- further scanning of the command line for options. The double
- hyphen argument is not passed to the QTAwk utility in the
- ARGV array or counted in the ARGC variable. Since QTAwk only
- recognizes two command options, this has been included to be
- compatible with the latest Unix(tm) conventions.
-
- 2. The built-in array ENVIRON has been added. This array
- contains the environment strings passed to QTAwk. Changing a
- string in ENVIRON will have no effect on the environment
- strings passed in the QTAwk "system" built-in function.
- Environment strings are set with the PC/MS-DOS "SET" command.
- The strings are of the form:
-
- name = string
-
- where the blanks on either side of the equal sign, '=', are
- optional and depend on the particular form used in the "SET"
- command. The QTAwk utility may scan the elements of ENVIRON
- for a particular name or string as desired.
-
-
- ==> QTAwk Version 4.10. This version contains one addition from
- the previous versions:
-
- 1. In previous versions, the GROUP pattern keyword could accept
- patterns consisting only of a regular expression constant.
- For version 4.10, the GROUP pattern keyword has been expanded
- to accept regular expression constants, string constants and
- variables. The variables are evaluated at the time the GROUP
- patterns are first utilized to scan an input record. The value
- is converted to string form and interpreted as a regular
- expression.
-
- GROUP /regular expression constant/ { ... }
- GROUP "string constant" { ... }
- GROUP Variable_name { ... }
-
- GROUP patterns are still converted into an internal form for
- regular expressions only once, when the pattern is first used
- to scan an input line. Any variables in a GROUP pattern will
- be evaluated, converted to string form and interpreted as a
- regular expression.
-
-
- ==> QTAwk Version 4.20, dated 10/11/90. This version contains
- three additions from the previous versions:
-
- 1. The behavior of the RS pre-defined variable has been
- changed. It is now similar to the behavior of the FS
- variable. If RS is assigned a value, which when converted to
- a string value, is a single character in length, then that
- character becomes the record separator. If the string is
- longer in length than a single character, then it is treated
- as a regular expression. The string matching the regular
- expression is treated as a record separator. As for FS, the
- string value is converted to the internal regular expression
- form when the assignment is made.
-
- 2. Two new functions have been added:
- getc() --> reads a single character from the current input
- file. The character is returned by the function.
- fgetc(file) --> reads a single character from the file
- 'file'. The character is returned by the function.
-
- These functions allow the user to naturally obtain single
- characters from any file including the standard input file
- (which would be the keyboard if not redirected or piped).
-
- 3. Error messages now have a numerical value displayed in
- addition to the short error message. The error messages are
- listed in numerical order in the QTAwk documentation with a
- short explanation of the error. In some cases, an attempt has
- been made to provide guidance as to what may have caused the
- error and possible remedies. Since the error messages are
- generated at fixed points within QTAwk and may be caused by
- different reasons in different utilities during
- interpretation or during execution on data files, it is not
- possible to list every possible reason for the display of the
- error messages. The line number within the QTAwk utility on
- which the error was discovered and the input data file record
- number are provided in the error message to provide some help
- to the user in attempting to ascertain the real reason for
- the error.
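-
- The RS rule in item 1 can be sketched outside QTAwk. The Python
- fragment below is an illustration only, not QTAwk's implementation.
- The function name split_records and the dropping of an empty record
- after a trailing separator are assumptions made for the example.
-
```python
import re


def split_records(text, rs="\n"):
    """Split text into records following the RS rule described above."""
    if len(rs) == 1:
        # A one-character RS is used literally as the record separator.
        parts = text.split(rs)
    else:
        # A longer RS is treated as a regular expression; whatever text
        # matches it separates two records.
        parts = re.split(rs, text)
    # Assumption: a separator at the very end does not start a new record.
    if parts and parts[-1] == "":
        parts.pop()
    return parts


print(split_records("a.b.c", "."))              # ['a', 'b', 'c']
print(split_records("a\n\nb\n\n\nc", "\n\n+"))  # ['a', 'b', 'c']
```
-
- Note that the single character "." is taken literally in the first
- call, while the longer string in the second call is interpreted as
- a pattern matching one or more blank lines.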
-
- Introduction
-
- QTAwk is called a Utility Creation Tool and not a programming
- language because it is intended for the average computer user as
- well as the more experienced user and programmer. QTAwk has been
- designed to make it easy for the average user to create those
- small, or maybe not so small, utilities needed to accomplish
- small, or not so small, everyday jobs. These are the jobs that are
- too small to justify the time and cost of using a traditional
- computer programming language, and perhaps of hiring a
- professional programmer.
-
- This paper presents a description of the QTAwk utility creation
- tool and its use. Most computer users have many small tasks to
- accomplish that are usually left undone for lack of the proper
- tool. Typically these tasks require finding one or more records
- within a file and executing some action depending on the record
- located.
-
- In order to accomplish these tasks the user needs a tool which
- will allow the following to be accomplished easily:
-
- 1. reading files record by record,
-
- 2. splitting (parsing) the records read into words or fields,
-
- 3. determining if a record, or records, satisfy a
- pre-determined match criteria, i.e. finding the "proper"
- record(s),
-
- 4. when the proper records are found, executing some action or
- actions on the records or fields of the records.
-
- QTAwk supplies the user with all of these features in an easy to
- use manner. Specifying the name of a file is all the user need do
- to open the file and read it record by record. The user may
- easily change what a "record" is or let it default to an ASCII
- text line as used by all text editors and which can be written by
- all word processors. QTAwk will automatically split (parse)
- records into fields. Initially a field is a word or a sequence of
- non-blank characters. The user may change the definition of a
- field easily to adapt to the needs of a particular situation.
-
- Arithmetic expressions, logical expressions or regular
- expressions may be used to define the criteria for selecting
- records for action. Regular expressions are a powerful means of
- describing the criteria for selecting, i.e., matching, the text of
- records. Arithmetic expressions utilize the ordinary arithmetic
- operators (addition, subtraction, multiplication, etc.) for
- describing the criteria for selecting records and logical
- expressions utilize the logical operators (less than, equal to,
- greater than, etc.) for selecting records.
-
- Of all the operators available in QTAwk, the regular expression
- operators may be the only ones most readers are not familiar with.
- Regular expressions are a powerful and useful tool for working
- with text. Yet for all their power, they are surprisingly simple
- and easy to use when learned. Chapter Two explains regular
- expressions fully, in a manner that will make them usable by a
- person totally unfamiliar with them.
-
- QTAwk is patterned after The Awk Programming Language by Alfred
- V. Aho, Brian W. Kernighan and Peter J. Weinberger. The Awk
- program implementing The Awk Programming Language is available on
- most Unix (tm) systems. Aho, Kernighan and Weinberger invented
- the automatic input loop and the pattern-action pairs used in
- QTAwk and are to be heartily congratulated for this. Without Awk,
- QTAwk would not exist. QTAwk is an extensive expansion of The Awk
- Programming Language in many important aspects. In addition, some
- of the admitted shortcomings of The Awk Programming Language
- have been corrected.
-
- A short summary of the major differences between QTAwk and Awk
- is given below. Appendix II contains a more detailed listing of
- the differences.
-
- 1. Expanded set of regular expression operators,
- 2. Use of "named expression"s in regular expressions to simplify
- construction of complicated regular expressions,
- 3. Expanded arithmetic operator set,
- 4. Expanded set of pre-defined patterns giving more control in
- the sequence of utility execution,
- 5. True multi-dimensional arrays,
- 6. Integration of the multi-dimensional arrays with the
- arithmetic operators allowing the assignment of and operation
- on entire arrays,
- 7. Integration of the multi-dimensional arrays with user-defined
- functions allowing the use of arrays in functions in a
- natural and intuitive manner,
- 8. Expanded set of keywords allowing local variables,
- 'switch'/'case' flow control, and premature closure of the
- current input file,
- 9. Expanded set of arithmetic and string built-in functions, a
- new array function, and new variable access functions,
- 10. Corrected input function syntax,
- 11. Added new Input/Output functions,
- 12. Expanded formatted I/O capability,
- 13. Expanded user-defined functions allowing variable number of
- arguments,
- 14. New user controlled utility execution trace capability,
- 15. Expanded list of built-in variables giving more control and
- access to QTAwk utility execution.
-
- Section 1.0 Tutorial
-
-
- 1.0 TUTORIAL
-
- 1.1 Data
-
- QTAwk is designed to be used to search data or text files using
- short user created utilities. The types of files that QTAwk is
- designed to work with are "text" files, commonly called ASCII
- files. The files contain user readable text and numbers. The text
- is contained in lines and the lines end with carriage-return,
- new-line character pairs or single new-line characters. Text
- files are written by application programs and word processors and
- text editors.
-
- The information in the files is grouped by fields on a single
- line or by lines separated by a blank line or some other
- "special" characters. For example, the following lines list
- information on various states:
-
- US # 10461 # 4375 # MD # Annapolis ( Maryland )
- US # 40763 # 5630 # VA # Richmond ( Virginia )
- US # 2045 # 620 # DE # Dover ( Delaware )
- US # 24236 # 1995 # WV # Charleston ( West Virginia )
- US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )
- US # 7787 # 7555 # NJ # Trenton ( New Jersey )
- US # 52737 # 17895 # NY # Albany ( New York )
- US # 9614 # 535 # VT # Montpelier ( Vermont )
- US # 9278 # 975 # NH # Concord ( New Hampshire )
- US # 33265 # 1165 # ME # Augusta ( Maine )
-
- Each line, or record in QTAwk, consists of at least 12 words (a
- two-word state name such as New York adds one more). The 12
- words of the first record are:
-
- 1: US
- 2: #
- 3: 10461
- 4: #
- 5: 4375
- 6: #
- 7: MD
- 8: #
- 9: Annapolis
- 10: (
- 11: Maryland
- 12: )
-
- The first word lists the country, the third word lists the state
- area in square miles, the fifth word lists the state population
- in thousands, the seventh word lists the state abbreviation, the
- ninth word lists the state capital, and the eleventh word lists
- the state name. The second, fourth, sixth, eighth, tenth and last
- words are word separators. The word separators are not necessary
- for QTAwk, but make each line easier for people to read. A copy
- of this entire file, states.dta, is given in Appendix IV.
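-
- The word numbering above can be checked with a short Python sketch.
- It is only an illustration of default whitespace splitting, not
- QTAwk code; the variable names are chosen for the example.
-
```python
record = "US # 10461 # 4375 # MD # Annapolis ( Maryland )"

# Default field splitting: a field is a run of non-blank characters.
fields = record.split()

for number, word in enumerate(fields, start=1):
    print(f"{number}: {word}")

# Fields are numbered from 1 in the text, from 0 in a Python list.
area = int(fields[2])          # third word: area in square miles
population = int(fields[4])    # fifth word: population in thousands
print(area, population)        # 10461 4375
```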
-
- This information could be manipulated in various ways. A few of
- the ways in which this could be done are:
- 1. the manner of listing changed, or
- 2. only lines meeting certain criteria listed:
- a) those states with a minimum area,
- b) those states with a minimum population,
- c) population greater than a minimum and less than a
- maximum,
- d) area less than a maximum and population greater than a
- minimum,
- e) population density (population / area) less than a
- maximum.
- 3. the list could be sorted
- a) alphabetically by
- 1: state capital,
- 2: state name,
- 3: state abbreviation.
- b) by area,
- c) by population.
- 4. some information could be deleted from the list such as the
- capital.
-
- There are many more ways to manipulate the information. In order
- to do so the information in the list must first be read record by
- record and each record split into its constituent parts. Once the
- parts for each record have been determined, the information can
- be easily manipulated, changed, or rearranged.
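-
- As a rough illustration of two of the manipulations listed above
- (sorting by area and selecting by population density), here is a
- Python sketch using three of the records shown earlier. The helper
- name parse, the dictionary keys, and the density limit of 0.1 are
- assumptions for the example, and capitals are assumed to be a single
- word as in the listing.
-
```python
def parse(line):
    """Split one record into named parts, following the field layout above."""
    f = line.split()
    return {
        "country": f[0],
        "area": int(f[2]),              # square miles
        "population": int(f[4]),        # thousands
        "abbrev": f[6],
        "capital": f[8],
        "name": " ".join(f[10:-1]),     # handles two-word state names
    }


lines = [
    "US # 10461 # 4375 # MD # Annapolis ( Maryland )",
    "US # 7787 # 7555 # NJ # Trenton ( New Jersey )",
    "US # 33265 # 1165 # ME # Augusta ( Maine )",
]
states = [parse(line) for line in lines]

# Sort the list by area, smallest first.
by_area = [s["abbrev"] for s in sorted(states, key=lambda s: s["area"])]
print(by_area)   # ['NJ', 'MD', 'ME']

# Keep states whose density (thousands of people per square mile)
# is below the chosen maximum.
sparse = [s["name"] for s in states if s["population"] / s["area"] < 0.1]
print(sparse)    # ['Maine']
```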
-
- 1.2 Running QTAwk
-
- QTAwk is started from the DOS command prompt, giving the QTAwk
- utility to run and the files to search. The QTAwk utility may be
- written directly on the command line or contained in one or more
- files named on the command line. If given on the command line, it
- is usually enclosed in double quotes:
-
- QTAwk "$3 > 50000 {print;}" states.dta
-
- This QTAwk utility will print the record for every state for
- which the area is greater than 50,000 square miles.
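-
- For readers who know another language, the selection just described
- (area greater than 50,000 square miles, with the area held in the
- third field) could be written in Python roughly as follows. This is
- a sketch for comparison only; the function name large_states is an
- assumption, and the three records are taken from the listing above
- rather than read from states.dta.
-
```python
records = [
    "US # 10461 # 4375 # MD # Annapolis ( Maryland )",
    "US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )",
    "US # 52737 # 17895 # NY # Albany ( New York )",
]


def large_states(lines, min_area=50000):
    """Return the records whose third field (area) exceeds min_area."""
    selected = []
    for line in lines:
        fields = line.split()
        if float(fields[2]) > min_area:
            selected.append(line)
    return selected


for line in large_states(records):
    print(line)   # only the New York record is printed
```
-
- The explicit loop and the field splitting are what QTAwk supplies
- automatically; only the test and the print remain for the user.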
-
- The example shows the form of QTAwk utilities, a sequence of
- patterns and actions in the form:
-
- pattern1 { action1 }
- pattern2 { action2 }
- pattern3 { action3 }
- .
- .
- .
-
- QTAwk opens the files named on the command line, reads a record,
- splits (parses) each record into the individual words or fields
- and compares the record with each pattern in the order in which
- they have been written in the QTAwk utility. If the record
- matches a pattern, the corresponding action contained in braces
- is executed.
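-
- The processing just described can be pictured as two nested loops.
- The Python sketch below is a simplified model, not QTAwk's actual
- implementation: run_utility and the two sample rules are
- hypothetical, and patterns are modelled as functions of the field
- list.
-
```python
def run_utility(records, rules):
    """Compare every record with every pattern, in the order written."""
    for record in records:
        fields = record.split()          # split the record into words
        for pattern, action in rules:
            if pattern(fields):          # does the record match?
                action(fields)           # then run the paired action


log = []

rules = [
    # pattern: abbreviation is MD      action: note the capital
    (lambda f: f[6] == "MD", lambda f: log.append("capital " + f[8])),
    # pattern: population over 5000    action: note the abbreviation
    (lambda f: int(f[4]) > 5000, lambda f: log.append("populous " + f[6])),
]

run_utility(
    [
        "US # 10461 # 4375 # MD # Annapolis ( Maryland )",
        "US # 40763 # 5630 # VA # Richmond ( Virginia )",
    ],
    rules,
)
print(log)   # ['capital Annapolis', 'populous VA']
```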
-
- Patterns may be arithmetic expressions, logical expressions,
- regular expressions or combinations of all three types of
- expressions. The example above has a logical expression pattern.
-
- Under DOS, programs indicate the end of text lines in ASCII
- files with a Carriage Return, Newline character pair. QTAwk
- follows the practice of converting all such pairs to a single
- newline when reading the file. In writing files, QTAwk converts
- single Newline characters to a Carriage Return, Newline pair.
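-
- The two conversions can be expressed in a few lines of Python. This
- is only a sketch of the idea; the function names are assumptions,
- and to_dos expects text that contains single newlines only, as
- produced by from_dos.
-
```python
def from_dos(text):
    """On reading: turn each Carriage Return, Newline pair into a Newline."""
    return text.replace("\r\n", "\n")


def to_dos(text):
    """On writing: turn each single Newline into a Carriage Return, Newline pair."""
    return text.replace("\n", "\r\n")


on_disk = "first line\r\nsecond line\r\n"
in_memory = from_dos(on_disk)
print(repr(in_memory))                 # 'first line\nsecond line\n'
print(to_dos(in_memory) == on_disk)    # True
```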
-
- For the data in the "states" data file, a question that may be
- asked is the total population of Canada. The first field can be
- used to identify the data for Canada and the fifth field contains
- population data. The following utility will sum the population
- data for Canada:
-
- $1 == "Canada" { Total += $5 }
- END { print Total; }
-
- In this example, when the first field of a record is equal to
- "Canada", the fifth field is accumulated into the variable Total.
- When all records have been processed, Total is printed. The
- printing of Total is accomplished in the action associated with
- the pattern 'END'. 'END' is a pre-defined pattern; its associated
- action is executed after closing the input file.
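-
- The same accumulate-then-report idea is sketched below in Python
- for comparison. The two "Canada" records use made-up placeholder
- values, not figures from states.dta, and the function name
- total_population is an assumption.
-
```python
sample = [
    "US # 10461 # 4375 # MD # Annapolis ( Maryland )",
    "Canada # 100 # 10 # AA # CityA ( ProvinceA )",   # placeholder values
    "Canada # 200 # 20 # BB # CityB ( ProvinceB )",   # placeholder values
]


def total_population(lines, country):
    """Sum the fifth field of every record whose first field is country."""
    total = 0
    for line in lines:
        fields = line.split()
        if fields[0] == country:      # the pattern
            total += int(fields[4])   # the action
    # Corresponds to the END action: report once all records are read.
    return total


print(total_population(sample, "Canada"))   # 30
```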
-
- The remaining chapters explain QTAwk expressions, patterns,
- action statements and more. All of these are combined into a
- QTAwk utility. In using and creating QTAwk utilities, the user
- needs to remember the fundamental QTAwk processing sequence:
-
- 1. QTAwk opens each input file and reads the file record by
- record,
- 2. as each record is read, it is split into fields,
- 3. the record is then compared against the patterns for matches,
- 4. when a match is found, the associated action is executed.
-
- Keeping this fundamental loop in mind will make using QTAwk very
- simple indeed.
-
-
- 2.0 REGULAR EXPRESSIONS
-
- Regular expressions are a means of describing sequences of
- "characters". In the discussion of QTAwk, "character" will be
- taken to mean any character from the extended ASCII sequence of
- characters from ASCII '1' to ASCII '255'. Appendix I contains a
- listing of the ASCII characters with both their decimal and
- hexadecimal equivalent.
-
- A string is a finite sequence of characters. The length of a
- string is the number of characters contained in the string. A
- special string is the empty string, also called the null string,
- which is of zero length, i.e., it contains no characters. We
- shall use the symbol 'ε' below to refer to the null string.
-
- Another way to think of a string is as the concatenation of a
- sequence of characters. Two strings may be concatenated to form
- another string. Concatenating the two strings:
-
- "abcdef"
-
- and
-
- "ghijklmn"
-
- forms the third string:
-
- "abcdefghijklmn"
-
- In many instances, it is desirable to describe a string with
- several alternatives for one or more of the characters. Thus we
- may wish to find the strings:
-
- FRED
-
- or
-
- TED
-
- A convenient manner of describing both strings with the same
- regular expression is
-
- /(FR|T)ED/
-
- Strings in QTAwk are enclosed in double quotes, ", and regular
- expressions are enclosed in slashes, '/'.
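-
- The alternation in /(FR|T)ED/ can be tried in any language with
- regular expressions. The Python sketch below is an illustration
- only; Python writes the expression as a string instead of between
- slashes, and fullmatch is used here so that the whole word must
- match.
-
```python
import re

pattern = re.compile(r"(FR|T)ED")

results = {}
for word in ["FRED", "TED", "ED", "RED"]:
    results[word] = pattern.fullmatch(word) is not None
    print(word, results[word])
# FRED and TED match; ED and RED do not.
```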
-
-
- 2.1 'OR' Operator
-
- The symbol '|' means "or" and so the above regular expression
- would be read as: The string "FR" or the string "T" concatenated
- with the string "ED". The parentheses are used to group strings
- into equivalent positions in the resultant regular expression. In
- this manner it is possible to build a regular expression for
- several alternative strings.
-
- In many instances it is also desirable to build regular
- expressions that contain many alternatives for one character,
- i.e., one character strings. For example, we may want to find all
- instances of the words "doing" or "going". We could build the
- regular expression:
-
- /(d|g)oing/
-
- 2.2 Character Classes
-
- Although the last regular expression is a fairly simple example,
- it serves to introduce the notion of "character class". If we
- define the notation:
-
- [dg] = (d|g)
-
- then we may write the regular expression as:
-
- /[dg]oing/
-
- The character class notation saves us from having to explicitly
- write the "or" symbols in the regular expression. The "or" is
- implied between each character of the class.
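-
- That a character class is shorthand for a chain of "or" symbols can
- be confirmed with a small Python check. This is an illustrative
- sketch; the word list is invented for the example.
-
```python
import re

words = ["doing", "going", "boing", "eoing"]

# The same selection written with explicit "or" and with a class.
with_or = [w for w in words if re.fullmatch(r"(d|g)oing", w)]
with_class = [w for w in words if re.fullmatch(r"[dg]oing", w)]

print(with_or)      # ['doing', 'going']
print(with_class)   # ['doing', 'going']
```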
-
- Now suppose that we wanted to expand our search to all five
- letter words ending in "ing" and starting with any lower-case
- letter and having any lower-case letter as the second character.
- We would write the regular expression:
-
- /(a|b|c|d|...|x|y|z)(a|b|c|d|...|x|y|z)ing/
-
- or
-
- /[abcd...xyz][abcd...xyz]ing/
-
- Regular expressions in these cases can not only get very long,
- but can be very tedious to write and are very prone to error. We
- introduce the notion of a range of characters into the character
- class and define:
-
- [a-z] = [abcd...xyz] = (a|b|c|d|...|x|y|z)
-
- The above regular expression can now be written:
-
- /[a-z][a-z]ing/
-
- a considerable savings and less error prone. The hyphen, '-', is
- recognized as expressing a range of characters only when it
- occurs within a character class. Within character classes, the
- hyphen loses this significance in the following three cases:
-
- 1. when it is the first character of the character class, e.g.,
-
- [-b] = (-|b)
-
- 2. when it is the last character of the character class, e.g.,
-
- [b-] = (b|-)
-
- 3. when the first character of the indicated range is greater
- in the ASCII collating sequence than the second character of
- the indicated range, e.g.,
-
- [z-a]
-
- would be recognized as:
-
- (z|-|a)
-
- In interpreting the range notation in character classes, QTAwk
- uses the ASCII collating sequence.
-
- [0-Z]
-
- is equivalent to:
-
- [0123456789:;<=>?@A-Z]
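-
- The expansion of [0-Z] follows directly from the ASCII codes, as
- the Python sketch below shows. It is an illustration, not QTAwk
- code. One difference to be aware of: Python's re module rejects a
- reversed range such as [z-a] with an error instead of reading it as
- three separate characters, so that case is not shown.
-
```python
import re

# Every character from '0' through 'Z' in ASCII order:
# the digits, seven punctuation characters, then A to Z.
ascii_range = "".join(chr(code) for code in range(ord("0"), ord("Z") + 1))
print(len(ascii_range))   # 43

in_range = re.compile(r"[0-Z]")
print(all(in_range.fullmatch(ch) for ch in ascii_range))   # True

# The neighbours just outside the range are not matched.
print(bool(in_range.fullmatch("/")), bool(in_range.fullmatch("[")))   # False False

# A hyphen placed first or last in the class is an ordinary character.
hyphen_first = re.fullmatch(r"[-b]", "-") is not None
hyphen_last = re.fullmatch(r"[b-]", "-") is not None
print(hyphen_first, hyphen_last)   # True True
```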
-
- Continuing the last example, if we did not want to limit the
- first character to lower-case, but also wanted to include the
- possibility of upper-case letters, we could use the following
- regular expression:
-
- /([A-Z]|[a-z])[a-z]ing/
-
- This regular expression allows the first letter to be any
- character in the range from A to Z or in the range from a to z.
- But the "or" is implied in character classes, shortening the
- above regular expression to:
-
- /[A-Za-z][a-z]ing/
-
- If we now wish to expand the above from all five letter words
- ending in "ing" to all six letter words ending in "ing", we could
- write the regular expression as:
-
- /[A-Za-z][a-z][a-z]ing/
-
- In general, if we did not want to specify the number of
- characters between the first letter and the "ing" ending, we
- could specify a regular expression as:
-
- /[A-Za-z](ε|[a-z])(ε|[a-z])...(ε|[a-z])ing/
-
- By specifying the null string in the 'or' regular expression,
- the regular expression allows a character in the range a to z or
- no character to match. The shortest string matched by this
- regular expression would be a single upper or lower case letter
- followed by "ing". The regular expression would also match any
- string starting with an upper or lower case letter with any
- number of lower case letters following and ending in "ing".
-
- 2.3 Closure
-
- What we need to describe this regular expression is a notation
- for specifying "zero or more" copies of a character or string.
- Such a notation exists and is written as:
-
- /[A-Za-z][a-z]*ing/
-
- where the notation
-
- [a-z]*
-
- means zero or more occurrences of the character class [a-z].
- This operation is called closure and the '*' is called the
- closure operator. In general, the notation may be used for any
- regular expression within a regular expression. The following are
- valid regular expressions using the notion of zero or more
- occurrences of a regular expression within another regular
- expression:
-
- /mis*ion/
-
- would match "miion", "mision", "mission", "misssion",
- "missssion", etc.
-
- /bot*om/
-
- would match "boom", "botom", "bottom", "botttom", "bottttom",
- etc.
-
- /(Fr|T)*ed/
-
- would match "ed", "Fred", "Ted", "FrFred", "TTed", "FrFrFred",
- "TTTed", "FrTFred", "FrFrTed", "TFrFred", etc.
-
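- The three closure examples can be tried out with any regular
- expression engine that supports '*'. The sketch below uses
- Python's re module as a stand-in, on the assumption that its '*'
- behaves like the QTAwk closure operator for these patterns. The
- helper all_match is hypothetical.
-
```python
import re

# Each closure example from the text, with strings it should accept.
# Python's '*' is assumed to behave like the QTAwk closure operator here.
examples = {
    r"mis*ion": ["miion", "mision", "mission", "misssion"],
    r"bot*om": ["boom", "botom", "bottom", "botttom"],
    r"(Fr|T)*ed": ["ed", "Fred", "Ted", "FrFred", "TTed", "FrTFred"],
}

def all_match(pattern, strings):
    """Return True when every string matches the whole pattern."""
    return all(re.fullmatch(pattern, s) for s in strings)

for pattern, strings in examples.items():
    print(pattern, all_match(pattern, strings))
```
-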
- As an extension to the '*' operator, we frequently would want to
- search for one or more occurrences of a regular expression. As
- above we would write this as:
-
- /[A-Za-z][a-z][a-z]*ing/
-
- The [a-z][a-z]* construct would ensure that at least one letter
- occurred between the initial letter and the string "ing". This
- occurs often enough that the notation
-
- [a-z]+ = [a-z][a-z]*
-
- has been adopted to handle this situation. Thus use the operator
- '*' for zero or more occurrences and the operator '+' for one or
- more occurrences. The '+' operator is called the positive closure
- operator.
-
- In many cases it is desirable to search for either zero or one
- occurrence of a regular expression. For example, it would be
- desirable to search for names preceded by either "Mr" or "Mrs".
- The regular expression:
-
- /Mrs*/
-
- would find: Mr and Mrs and Mrss and Mrsss, etc. The following
- regular expression will accomplish what we really want in this
- case:
-
- /Mr(ε|s)/
-
- This regular expression would find 'Mr' followed by zero or one
- 's'.
-
- The operator '?' has been selected to denote 'zero or one' of
- the preceding regular expression. Thus,
-
- /Mrs?/ = /Mr(ε|s)/
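-
- The '+' and '?' shorthands can be compared against their longer
- forms. The following Python sketch is an illustration only; it
- assumes Python's '+', '*' and '?' behave like the QTAwk operators
- of the same name, and the variable names are hypothetical.
-
```python
import re

# '+' is shorthand for one copy followed by '*'.
plus_form = re.compile(r"[A-Za-z][a-z]+ing")
star_form = re.compile(r"[A-Za-z][a-z][a-z]*ing")

# '?' accepts zero or one copy, so "Mrss" is rejected where '*' accepts it.
optional_s = re.compile(r"Mrs?")
closure_s = re.compile(r"Mrs*")

words = ["ing", "King", "Bring", "string"]
for w in words:
    print(w, bool(plus_form.fullmatch(w)), bool(star_form.fullmatch(w)))
print(bool(optional_s.fullmatch("Mrss")), bool(closure_s.fullmatch("Mrss")))
```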
-
- 2.4 Repetition Operator
-
- In some cases we wish to specify a minimum and maximum repeat
- count for a regular expression. For example, suppose it was
- desirable for a regular expression to contain a minimum of 2 and
- a maximum of 4 copies of "abc". We could specify this as:
-
- /abcabc(abc)?(abc)?/
-
- The notation {2,4} has been adopted for expressing this. The
- general form of the repetition operator is {n1,n2}. n1 and n2 are
- integers, with n1 greater than or equal to 1 and n2 greater than
- or equal to n1, 1 <= n1 <= n2. A repetition count would be
- specified as:
-
- /r{n1,n2}/ = /rrrrrrrrrrrrrr?r?r?r?r?r?/
-                │<─── n1 ──>│           │
-                │<──────── n2 ─────────>│
-
- The above could be expressed as:
-
- /(abc){2,4}/ = /(abc)(abc)(abc)?(abc)?/
-
- Since the repetition operator repeats the immediately preceding
- regular expression, the parentheses around "abc" are necessary to
- repeat the whole string. Without the parentheses the regular
- expression would expand as:
-
- /abc{2,4}/ = /abccc?c?/
-
- The repetition operator can be used to repeat either single
- characters, groups of characters, character classes or quoted
- strings. The use of the operator is illustrated below for each
- case:
-
- 1. Single characters:
-
- /abc{2,4}/ = /abccc?c?/
-
- 2. Groups of regular expressions:
-
- /(abc){2,4}/ = /(abc)(abc)(abc)?(abc)?/
-
- 3. Character classes:
-
- /[abc]{2,4}/ = /[abc][abc][abc]?[abc]?/
-
- 4. Quoted strings:
-
- /"abc"{2,4}/ = /"abcabc(abc)?(abc)?"/
-
- For quoted strings, the whole of the string contained within
- quotes is repeated, with all repetitions maintained within
- the quotes.
-
- 5. Named expressions (described later):
-
- /{abc}{2,4}/ = /{abc}{abc}{abc}?{abc}?/
-
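- The expansion rule for the repetition operator can be written as
- a short function. This is a sketch in Python, not QTAwk's own
- implementation; the helper name expand_repetition is hypothetical.
- It builds n1 copies of r followed by n2 - n1 optional copies.
-
```python
import re

def expand_repetition(r, n1, n2):
    """Expand r{n1,n2} into n1 copies of r followed by n2 - n1 optional copies."""
    if not 1 <= n1 <= n2:
        raise ValueError("need 1 <= n1 <= n2")
    return r * n1 + (r + "?") * (n2 - n1)

print(expand_repetition("(abc)", 2, 4))      # (abc)(abc)(abc)?(abc)?
print(expand_repetition("[abc]", 2, 4))      # [abc][abc][abc]?[abc]?
print("ab" + expand_repetition("c", 2, 4))   # abccc?c?
```
-
- The quoted string case is not covered by this helper, since the
- repetitions there stay inside a single pair of quotes.
-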
- A special case exists for character classes in which the class
- of characters to exclude is greater than the class of characters
- to include. For example, suppose that we wanted in a certain
- character position to include all characters that weren't
- numerics. We could build a character class of all characters and
- leave the numerics out. An easier method is to use the
- "complemented" or "negated" character class. A special operator
- has been introduced for this purpose. The logical NOT symbol,
- '!', occurring as the first character in a character class,
- negates the class, i.e., any character NOT in the class is
- recognized at the character position.
-
- Thus, to define the negated character class of all characters
- which are not numerics, we would specify:
-
- [!0-9]
-
- To define all characters except the semi-colon, we would
- specify:
-
- [!;]
-
- Note that the symbol '!' has this special meaning only as the
- FIRST character in a character class. The caret symbol, '^', as
- the FIRST character in a character class may also be used to
- negate a character class. Traditionally, the caret has been used for
- this purpose, but QTAwk allows the logical NOT operator, '!'
- also.
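-
- For comparison, the sketch below shows the two negated classes
- in Python. Python's re module recognizes only the caret form, so
- the '!' form is an assumption that holds for QTAwk alone.
-
```python
import re

# QTAwk's [!0-9] and [^0-9] both negate the class; Python's re module
# only recognizes the caret form, so that form is used here.
not_digit = re.compile(r"[^0-9]")
not_semicolon = re.compile(r"[^;]")

print(not_digit.findall("a1b22c"))     # ['a', 'b', 'c']
print(not_semicolon.findall("x;y;"))   # ['x', 'y']
```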
-
- Utilizing the above concepts for building regular expressions by
- concatenating characters, concatenating regular expressions to
- build more complicated regular expressions, using parentheses to
- nest regular expressions within regular expressions, using
- character classes to denote constructs with implied "or"s, using
- the closure operators, '*', '+' and '?', and the repetition
- operator, {n1,n2}, for expressing multiple copies, very
- complicated regular expressions may be built for searching for
- strings in files.
-
- 2.5 Escape Sequences
-
- To round out the ability for building regular expressions for
- searching, we need only a few more tools. In some cases we may
- wish for the regular expression to contain blanks or tab
- characters. In addition, other non-printable characters may be
- included in regular expressions. These characters are defined
- with "escape sequences". Escape sequences are two or more
- characters used to denote a single character. The first character
- is always the backslash, '\'. The second character is by
- convention a letter as follows:
-
- \a == bell (alert) ( \x07 )
- \b == backspace ( \x08 )
- \f == formfeed ( \x0c )
- \n == newline ( \x0a )
- \r == carriage return ( \x0d )
- \s == space ( \x20 )
- \t == horizontal tab ( \x09 )
- \v == vertical tab ( \x0b )
- \c == c [ \\ == \ ]
- \ooo == character represented by octal value ooo
- 1 to 3 octal digits acceptable
- \xhhh== character represented by hexadecimal value hhh
- 1 to 3 hexadecimal digits acceptable
-
- Any other character following the backslash is translated to
- mean that character. Thus '\c' would become a single 'c', '\['
- would become '[', etc. The latter is necessary in order to
- include such characters as '[', ']', '-', '!', '(', ')', '*',
- '+', '?' in regular expressions without invoking their special
- meanings as regular expression operators.
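-
- The hexadecimal values in the table can be confirmed with a few
- lines of Python, whose string literals share most of these escape
- sequences. This is an illustration only: Python has no '\s' string
- escape, and it reads at most two hexadecimal digits after '\x',
- where the table above allows up to three.
-
```python
# Escape sequences that Python string literals share with the table above.
shared = {"\a": 0x07, "\b": 0x08, "\f": 0x0C, "\n": 0x0A,
          "\r": 0x0D, "\t": 0x09, "\v": 0x0B}
for ch, code in shared.items():
    assert ord(ch) == code

# The space character is written by value, since Python has no \s string escape.
space = "\x20"
# Octal and hexadecimal escapes name a character by its numeric value.
print("\101", "\x41", ord(space))   # A A 32
```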
-
- 2.6 Position Operators
-
- Three additional special characters have, by convention, been
- defined for use in writing regular expressions, namely the period
- '.', the caret, '^' and the dollar sign, '$'. The period has been
- assigned to mean "any character" in the set of characters except
- the newline character, '\n'. For our use the period means any
- character from ASCII 1 to ASCII 9 inclusive and ASCII 11 to ASCII
- 255 inclusive.
-
- The caret and the dollar sign are position indicators and not
- character indicators. The caret, '^', is used to indicate the
- beginning or start of the search string. Thus, any character
- following the caret in a regular expression must be the first
- character of the string to be searched otherwise the match fails.
- The dollar sign, '$', is used to indicate the end of the search
- string. Thus, any character preceding the dollar sign in a
- regular expression must be the last character of the string to be
- searched or the match fails.
-
- To indicate "beginning of line", the caret must be in the first
- character position of a regular expression. Similarly, to
- indicate "end of line", the dollar sign must be in the last
- character position of a regular expression. In any other
- position, these characters lose their special significance. Thus,
- the regular expression:
-
- /(^|[\s\t])A/
-
- means that 'A' must be the first character on a line, or be
- preceded by a space or tab character to match. Similarly
-
- /A($|[\s\t])/
-
- means that 'A' must be the last character on a line or be
- followed by a space or tab character.
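-
- The two position patterns can be exercised as follows. The
- sketch uses Python, where QTAwk's '\s' (a single space) is written
- as a literal space; the variable names are hypothetical.
-
```python
import re

# QTAwk's \s (a single space) is written as a literal space for Python.
starts_word = re.compile(r"(^|[ \t])A")
ends_word = re.compile(r"A($|[ \t])")

print(bool(starts_word.search("A line")))        # True: 'A' begins the line
print(bool(starts_word.search("read A book")))   # True: 'A' follows a space
print(bool(starts_word.search("BAT")))           # False: 'A' follows 'B'
print(bool(ends_word.search("BA")))              # True: 'A' ends the line
print(bool(ends_word.search("BAT")))             # False: 'A' is followed by 'T'
```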
-
- 2.7 Examples
-
- The regular expression:
-
- /[A-Za-z][a-z]\s+.*/
-
- will match an upper or lower-case letter followed by a
- lower-case letter followed by one or more blanks followed by any
- character except a newline zero or more times.
-
- The regular expression:
-
- /\([A-Z]+\)[!\s]+/
-
- will match a left parenthesis followed by one or more uppercase
- letters followed by a right parenthesis followed by one or more
- characters which are not blanks.
-
- The regular expression:
-
- /[\s\t]+ARCHIVE([\s\t]+|$)/
-
- will match one or more blanks or tabs followed by the word (in
- upper-case) "ARCHIVE" followed either by one or more blanks or
- tabs or by the end of line. Note this kind of construct is handy
- for finding words as independent units and not buried within
- other words.
-
- The regular expression:
-
- /([\s\t]+|$)/
-
- is necessary to find words with trailing blanks or that end the
- search line. If only [\s\t]+ had been used then words ending the
- search line would not be found, since there are no trailing
- blanks or tabs.
-
- Note that for files with the newline character, '\n', at the end
- of all lines, commonly called ASCII text files, it is possible to
- search for regular expressions that may span more than one line.
- For example, if we wanted to find all sequences of the names
-
- Ted, Alice, George and Mary
-
- that were separated by spaces, tabs or line boundaries, we would
- write the following regular expression:
-
- /[\t-\r\s]+Ted[\t-\r\s]+Alice[\t-\r\s]+George[\t-\r\s]+Mary[\t-\r\s]/
-
- The regular expression:
-
- /^As\s+(Fred|Ted|Jed|Ned)\s+(began|ended)(\s+|$)/
-
- will match the beginning of the search line followed by "As",
- i.e., 'A' as the first character of the search line, followed by
- one or more blanks followed by "Fred" or "Ted" or "Jed" or "Ned"
- followed by one or more blanks followed by "began" or "ended"
- followed by one or more blanks or the end of the search line.
- This could be modified slightly to be:
-
- /^As\s+(Fr|T|J|N)ed\s+(began|ended)(\s+|$)/
-
- or
-
- /^As\s+(Fr|[TJN])ed\s+(began|ended)(\s+|$)/
-
- Either form will result in exactly the same search.
-
- 2.8 Look Ahead Operator
-
- Sometimes it is necessary to find a regular expression, but only
- when it is followed by another regular expression. Thus we wish
- to find "Mr", but only when it is followed by "Smith". The
- "look-ahead" operator, '@', is used to denote this situation. In
- general, if r is a regular expression we wish to match, but only
- when followed by the regular expression s, then we would express
- this as:
-
- /r@s/
-
- Thus, to find "Mr", but only when followed by "Smith", we have:
-
- /Mr@[\s\t]+Smith/
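-
- Other regular expression engines offer a similar facility under
- a different notation. The Python sketch below assumes that the
- (?=...) construct plays the role of QTAwk's '@': the text after
- the look-ahead must be present, but is not part of the match.
-
```python
import re

# Python's (?=...) lookahead is assumed here to play the role of QTAwk's '@'.
mr_smith = re.compile(r"Mr(?=[ \t]+Smith)")

hit = mr_smith.search("Dear Mr  Smith,")
miss = mr_smith.search("Dear Mr Jones,")
print(hit.group(0) if hit else None)   # Mr
print(miss)                            # None
```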
-
- 2.9 Match Classes
-
- There are also circumstances in which we wish to find pairs of
- characters. For example, we wish to find all clauses in a letter
- enclosed within parentheses, "()", braces, "{}", or brackets,
- "[]". We could write several separate regular expressions which
- are identical except that one would use parentheses, another
- braces, etc. A simpler method has been introduced using the
- concept of matched character classes. A matched character class
- is denoted as:
-
- [#\(\{\[] and [#\)\}\]]
-
- The first instance of a "matched character class" in a regular
- expression will match any character in the class. The second
- instance will match only the character in the position of the
- class matched by the first instance. For example, in the above
- two classes, if the character that matched the first class was
- '[', then only a ']' would match the second class and not a ')'
- or a '}'. Note the use of the backslash above to avoid any
- confusion in interpreting the characters "()", "{}", and "[]" as
- characters and regular expression operators. Except for ']', the
- backslash is not needed since the characters do not act as
- operators within a character class. For the character ']', the
- backslash is necessary to prevent early termination of the
- character class.
-
- Note that matched character classes cannot be nested. Thus, the
- span of characters between two different matched character
- classes cannot overlap. If we wanted to find regular expressions
- contained within "([" and ")]" or within "{[" and "}]", the
- instances of each in the regular expression could not overlap,
- i.e., we could NOT write a regular expression like:
-
- /this [#\(\[] exp [#\{\[] contains [#\)\]] two [#\}\]] matched/
-       │<────────────────────────────────>│           │
-       │           │<────────────────────────────────>│
-
- This regular expression would be interpreted as:
-
- /this [#\(\[] exp [#\{\[] contains [#\)\]] two [#\}\]] matched/
-       │<───────────────>│          │<───────────────>│
- 2.10 Named Expressions
-
- If the strings to be found using regular expressions are
- complicated, the associated regular expressions can become very
- difficult to understand. This makes it very hard to determine if
- the regular expression is correct. For example, the regular
- expression (as one line):
-
- /^[A-Za-z_][A-Za-z0-9_]*([\s\t]+\**[A-Za-z_][A-Za-z0-9_]*)*
- \((([\s\t]*[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*)(,([\s\t]*
- [\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*))*)*\)([\s\t]*
- (\/\*.*\*\/)[\s\t]*)*$/
-
- will find function definitions in C language programs.
- Constructing and analysing this regular expression as a single
- entity is difficult.
-
- Breaking such regular expressions into smaller units, which are
- shorter and simpler, makes the task much easier. QTAwk has
- introduced the concept of "named expressions" for this purpose.
- Named expressions are QTAwk variable names enclosed in braces,
- '{' '}'. In translating the regular expression into internal
- form, QTAwk scans the regular expression for named expressions and
- substitutes the current value of the variable named. If a
- variable does not exist by the name specified, no substitution is
- made.
-
- After defining a variable:
-
- fst = "first words";
-
- the following regular expression:
-
- /The {fst} of the child/
-
- would expand into:
-
- /The first words of the child/
-
- Named expressions allow for building up regular expressions from
- smaller more easily understood regular expressions and for
- re-using the smaller regular expressions. The following example
- QTAwk utility builds the previous regular expression for
- recognizing C language function definitions (all on one line)
- from many smaller regular expressions. Each constituent regular
- expression is built to recognize a particular part of the
- function definition. When combined into the final regular
- expression, the three parts of the definition can be easily
- understood. The final regular expression is expanded in the final
- print statement. It spans several 80 character lines and is much
- more difficult to understand due to its length and complexity.
-
- Example:
- BEGIN {
-     # define variables for use in regular expressions:
-     # Define C name expression
-     c_n = /[A-Za-z_][A-Za-z0-9_]*/;
-     # Define C comment expression
-     # Note: Does NOT allow comment to span lines
-     c_c = /(\/\*.*\*\/)/;
-     # Define single line comment
-     c_slc = /({_w}*{c_c}{_w}*)*/;
-     # Define C name with pointer
-     c_np = /\**{c_n}/;
-     # Define C name with pointer or address
-     c_ni = /[\*&]*{c_n}/;
-     # Define C function type and name declaration
-     c_fname = /{c_n}({_w}+{c_np})*/;
-     # Define expression for first argument in function list
-     # (trailing white space included so that the expansion agrees
-     # with the output string shown after this example)
-     c_first_arg = /({_w}*{c_ni}{_w}*)/;
-     # Define expression for remaining arguments in function list
-     c_rem_arg = /(,{c_first_arg})*/;
-     # Define C function argument list
-     c_arg_list = /\(({c_first_arg}{c_rem_arg})*\)/;
-     #
-     # Expression to find all C function definitions
-     totl_name = /^{c_fname}{c_arg_list}{c_slc}$/;
-     #
-     # print total expression to illustrate expansion of named
-     # expressions
-     # Refer to the description of the 'replace' function
-     #
-     print replace(totl_name);
- }
-
- The string output by this utility is:
-
- ^[A-Za-z_][A-Za-z0-9_]*([\s\t]+\**[A-Za-z_][A-Za-z0-9_]*)*
- \((([\s\t]*[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*)(,([\s\t]*
- [\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*))*)*\)([\s\t]*
- (\/\*.*\*\/)[\s\t]*)*$
-
- Note that in printing the regular expression, the leading and
- trailing slash, '/', were not printed.
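-
- The substitution of named expressions can be imitated with plain
- string handling. The Python sketch below is a hypothetical
- stand-in, not QTAwk's implementation: the dictionary names and the
- function expand are assumptions. The definitions of c_first_arg
- and c_rem_arg are written so that the expansion reproduces the
- output string shown above, which the last line checks.
-
```python
import re

# Hypothetical stand-in for QTAwk's named-expression substitution.
# Definitions are plain strings; {_w} is the predefined white-space class.
names = {
    "_w": r"[\s\t]",
    "c_n": r"[A-Za-z_][A-Za-z0-9_]*",
    "c_c": r"(\/\*.*\*\/)",
    "c_slc": r"({_w}*{c_c}{_w}*)*",
    "c_np": r"\**{c_n}",
    "c_ni": r"[\*&]*{c_n}",
    "c_fname": r"{c_n}({_w}+{c_np})*",
    "c_first_arg": r"({_w}*{c_ni}{_w}*)",
    "c_rem_arg": r"(,{c_first_arg})*",
    "c_arg_list": r"\(({c_first_arg}{c_rem_arg})*\)",
    "totl_name": r"^{c_fname}{c_arg_list}{c_slc}$",
}

NAME = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")

def expand(text):
    """Replace every {name} with its expanded value; unknown names stay."""
    def substitute(match):
        name = match.group(1)
        return expand(names[name]) if name in names else match.group(0)
    return NAME.sub(substitute, text)

# The string the manual shows as output, joined into one line.
printed = (
    r"^[A-Za-z_][A-Za-z0-9_]*([\s\t]+\**[A-Za-z_][A-Za-z0-9_]*)*"
    r"\((([\s\t]*[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*)(,([\s\t]*"
    r"[\*&]*[A-Za-z_][A-Za-z0-9_]*[\s\t]*))*)*\)([\s\t]*"
    r"(\/\*.*\*\/)[\s\t]*)*$"
)
print(expand("{totl_name}") == printed)   # True
```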
-
- 2.11 Predefined Names
-
- In translating regular expressions, names starting with an
- underscore and followed by a single upper or lower case letter
- are reserved as predefined. The following predefined names are
- currently available for use in named expressions:
-
- Alphabetic
- {_a} == [A-Za-z]
- Brackets
- {_b} == [{}()[]<>]
- Control Character
- {_c} == [\x001-\x01f\x07f]
- Digit
- {_d} == [0-9]
- Exponent
- {_e} == [DdEe][-+]?{_d}{1,3}
- Floating point number
- {_f} == [-+]?({_d}+\.{_d}*|{_d}*\.{_d}+)
- Floating, optional exponent
- {_g} == {_f}({_e})?
- Hexadecimal digit
- {_h} == [0-9A-Fa-f]
- Integer
- {_i} == [-+]?{_d}+
- alpha-Numeric
- {_n} == [A-Za-z0-9]
- Octal digit
- {_o} == [0-7]
- Punctuation
- {_p} == [\!-/:-@[-`{-\x07f]
- double or single Quote
- {_q} == {_s}["'`]
- Real number
- {_r} == {_f}{_e}
- zero or even number of Slashes
- {_s} == (^|[!\\](\\\\)*)
- printable character
- {_t} == [\s-~]
- graphical character
- {_u} == [\x01f-~]
- White space
- {_w} == [\s\t]
- space, \t, \n, \v, \f, \r, \s
- {_z} == [\t-\r\s]
-
- The above predefined names will take precedence over any
- variables with identical names in replacing named expressions in
- regular expressions and the 'replace' function.
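-
- To see how the numeric names build on one another, the sketch
- below spells a few of them out in Python. The constant names are
- hypothetical, and the Python patterns are assumed to be equivalent
- to the QTAwk definitions listed above.
-
```python
import re

# Python spellings of a few predefined names from the list above.
DIGIT = r"[0-9]"                                                   # {_d}
INTEGER = r"[-+]?" + DIGIT + r"+"                                  # {_i}
FLOAT = (r"[-+]?(" + DIGIT + r"+\." + DIGIT + r"*|"
         + DIGIT + r"*\." + DIGIT + r"+)")                         # {_f}
EXPONENT = r"[DdEe][-+]?" + DIGIT + r"{1,3}"                       # {_e}
FLOAT_OPT_EXP = FLOAT + r"(" + EXPONENT + r")?"                    # {_g}

for text in ["-42", "3.14", ".5", "6.02e23", "7"]:
    is_int = bool(re.fullmatch(INTEGER, text))
    is_float = bool(re.fullmatch(FLOAT_OPT_EXP, text))
    print(text, is_int, is_float)
```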
-
- 2.12 Operator Summary
-
- The QTAwk regular expression operators are summarized below:
-
- ^ matches Beginning of Line as first character of regular expression
- $ matches End of Line as last character of regular expression
- \c matches following (hexadecimal value shown in parenthesis):
- \a == bell (alert) ( \x07 )
- \b == backspace ( \x08 )
- \f == formfeed ( \x0c )
- \n == newline ( \x0a )
- \r == carriage return ( \x0d )
- \s == space ( \x20 )
- \t == horizontal tab ( \x09 )
- \v == vertical tab ( \x0b )
- \c == c [ \\ == \ ]
- \ooo == character represented by octal value ooo
- 1 to 3 octal digits acceptable
- \xhhh== character represented by hexadecimal value hhh
- 1 to 3 hexadecimal digits acceptable
-
- . matches any character except newline, '\n'
- [abc0-9] Character Class - match any character in class
- [^abc0-9] Negated Character Class - match any character not in
- class
- [!abc0-9] Negated Character Class - match any character not in
- class
- [#abc0-9] Matched Character Class - for second match, class
- character must match in corresponding position
- * - Closure, Zero or more matches
- + - Positive Closure, One or More matches
- ? - Zero or One matches
- r(s)t embedded regular expression s
- r|s|t '|' == logical 'or' operator. Regular expression r or s or
- t
- @ - Look-Ahead, r@t, matches regular expression 'r' only when r
- is followed by regular expression 't'. Regular expression t
- not contained in final match. Symbol loses special meaning
- when contained within parenthesis, '()', or character class,
- '[]'.
- r{n1,n2} - at least n1 and up to n2 repetitions of regular
- expression r, with n1, n2 integers and 1 <= n1 <= n2
- r{2,6} ==> rrr?r?r?r?
- r{3,3} ==> rrr
- Expressions grouped by "", (), [], or names, {name}, are
- repeated as a group (note the treatment of quoted strings):
- (r){2,6} ==> (r)(r)(r)?(r)?(r)?(r)?
- [r]{2,6} ==> [r][r][r]?[r]?[r]?[r]?
- {r}{2,6} ==> {r}{r}{r}?{r}?{r}?{r}?
- "r"{2,6} ==> "rr(r)?(r)?(r)?(r)?"
-
- {named_expr} - named expression. In regular expressions "{name}"
- is replaced by the string value of the corresponding
- variable. Unrecognized variable names are not replaced.
-
- 3.0 EXPRESSIONS
-
- QTAwk provides a rich set of operators which may be used in
- expressions. The QTAwk operators are listed below from highest to
- lowest precedence:
-
- Operation                Operator              Associativity
- grouping                 ()                    left to right
- array subscripting       []                    left to right
- field                    $                     left to right
- tag                      $$                    left to right
- logical negation (NOT)   !                     left to right
- one's complement         ~                     left to right
- increment/decrement      ++ --                 right to left
- unary plus/minus         + -                   left to right
- exponentiation           ^                     right to left
- multiply, divide,        * / %                 left to right
-   remainder
- binary plus/minus        + -                   left to right
- concatenation            (adjacency)           left to right
- concatenation            ∩                     left to right
- shift left/right         << >>                 left to right
- relational               < <= > >=             left to right
- equality                 == !=                 left to right
- matching                 ~~ !~                 left to right
- array membership         in                    left to right
- bit-wise AND             &                     left to right
- bit-wise XOR             @                     left to right
- bit-wise OR              |                     left to right
- logical AND              &&                    left to right
- logical OR               ||                    left to right
- conditional              ? :                   right to left
- assignment               = ^= *= /= %=         right to left
-                          += -= &= @= |=
-                          <<= >>= ∩=
- sequence                 ,                     left to right
-
- 3.1 New/Changed Operators
-
- Note that QTAwk has changed some operators from C and Awk. QTAwk
- has retained '^' as the Awk exponentiation operator (in C, '^' is
- the bitwise XOR operator) and made '@' the bitwise XOR operator.
- QTAwk has
- changed the Awk match operators to '~~' and '!~' to bring them
- more in alignment with the equality operators, '==' and '!='.
- This has freed up the single tilde to restore it to its C meaning
- of one's complement. QTAwk has also brought forward the remainder
- of the C operators: shift, '<<' and '>>', bit-wise operators,
- '&', '@' and '|', and the sequence operator, ','.
-
- QTAwk has retained the practice of forcing string concatenation
- by placing two constants, variables or function calls adjacent.
- QTAwk has introduced the string concatenation operator, '∩'
- (character 239, 0xef of the extended ASCII character set). The
- string concatenation operator has the advantage of making
- concatenation explicit and allowing the string concatenation
- assignment operator, '∩='. Thus, string concatenation operations
- which previously had to be written as:
-
- new_string = new_string old_string;
-
- may now be written:
-
- new_string ∩= old_string;
-
- Thus a loop to build a string of numerics which previously was
- written as:
-
- for( i = 8 , j = 9 ; i ; i-- ) j = j i;
-
- can be written as:
-
- for( i = 8 , j = 9 ; i ; i-- ) j ∩= i;
-
- and will produce a value for j of:
-
- "987654321"
-
- The string concatenation operator will make some constructs
- work as expected. For example, the statements:
-
- ostr = "prefix";
- suff = "suffix";
- k = 1;
- j = ostr ++k suff;
- print j;
- print ostr;
-
- will produce the seemingly odd output:
-
- prefix1suffix
- 1
-
- This results from two factors:
-
- 1. In tokenizing the statements, white space is used to break
- keyword, variable and function names. Otherwise it is
- ignored.
-
- 2. The increment operator, '++', has higher precedence than
- string concatenation.
-
- Thus, QTAwk processes the following stream of tokens:
-
- 1. j
- 2. =
- 3. ostr
- 4. ++
- 5. k
- 6. suff
- 7. ;
-
- In interpreting the stream, '++' is encountered immediately
- after 'ostr' and is interpreted as a postfix operator operating
- on 'ostr' instead of a prefix operator operating on 'k'. Thus,
- the stream appears to QTAwk as:
-
- j = ostr++ k suff;
-
- After concatenating the current string value of ostr, "prefix",
- with the string value of k, "1", ostr is converted to a numeric,
- yielding a value of zero, 0, which is incremented to one, 1.
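-
- The loss of spacing information during tokenizing can be seen
- with a toy tokenizer. The Python sketch below is hypothetical and
- far simpler than QTAwk's own scanner; it only shows that the
- resulting token stream no longer records which operand '++' was
- written next to.
-
```python
import re

# Hypothetical tokenizer: names, the '++' operator, then any other
# single non-blank character. White space only separates tokens.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\+\+|\S")

tokens = TOKEN.findall("j = ostr ++k suff;")
print(tokens)   # ['j', '=', 'ostr', '++', 'k', 'suff', ';']
```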
-
- This seemingly anomalous situation can be remedied in two ways:
-
- 1. Surround ++k with parentheses, thus explicitly binding '++'
- to 'k':
-
- j = ostr (++k) suff;
-
- 2. Use the string concatenation operator, '∩', to make explicit
- the string concatenation:
-
- j = ostr ∩ ++k suff;
-
- or
-
- j = ostr ∩ ++k ∩ suff;
-
- The output produced by this is what was really desired:
-
- prefix2suffix
- prefix
-
-
- In addition, QTAwk has added one operator, the tag operator,
- '$$'. The tag operator is analogous to the field operator, but
- can be followed only by the single numerical value of zero (0).
- This operator returns the string matched by a regular expression.
- When used in an action, the last regular expression match in the
- corresponding pattern will set the value of the tag operator. If
- there was no regular expression match in the pattern, $$0 is the
- null string. The operator may also be used in the 'sub' and
- 'gsub' functions in the same manner that '&' is used. Regular
- expressions used in actions will not disturb the value of the tag
- operator set by the pattern. For example, given the
- pattern/action pair:
-
- /[-+]?[0-9]+/ {
-     print $$0;
-     if ( $0 ~~ /[789]/ ) print $$0;
- }
-
- and the input line:
-
- this line contains an integer 12745
-
- both 'print' statements will output "12745".
-
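- The behavior of the tag operator in this example can be mimicked
- in Python by keeping the pattern's match in its own variable. The
- variable name tag is an assumption of this sketch; QTAwk maintains
- the value itself.
-
```python
import re

line = "this line contains an integer 12745"

# The pattern's match plays the role of $$0.
tag = re.search(r"[-+]?[0-9]+", line).group(0)

# A later match inside the "action" is not stored in tag, so the
# value set by the pattern is left undisturbed.
if re.search(r"[789]", line):
    print(tag)   # 12745
print(tag)       # 12745
```
-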
- 3.2 Sequence Operator
-
- QTAwk uses the C sequence operator, the comma, ','. Using the
- sequence operator, expressions may be combined into an expr_list:
-
- expression_1 , expression_2 , expression_3 , ...
-
- As in C, a list of expressions separated by the sequence
- operator is valid anywhere an expression is valid. Such lists of
- expressions separated by the sequence operator will be referred
- to as an expression list or expr_list. Each expression in an
- expr_list is evaluated in turn. The final value of the expr_list
- is the value of the last expression. The sequence operator is
- very useful in the loop control statements discussed below.
-
- 3.3 Match Operator Variables
-
- QTAwk has defined two new built-in variables associated with the
- match operators, MLENGTH and MSTART. Whenever the match operator
- is executed MLENGTH is set equal to the length of the matching
- string or zero if no match is found. MSTART is set equal to the
- position of the start of the matching string or zero if no match
- is found. These built-in variables are completely analogous to
- the built-in variables RLENGTH and RSTART for the built-in
- 'match' function.
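-
- A rough Python analogue of the two variables is sketched below.
- The helper match_vars is hypothetical, and the sketch assumes
- that MSTART counts character positions from 1, as RSTART does in
- Awk, so that zero can signal "no match".
-
```python
import re

def match_vars(pattern, text):
    """Return (MSTART, MLENGTH) analogues; (0, 0) when nothing matches."""
    m = re.search(pattern, text)
    if m is None:
        return 0, 0
    # Positions are assumed to count from 1, as RSTART does in Awk.
    return m.start() + 1, m.end() - m.start()

print(match_vars(r"[0-9]+", "this line contains an integer 12745"))   # (31, 5)
print(match_vars(r"[0-9]+", "no digits here"))                        # (0, 0)
```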
-
- 3.4 Constants
-
- Expressions in QTAwk can contain several types of constants:
-
- 1. numeric constants
- 2. character constants
- 3. string constants
- 4. regular expressions
-
- Numeric constants have two forms: integer constants and
- floating point constants. Integers follow the C practice of
- allowing decimal, octal and hexadecimal base constants.
-
- Decimal constants match the form:
-
- [-+]?[0-9]+
-
- Octal constants match the form:
-
- 0[0-7]+
-
- Hexadecimal constants match the form:
-
- 0[xX][0-9A-Fa-f]+
-
- The results of all three of the following expressions are
- equivalent. All set the variable, int_cons, to the integer value,
- 11567.
-
- int_cons = 11567;
-
- int_cons = 026457;
-
- int_cons = 0x2d2f;
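-
- The equality of the three constants can be confirmed in Python,
- with one notational difference: Python writes octal constants
- with a 0o prefix rather than a leading 0. The variable names are
- hypothetical.
-
```python
# Python spells octal constants with a 0o prefix rather than a leading 0.
decimal = 11567
octal = 0o26457
hexadecimal = 0x2d2f
print(decimal == octal == hexadecimal)   # True
print(oct(decimal), hex(decimal))        # 0o26457 0x2d2f
```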
-
- Floating point numeric constants match the form:
-
- {_g}
-
- Note that octal and hexadecimal integers are only recognized in
- QTAwk expressions and not in the fields of input records. Only
- decimal and floating point numeric constants are recognized in
- input fields.
-
- String constants are character sequences enclosed in double
- quotes, ". The same escape sequences allowed in regular
- expressions are allowed in string constants.
-
- Character constants are single characters enclosed in single
- quotes, '. The same escape sequences allowed in strings and
- regular expressions are allowed in character constants. All three
- of the following expressions will set the variable, chr_cons, to
- 'A':
-
- chr_cons = 'A';
-
- chr_cons = '\x041';
-
- chr_cons = '\101';
-
- QTAwk will maintain variables set to character constants as
- single characters, but they may be used in arithmetic expressions
- as any other number and QTAwk will automatically convert them to
- their numeric value.
-
- The 'substr' function will return a character constant when the
- requested substring is only a single character wide.
-
- 4.0 STRINGS and REGULAR EXPRESSIONS
-
- Strings and regular expressions in QTAwk are very similar, yet
- very different. Regular expressions can be used wherever strings
- are used and strings may be used in most cases where a regular
- expression may be used.
-
- 4.1 Regular Expression and String Translation
-
- Regular expressions and strings used as regular expressions are
- turned into an internal form for scanning the target string for a
- match. For regular expressions this process of conversion into
- the internal form is done once, when the regular expression is
- first used. For strings the process is done every time the string
- is used as a regular expression. The process of conversion into
- the internal form can be time consuming if done repeatedly. The
- judicious use of strings and regular expressions can give both
- flexibility and speed. By using regular expressions in those
- places where the content of the regular expression will not
- change after the first use, the speed of a single conversion can
- be attained. By using strings in those places where a regular
- expression is called for, e.g., the first argument of the 'gsub'
- function and the right hand expression for the match operators,
- the flexibility of dynamically changing expressions can be gained
- at the expense of speed.
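-
- The same trade-off exists in other languages. The Python sketch
- below contrasts a pattern converted once with a pattern supplied
- as a string at run time. The function names are hypothetical, and
- Python's re module keeps its own cache of recently used patterns,
- so the cost difference there is likely smaller than the one
- described above for QTAwk.
-
```python
import re

# Fixed pattern: converted to internal form once, then reused.
INTEGER = re.compile(r"[-+]?[0-9]+")

def count_fixed(lines):
    """Count lines containing an integer, using the precompiled pattern."""
    return sum(1 for line in lines if INTEGER.search(line))

def count_dynamic(lines, pattern_text):
    """Count matching lines for a pattern supplied as a string at run time."""
    return sum(1 for line in lines if re.search(pattern_text, line))

lines = ["a 1", "b", "c -22"]
print(count_fixed(lines), count_dynamic(lines, "[bc]"))   # 2 2
```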
-
- 4.2 Regular Expressions in Patterns
-
- There are, however, some places where strings cannot be used as
- regular expressions. The most notable of these is as stand-alone
- regular expressions in patterns. Stand-alone regular expressions
- in patterns are a shorthand for:
-
- $0 ~~ /re/
-
- Thus, complex expressions may be built from stand-alone regular
- expressions in patterns. For example, the pattern:
-
- /re1/ && /re2/
-
- will match only those records for which both regular
- expressions re1 and re2 match. Using the logical, relational,
- equality and bit-wise operators, two or more regular expressions
- may be combined in patterns to test records against more than one
- regular expression. The following pattern:
-
- /re1/ != /re2/
-
- will select those records matching re1 and NOT matching re2.
- Records matching re2 and not matching re1 will also be
- selected.
-
- !/re1/
-
- will select those records not matching the regular expression.
- To use regular expressions in this manner the following logical
- truth table may be used for selecting desired records which match
- or do not match desired regular expressions:
-
-      r1    T  T  F  F
-      r2    T  F  T  F
-
-      ==    T  F  F  T
-      !=    F  T  T  F
-      <=    T  F  T  T
-      <     F  F  T  F
-      >     F  T  F  F
-      >=    T  T  F  T
-      &     T  F  F  F
-      |     T  T  T  F
-      @     F  T  T  F
-      &&    T  F  F  F
-      ||    T  T  T  F
-
- Thus, if you wanted to select only those records that matched
- both regular expressions and reject those records that did not
- match both, the following patterns are the only ones to do so:
-
- /re1/ & /re2/
-
- or
-
- /re1/ && /re2/
-
- To select those records that match re1 but do not match re2
- (records matching both are rejected), the following patterns
- could be used:
-
- /re1/ > /re2/
-
- or
-
- /re1/ && !/re2/
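-
- The truth table can be regenerated mechanically. The following
- is a minimal Python sketch, assuming only what the text states:
- each regular expression contributes 1 when it matches and 0 when
- it does not, and the operator is then applied to the two
- integers. The names OPERATORS, COLUMNS and truth_row are
- hypothetical.
-
```python
import operator

# Each operator applied to the 1/0 results of two pattern matches.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<=": operator.le,
    "<": operator.lt,
    ">": operator.gt,
    ">=": operator.ge,
    "&": operator.and_,
    "|": operator.or_,
    "@": operator.xor,          # QTAwk spells bit-wise XOR as '@'
    "&&": lambda a, b: int(bool(a) and bool(b)),
    "||": lambda a, b: int(bool(a) or bool(b)),
}

# Column order used by the table: (r1, r2) = TT, TF, FT, FF.
COLUMNS = [(1, 1), (1, 0), (0, 1), (0, 0)]

def truth_row(symbol):
    """Return the table row for one operator as a string such as 'TFFT'."""
    fn = OPERATORS[symbol]
    return "".join("T" if fn(r1, r2) else "F" for r1, r2 in COLUMNS)

table = {symbol: truth_row(symbol) for symbol in OPERATORS}
```
-
- A row that reads "FTFF", such as the one for '>', is TRUE in
- exactly one column: re1 matches and re2 does not.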
-
- Regular expressions and strings may also be used in 'case'
- statements as described later. However, strings are not
- equivalent to regular expressions in the 'case' statement.
-
- Section 5.0 Pattern-Actions
-
-
- 5.0 PATTERN-ACTIONS
-
- QTAwk recognizes utilities in the following format:
-
- pattern { action }
-
- The opening brace, '{', of the action must be on the same line
- as the pattern. Patterns control the execution of actions. When a
- pattern matches a record, the associated action is executed.
- Patterns consist of valid QTAwk expressions or regular
- expressions. The sequence operator acquires a special meaning in
- pattern expressions and loses its meaning as a sequence operator.
-
- QTAwk follows the C practice in logical operations of
- considering a nonzero numeric value as true and a zero numeric
- value as false. This has been expanded in QTAwk for strings by
- considering the null string as false and any non-null string as
- true. When a logical operation is performed, the operation
- returns an integer value of one (1) for a true condition and an
- integer value of zero (0) for a false condition.
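-
- The truth rule can be written out as a short Python sketch. It
- follows the wording above literally, so the string "0" counts as
- true because it is not the null string; that reading has not
- been checked against QTAwk itself. The function names are
- hypothetical.
-
```python
def logical_value(value):
    """Return 1 (true) or 0 (false) following the rule in the text."""
    if isinstance(value, str):
        return 0 if value == "" else 1      # only the null string is false
    return 0 if value == 0 else 1           # only a zero number is false

def logical_and(left, right):
    """Model '&&': the result is always the integer 1 or 0."""
    return 1 if logical_value(left) and logical_value(right) else 0
```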
-
- 5.1 QTAwk Patterns
-
- QTAwk recognizes the following types of patterns:
-
- 1. { action }
- the pattern is assumed TRUE for every record and the action
- is executed for all records.
-
- 2. expression
- the default action {print;} is executed for every record for
- which expression evaluates to TRUE.
-
- 3. expression { action }
- the actions are executed for each record for which expression
- evaluates to TRUE.
-
- 4. /regular expression/ { action }
- the actions are executed for each record for which the
- regular expression matches a string in the record (TRUE
- condition). The regular expression may be specified
- explicitly as shown or specified by a variable with a regular
- expression value. For example, setting the variable, var_re,
- as:
-
- var_re = /Replacement String/;
-
- and specifying the pattern as:
-
- var_re { action }
-
- would be identical to:
-
- /Replacement String/ { action }
-
- The use of a variable has the advantage of being able to
- change the value of the variable. Changing the variable to
- another regular expression gives a QTAwk utility the capability
- of dynamically changing the patterns recognized.
-
- 5. compound pattern { action }
- the pattern combines regular expressions with logical NOT,
- '!', logical AND, '&&', logical OR, '||', bit-wise AND, '&',
- bit-wise OR, '|', bit-wise XOR, '@', the relational
- operators, '<=', '<', '>', '>=', the equality operators, '=='
- and '!=', and the matching operators, '~~' and '!~'. The
- action is executed for each record for which the compound
- pattern is TRUE.
-
- 6. expression1 , expression2 { action }
- range pattern. The action is executed for the first record
- for which expression1 is TRUE and every record until
- expression2 evaluates TRUE. The range is inclusive. This
- illustrates the special meaning of the sequence operator in
- patterns.
-
- 7. predefined pattern { action }
- the predefined patterns are described next
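-
- The range pattern (type 6 above) is the least obvious of these
- forms. The following is a minimal Python sketch of its
- selection logic, with two predicates standing in for
- expression1 and expression2. It assumes that a record
- satisfying both expressions forms a one-record range, as in
- Awk; the text does not state this. The name range_select is
- hypothetical.
-
```python
def range_select(records, starts, ends):
    """Return records selected by the range pattern `starts , ends`.

    starts and ends are predicates standing in for expression1 and
    expression2. Both end points are included in the selection.
    """
    selected = []
    in_range = False
    for record in records:
        if not in_range and starts(record):
            in_range = True
        if in_range:
            selected.append(record)
            if ends(record):
                in_range = False
    return selected

lines = ["a", "BEGIN", "x", "y", "END", "b"]
picked = range_select(lines, lambda r: r == "BEGIN", lambda r: r == "END")
```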
-
- 5.2 QTAwk Predefined Patterns
-
- QTAwk provides six predefined patterns, all of which (except
- for the 'GROUP' pattern) require actions. The six predefined
- patterns are:
-
- 1. BEGIN { action }
- the action(s) associated with the BEGIN pattern are executed
- once prior to opening the first input file. There may be
- multiple BEGIN { action } combinations. Each action is
- executed in the order in which it is specified.
-
- 2. INITIAL { action }
- or
- INITIALIZE { action }
- the action(s) associated with the INITIAL (INITIALIZE)
- pattern are executed after each input file is opened and
- before the first record is read. There may be multiple
- INITIAL { action } combinations. Each action is executed in
- the order in which it is specified.
-
- 3. GROUP re { action }
- GROUP re { action }
- GROUP re { action }
- GROUP re { action }
- or
- GROUP re
- GROUP re { action }
- GROUP re
- GROUP re { action }
- or
- GROUP re
- GROUP re { action }
- GROUP re
- GROUP re
-
- the pattern associated with the 'GROUP' pattern keyword may
- be a single regular expression constant, a string constant or
- a variable name. All consecutive GROUP/action pairs are
- grouped and the search for the regular expressions optimized
- over the group. Each regular expression of the GROUP may have
- a separate action associated with it. In this case the
- appropriate action is executed if the regular expression is
- matched on the current input record. If the action for a
- regular expression is not given, then the next action
- explicitly given is executed. If no action is given for the
- last regular expression of a GROUP, then the default action
-
- { print ; }
-
- is assigned to it. When one of the regular expressions of
- the GROUP is matched, the built-in variable, NG, is set equal
- to the number of the regular expression. The numbering of the
- regular expressions in the GROUP starts with one, 1.
-
- There may be more than one GROUP of regular expression
- patterns. Any pattern not preceded with the 'GROUP' keyword
- will cause a GROUP to be terminated. The occurrence of the
- 'GROUP' keyword again will start a new GROUP and the
- numbering of the new group starts at one, 1.
-
- GROUP patterns are discussed in more detail later.
-
- 4. NOMATCH { action }
- the action(s) associated with the NOMATCH pattern are
- executed for each record for which no pattern is TRUE. There
- may be multiple NOMATCH { action } combinations. Each action
- is executed in the order in which it is specified.
-
- 5. FINAL { action }
- or
- FINALIZE { action }
- the actions associated with the FINAL (FINALIZE) pattern are
- executed after the last record of each input file has been
- read and before the file is closed. There may be multiple
- FINAL { action } combinations. Each action is executed in the
- order in which it is specified.
-
- 6. END { action }
- the action(s) associated with the END pattern are executed
- once after the last input file has been closed. There may be
- multiple END { action } combinations. Each action is executed
- in the order in which it is specified.
-
- Note that there may be multiple predefined pattern-action pairs
- defined in a QTAwk utility. Each action is executed at the
- appropriate time in the order defined.
-
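- The timing of the predefined patterns can be summarized as a
- trace. The following is a minimal Python sketch, assuming only
- the ordering stated above; GROUP patterns are left out, and the
- name run_utility, the file names and the rule name are
- hypothetical.
-
```python
def run_utility(files, rules):
    """Return the order in which pattern-actions fire for the given input.

    files maps a file name to its list of records; rules is a list of
    (name, predicate) pairs standing in for ordinary pattern-actions.
    """
    trace = ["BEGIN"]                       # once, before the first file is opened
    for name, records in files.items():
        trace.append(f"INITIAL {name}")     # after open, before the first record
        for record in records:
            fired = [rule for rule, matches in rules if matches(record)]
            if fired:
                trace.extend(f"{rule} {record}" for rule in fired)
            else:
                trace.append(f"NOMATCH {record}")   # no pattern was TRUE
        trace.append(f"FINAL {name}")       # after the last record, before close
    trace.append("END")                     # once, after the last file is closed
    return trace

trace = run_utility(
    {"a.txt": ["x1", "y"], "b.txt": ["x2"]},
    [("has_x", lambda r: "x" in r)],
)
```
-
- BEGIN and END bracket the whole run, while INITIAL and FINAL
- bracket each input file.
-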
- Section 6.0 Variables and Arrays
-
-
- 6.0 VARIABLES and ARRAYS
-
- Variables in QTAwk are of four kinds:
- 1. user defined
- 2. built-in
- 3. field
- 4. tag
-
- The names of user defined variables start with an upper or lower
- case character or underscore optionally followed by one or more
- upper or lower case characters, digits or underscores. Most QTAwk
- built-in variables are named with upper case letters and
- underscores (only three are defined with lower case characters).
-
- Variables are defined by using them in expressions. Variables
- have numeric, string or regular expression values or all three
- depending upon the context in which they are used in expressions
- or function calls. Except for variables defined with the 'local'
- keyword, all variables are global in scope. That is they are
- accessible and can be changed anywhere within a QTAwk utility.
- Local variables will be discussed later when the 'local' keyword
- is discussed. All variables are initialized with a zero (0)
- numeric value and the null string value when created by
- reference. The value of the variable is changed with the
- assignment operator, '=' or 'op='.
-
- var1 = 45.87;
-
- var2 = "string value";
-
- var3 = /[\s\t]+[A-Za-z_][A-Za-z0-9_]+/;
-
- var1 has a numeric value of 45.87 from the assignment
- statement. It has a string value of "45.87" and a value as a
- regular expression of /45\.87/. The string and regular expression
- values of var1 may be changed by changing the value of the
- built-in variable "OFMT". The string value of OFMT is used to
- convert numeric values to string and regular expression values.
- OFMT is initialized with a value of "%.6g" and can be changed
- with an assignment statement. Such changes would then affect the
- string and regular expression values of numeric quantities. For
- example, if OFMT is assigned a value of "%u", then the string and
- regular expression values of var1 would become "45" and /45/
- respectively.
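-
- The effect of OFMT on var1 can be imitated in Python. This is a
- sketch only: Python's '%' formatting is assumed to be close
- enough to the conversion QTAwk performs, and re.escape stands in
- for whatever QTAwk does to make the period literal. The function
- names are hypothetical.
-
```python
import re

def string_value(number, ofmt="%.6g"):
    """Format a numeric value the way the text describes OFMT being used."""
    return ofmt % number

def regex_value(number, ofmt="%.6g"):
    """Escape the string value so that '.' matches only a literal period."""
    return re.escape(string_value(number, ofmt))

default_string = string_value(45.87)        # "45.87"
default_regex = regex_value(45.87)          # 45\.87
integer_string = string_value(45.87, "%u")  # "45"
```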
-
- The numeric values of both var2 and var3 are zero (0).
-
- The string value of var3 is "[\s\t]+[A-Za-z_][A-Za-z0-9_]+".
- Note that the tab escape sequence, '\t', is not expanded in
- converting the regular expression to a string. The reverse is not
- true. One difference between strings and regular expressions is
- the time at which escape sequences such as '\t' are translated to
- ASCII hexadecimal characters. For strings, the translation is
- done when the strings are read from the QTAwk utility. For
- regular expressions the escape sequences are translated when the
- regular expression is converted to internal form. For this
- reason, strings used in the place of regular expressions undergo
- a double translation, first when read from the QTAwk utility and
- second when converted into the internal regular expression form.
- The second translation of strings used for regular expressions is
- the reason backslash characters, '\', must be doubled for strings
- used in this manner.
-
- 6.1 QTAwk Arrays
-
- Arrays in QTAwk are a blending of Awk and C. The use of the Awk
- associative arrays is continued and expanded to allow integer
- indices. The use of the comma to delineate multiple array indices
- is discontinued. The comma is now the sequence operator and will
- be so treated in array index expressions. Thus, the reference
-
- A[i,j]
-
- will now reference the element of A subscripted by the current
- value of the variable j. As a consequence of this the Awk
- built-in variable SUBSEP has been dropped. QTAwk allows
- multidimensional arrays referenced in the same manner as C. Thus:
-
- A[i][j]
-
- references the jth column of the ith row of the two-dimensional
- array A. Array subscripts may be strings. Thus:
-
- A[i]["state"]
-
- would reference the "state" element of the ith row of the two
- dimensional array, A. QTAwk allows array indices to be either
- integers or strings. Integer or string indices may be used on the
- same array. Integer indices are stored before string indices,
- integer indices follow the usual numeric ordering and string
- indices follow the ASCII collating sequence. The ordering will be
- apparent in use of the 'in' form of the 'for' statement:
-
- for ( k in A ) statement
-
- k is stepped through the indices of the singly dimensioned
- array, A, in the order stored. Thus, if A has the following
- indices: 1, 3, 5, 7, 8, 9, 10, 12, 14, "county", "state", "zip",
- then k would be stepped through the indices in that order. Note
- that allowing both string and integer indices overcomes the
- disconcerting order of the "stringized numerical" indices of Awk.
- Specifically, index 10 does not precede 2 as "10" does precede
- "2" in Awk. QTAwk still allows the use of numeric strings such as
- "10", "2", etc., but in most cases where such strings would be
- used, the user should be aware that integer indices are now
- available and will prevent the counterintuitive ordering of Awk.
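-
- The ordering rule can be checked with a short Python sketch. It
- assumes only what is stated above: integer indices come first in
- numeric order, followed by string indices in ASCII order. The
- name index_order is hypothetical.
-
```python
def index_order(indices):
    """Sort indices as described: integers first in numeric order, then strings."""
    integers = sorted(i for i in indices if isinstance(i, int))
    strings = sorted(s for s in indices if isinstance(s, str))
    return integers + strings

mixed = index_order(["zip", 10, "state", 2, "county", 1])
as_strings = sorted(["10", "2", "1"])   # the Awk-style "stringized" ordering
```
-
- With integer indices, 10 follows 2; with the numeric strings,
- "10" sorts ahead of "2".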
-
- Note that only indexed elements of an array actually referenced
- exist. Thus, for the array A above, the elements for indices 2,
- 4, 6, 11 and 13 do not exist since they have not been referenced.
- This follows the general philosophy that a variable does not
- exist until it has been referenced.
-
- 6.2 QTAwk Arrays in Arithmetic Expressions
-
- When arrays are used in arithmetic expressions in QTAwk, the
- entire array is operated on or assigned. For example, if the
- variable 'B' is a 3x3 array with the following values:
-
- B[1][1] = 11, B[1][2] = 12, B[1][3] = 13
- B[2][1] = 21, B[2][2] = 22, B[2][3] = 23
- B[3][1] = 31, B[3][2] = 32, B[3][3] = 33
-
- Assigning B to the variable 'A':
-
- A = B
-
- will duplicate the entire array into A.
-
- A[1][1] = 11, A[1][2] = 12, A[1][3] = 13
- A[2][1] = 21, A[2][2] = 22, A[2][3] = 23
- A[3][1] = 31, A[3][2] = 32, A[3][3] = 33
-
- If A and B are array variables and C is a scalar (non-array)
- variable, then the following expression forms for the assignment
- operators, 'op=', are legal:
-
- 1. A = B
- assign one array to a second. The original elements of array
- A are deleted and the elements of B duplicated into A.
-
- 2. C = B
- assigning an array to a variable currently a scalar. Again
- the elements of B are duplicated into elements of C which
- becomes an array.
-
- 3. A = C
- assigning a scalar to a variable which is an array. The
- elements of the array are discarded and the variable becomes
- a scalar.
-
- 4. A = B[i]...[j]
- assigning an array element to a variable which is currently
- an array. Since the element of an array is a scalar, this
- case is essentially the same as the immediately previous
- case.
-
- 5. A[i]...[j] = B[k]...[l]
- since array elements are scalars, this is the usual scalar
- assignment case.
-
- 6. A op= C
- the 'op=' operator is applied to every element of A. Thus, A
- += 2, would add '2' to every element of A.
-
- 7. A op= B
- the 'op=' operator is applied to every element of A for which
- an element exists in B with identical indices. No elements
- are created in A to match elements of B with indices
- different from any element of A. Thus, the sequence of
- statements:
-
- A = B;
- A += B;
-
- would leave every element of A with twice the value of the
- corresponding element of B.
-
- There are three cases of using arrays with the assignment
- operators that are not legal and for which QTAwk will issue an
- error message at runtime.
-
- 1. A[i]...[j] = B
- 2. A[i]...[j] op= B
- 3. C op= B
-
- These are all variations on the same expression. In the first
- case, the expression is attempting to assign an array to a
- scalar, an array element. Since an array element cannot be
- further expanded into an array, the assignment is not allowed. In
- the second and third cases, the expressions are attempting to
- operate on a scalar with an array and assign the result to the
- scalar. Both of these expressions fail for the same reason, an
- array cannot operate on a scalar. It is possible for a single
- value, a scalar, to operate on every element of an array, but the
- reverse, having each element of the array operate on the scalar
- is not permitted.
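-
- The legal and illegal 'op=' forms can be modelled in Python with
- dictionaries standing in for singly dimensioned arrays. This is
- a sketch of the rules as stated above, not of QTAwk's internals;
- the name op_assign and the choice of TypeError for the runtime
- error are assumptions.
-
```python
import operator

def op_assign(target, op, operand):
    """Model `target op= operand` for flat arrays held as dicts.

    A scalar target with an array operand is rejected, as in the text.
    """
    target_is_array = isinstance(target, dict)
    operand_is_array = isinstance(operand, dict)
    if not target_is_array:
        if operand_is_array:
            raise TypeError("an array cannot operate on a scalar")
        return op(target, operand)
    if operand_is_array:
        # Only indices present in both arrays are updated; none are created.
        return {k: op(v, operand[k]) if k in operand else v
                for k, v in target.items()}
    # A scalar operand is applied to every element.
    return {k: op(v, operand) for k, v in target.items()}

plus_scalar = op_assign({1: 10, 2: 20}, operator.add, 2)
plus_array = op_assign({1: 10, 2: 20}, operator.add, {2: 5, 3: 7})
```
-
- In the second call, index 3 exists only in the operand, so no
- element is created for it in the result.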
-
- The reasoning prohibiting the second and third case above is
- extended to all binary expressions involving arrays in QTAwk. In
- general, arrays are allowed in expressions with binary arithmetic
- operators:
-
- ~ ^ * / % + - << >> & @ |
-
- as well as string concatenation:
-
- A B (equivalent to A ∩ B)
-
- In such expressions, arrays are allowed in the following forms:
-
- 1. A op B
- 2. A op C
-
- But not as
-
- C op A
-
- It could be argued that expressions such as,
-
- 2 + A
-
- should be allowed since '+' is commutative and the expression
- could be written equivalently as,
-
- A + 2
-
- This is true for addition, but not for all of the binary
- arithmetic operators. For example, the division operator is not
- commutative.
-
- 2 / A
-
- could not be written equivalently as:
-
- A / 2
-
- For this reason, QTAwk does not allow any array expressions of
- the form:
-
- scalar op array
-
- The unary arithmetic operators may also be used to operate on
- entire arrays:
-
- ++A (pre-fix increment operator)
-
- --A (pre-fix decrement operator)
-
- A++ (post-fix increment operator)
-
- A-- (post-fix decrement operator)
-
- -A (Unary minus operator)
-
- +A (Unary plus operator)
-
- ~A (Unary one's complement operator)
-
- An expression such as:
-
- A + B
-
- will result in an array with element indices identical to those
- of A, and with values which are the sum of the elements of A and
- B, which have identical indices. If A has an element for which B
- does not have a corresponding element, the resultant element
- value is equal to the A element value. Elements of B which have
- no corresponding element in A are not represented in the resultant
- array.
-
- An array with elements of double the value of the elements of B
- can be created as:
-
- A = B;
- D = A + B;
-
- or as
-
- D = B + B;
-
- or as
-
- D = B * 2;
-
- Any of the above sequences of statements will result in an array,
- D, with elements with indices identical to B, and with double the
- element values. The array A could be made an array with elements
- twice the element values of B with the statements:
-
- A = B;
- A *= 2;
-
- Arrays may be used in expressions with arithmetic operators and
- the whole array will be utilized in the expression. This does not
- extend to the logical operators:
-
- ! < <= >= > == != ~~ !~ && ||
-
- Using an array with a logical operator will result in the first
- element in the array only being used in the expression.
-
- Section 7.0 Group Patterns
-
-
- 7.0 GROUP PATTERNS
-
- GROUP patterns follow the syntax:
-
- GROUP re1 { optional action }
- GROUP re2 { optional action }
- GROUP re3 { optional action }
- GROUP re4 { optional action }
- GROUP re5 { optional action }
- GROUP re6 { optional action }
-
- Actions are optional with any particular regular expression in
- the group. If no action is given, the next action specified in
- the group is executed. If no action is specified for the last
- regular expression in a group, the default action, "{print;}" is
- assigned to it.
-
- Any utility may have more than one GROUP of patterns. A group is
- terminated by any pattern not starting with the 'GROUP' keyword.
-
- 7.1 GROUP Pattern Advantage
-
- GROUP patterns have two distinct advantages in QTAwk:
-
- 1. the regular expressions contained in the GROUP are optimized
- to decrease search time, and
- 2. input records are searched once for all regular expressions
- in a GROUP. If the regular expressions were organized as
- individual pattern-actions, each record is searched
- separately for each regular expression.
-
- For utilities containing many regular expression patterns for
- which to search, a program organized into one or more GROUPs
- can be many times faster than a utility organized as ordinary
- pattern/action pairs. For example, the QTAwk utility ansicstd.exp
- shown in Appendix III searches a C source file listing for ANSI C
- Standard defined names. The utility organizes the search into a
- single GROUP and will search a source file approximately 6 times
- faster than the same utility organized as separate pattern/action
- pairs without the use of a GROUP.
-
- 7.2 GROUP Pattern Disadvantage
-
- GROUP patterns have one disadvantage compared to ordinary
- pattern/action pairs. QTAwk will find only one of the regular
- expressions in a GROUP. A set of GROUP patterns:
-
- GROUP re1 { action1; }
- GROUP re2 { action2; }
- GROUP re3 { action3; }
-
- is similar in execution to:
-
- re1 { action1; next; }
- re2 { action2; next; }
- re3 { action3; next; }
-
- If more than one regular expression in a group will match a
- given string in the input record, the regular expression listed
- first in the GROUP will be matched and the appropriate action
- executed. If all regular expression patterns in a GROUP must be
- found in input records, then separate pattern-action pairs must
- be used.
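-
- The first-match behavior and the sharing of actions within a
- GROUP can be sketched in Python. The sketch follows the "similar
- in execution" rewrite given above: patterns are tried in the
- order written and the first one that matches wins. It does not
- reproduce the optimized single scan, and the name run_group and
- the action names are hypothetical.
-
```python
import re

def run_group(group, record):
    """Model one GROUP for one record.

    group is a list of (pattern_text, action) pairs in the order written;
    action is None where the GROUP entry has no action of its own.
    Returns (NG, action), or (0, None) when nothing in the group matches.
    """
    for number, (pattern_text, _own_action) in enumerate(group, start=1):
        if re.search(pattern_text, record):
            # Use this entry's action, or the next one explicitly given.
            for _later_pattern, action in group[number - 1:]:
                if action is not None:
                    return number, action
            return number, "print"      # default action for a trailing entry
    return 0, None

group = [
    ("error", None),
    ("fail", "count_problem"),
    ("warn", "count_warning"),
    ("note", None),
]
first = run_group(group, "disk error")
second = run_group(group, "warn: fail")
last = run_group(group, "a note")
none = run_group(group, "ok")
```
-
- The record "warn: fail" is claimed by the second entry because
- it is listed before the third, so the third action never runs
- for that record.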
-
- 7.3 GROUP Pattern Regular Expressions
-
- The regular expressions associated with the GROUP pattern can be
- either regular expression constants, e.g.,
-
- GROUP /regular expression constant/
-
- a string constant, e.g.,
-
- GROUP "string constant"
-
- or a variable, e.g.,
-
- GROUP var_name
-
-
- GROUP patterns are converted into an internal form for regular
- expressions only once, when the pattern is first used to scan an
- input line. Any variables in a GROUP pattern will be evaluated,
- converted to string form and interpreted as a regular expression.
-
- Section 8.0 Statements
-
-
- 8.0 STATEMENTS
-
- QTAwk has departed from Awk by using the C convention of using
- the semi-colon, ';', as a statement terminator. QTAwk treats
- newline characters as white space and ignores them, except for
- terminating comments. Comments are introduced by the symbol, '#',
- and continue to the next newline character. Thus the Awk practice
- of letting newlines terminate some statements can no longer be
- used. The Awk rules for terminating statements with the newline
- except under some conditions can now be forgotten. In QTAwk,
- terminate all statements with a semi-colon, ';'.
-
- QTAwk provides braces for grouping statements to form compound
- statements. Various keywords are available for controlling the
- logical flow of statement execution and for looping over
- statements multiple times.
-
- 8.1 QTAwk Keywords
-
- The QTAwk keywords are:
-
- 1. break
- 2. case
- 3. continue
- 4. cycle
- 5. default
- 6. delete
- 7. deletea
- 8. do
- 9. else
- 10. endfile
- 11. exit
- 12. for
- 13. if
- 14. in
- 15. local
- 16. next
- 17. return
- 18. switch
- 19. while
-
- The keywords 'cycle', 'deletea', 'local' and 'endfile' are new
- to QTAwk. The keywords 'switch', 'case' and 'default' have been
- appropriated from C with expanded functionality over C.
-
- 8.2 'cycle' and 'next'
-
- The 'cycle' and 'next' statements allow the user to control the
- execution of the QTAwk outer loop which reads records from the
- current input file and compares them against the patterns. Both
- statements restart the pattern matching.
-
- The 'next' statement causes the next input record to be read
- before restarting the outer pattern matching loop with the first
- pattern-action pair.
-
- The 'cycle' statement may use the current input record or the
- next input record for restarting the outer pattern matching loop.
- As each input record is read from the current input file, the
- built-in variable CYCLE_COUNT is set to one. The 'cycle'
- statement increments the numeric value of CYCLE_COUNT by one and
- compares the new value to the numeric value of the built-in
- MAX_CYCLE variable. One of two actions is taken depending on the
- result of this comparison:
-
- 1. If CYCLE_COUNT is greater than MAX_CYCLE, then the next
- input record is read, setting NR, FNR, $0, NF and the record
- fields $1, $2, ... $NF, before restarting the outer pattern
- matching loop. This is identical to the action of the 'next'
- keyword.
-
- 2. If CYCLE_COUNT is less than or equal to MAX_CYCLE, the
- current values of NR, FNR, $0, NF and the record fields are
- utilized when restarting the outer pattern matching loop.
-
- The default value of MAX_CYCLE is 100. Both CYCLE_COUNT and
- MAX_CYCLE are built-in variables and may be set by the user's
- utility. Setting MAX_CYCLE is useful to control the number of
- iterations possible on a record. Setting MAX_CYCLE to 1 would
- make the 'cycle' and 'next' keywords identical.
-
- If the value of CYCLE_COUNT is set by the user's utility, care
- should be taken to prevent the possibility of the utility
- entering a loop from which it cannot exit.
-
- The 'cycle' statement is useful when it is necessary to process
- the current input record through the outer pattern match loop
- more than once. The following utility is a trivial example of one
- such use. This utility will print each record with the record
- number multiple times. The number of times is determined by the
- value assigned MAX_CYCLE in the 'BEGIN' action.
-
- BEGIN {
- MAX_CYCLE = 10;
- }
-
- {
- print FNR,$0;
- cycle;
- }
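-
- The counting performed by 'cycle' in this utility can be traced
- with a Python sketch. It assumes only the rules stated above:
- CYCLE_COUNT starts at one for each record, 'cycle' increments
- it, and a new record is read once it exceeds MAX_CYCLE. The
- single space between FNR and the record is also an assumption,
- and the name run_cycle_utility is hypothetical.
-
```python
def run_cycle_utility(records, max_cycle=100):
    """Model the utility above: print the record, then execute 'cycle'."""
    output = []
    for fnr, record in enumerate(records, start=1):
        cycle_count = 1                     # reset as each record is read
        while True:
            output.append(f"{fnr} {record}")    # print FNR,$0;
            cycle_count += 1                    # cycle;
            if cycle_count > max_cycle:
                break                       # same as 'next': read a new record
            # otherwise rerun the patterns on the current record
    return output

ten_each = run_cycle_utility(["a", "b"], max_cycle=10)
once_each = run_cycle_utility(["a", "b"], max_cycle=1)
```
-
- With MAX_CYCLE set to 10 each record is printed ten times; with
- MAX_CYCLE set to 1 the loop behaves as if 'next' had been used.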
-
- 8.3 'delete' and 'deletea'
-
- The 'delete' and 'deletea' statements allow the user to delete
- individual elements of an array or an entire array respectively.
- The forms of the 'delete' and 'deletea' statements are:
-
- delete A[expr_list];
-
- and
-
- deletea A;
-
- The first form will delete the element of array A referenced by
- the subscript determined by 'expr_list'. The second form will
- delete the entire array. Note that for singly dimensioned arrays,
- the 'deletea' statement is equivalent to the statement:
-
- for ( j in A ) delete A[j];
-
- The use of the 'deletea' statement is encouraged for simplicity
- and speed of execution. The 'delete' statement may be used for
- arrays of any dimension. However, for arrays with dimension
- greater than 2, the elements of the array are not deleted, but
- simply initialized to zero and the null string. This behavior has
- to do with the structure of arrays and the 'holes' which could be
- left by deleting elements. For singly dimensioned arrays, there
- is no problem, since there can be no 'hole' left by deleting an
- element. For example consider the singly dimensioned array:
-
- A[1] A[2] A[3] A[4] A[5] A[6] A[7] A[8] A[9]
-
- If the array element A[5] is deleted
-
- A[1] A[2] A[3] A[4] ____ A[6] A[7] A[8] A[9]
-
- Then the remaining elements 'shift' to fill the 'hole'.
-
- A[1] A[2] A[3] A[4] A[6] A[7] A[8] A[9]
-
- For two-dimensional arrays a complication arises in trying to
- fill the 'hole' left by deleting an array element.
-
- A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
- A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
- A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
- A[4][1] A[4][2] A[4][3] A[4][4] A[4][5] A[4][6]
- A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
- A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]
-
- If element A[4][4] is deleted, then we have the 'hole':
-
- A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
- A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
- A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
- A[4][1] A[4][2] A[4][3] _______ A[4][5] A[4][6]
- A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
- A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]
-
- In trying to fill the 'hole', we have a choice of shifting the
- elements below the deleted element up to fill the 'hole', column
- priority, or shifting the elements to the right of the deleted
- element to fill the 'hole', row priority. In QTAwk, row priority
- is used in filling the 'hole':
-
- A[1][1] A[1][2] A[1][3] A[1][4] A[1][5] A[1][6]
- A[2][1] A[2][2] A[2][3] A[2][4] A[2][5] A[2][6]
- A[3][1] A[3][2] A[3][3] A[3][4] A[3][5] A[3][6]
- A[4][1] A[4][2] A[4][3] A[4][5] A[4][6]
- A[5][1] A[5][2] A[5][3] A[5][4] A[5][5] A[5][6]
- A[6][1] A[6][2] A[6][3] A[6][4] A[6][5] A[6][6]
-
- For arrays of higher dimensions the situation is even more
- complicated. Not only do elements have to be "shifted", but
- elements in the array will have to be discarded to do so. For
- example, if A is a 3x3x3 array and element A[2][2][2] is deleted,
- then element A[2][2][3], if it existed, would also be deleted by
- shifting other elements to fill the 'hole'. QTAwk will in this
- case initialize the element A[2][2][2] to zero and the null
- string rather than delete the element and lose other elements.
- Thus, the 'delete' statement only truly deletes elements for one
- and two dimensional arrays.
-
- The 'deletea' statement, however, works on arrays of any
- dimension. For multi-dimensional arrays, the 'deletea' would be
- equivalent to nested 'for' statements. For example, if the
- 'delete' statement truly deleted elements of a three dimensional
- array, then the 'deletea' statement could be imagined as
- equivalent to:
-
- for ( i in A )
- for ( j in A[i] )
- for ( k in A[i][j] ) delete A[i][j][k];
-
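- The difference between the two statements can be modelled in
- Python with nested dictionaries. This is a sketch of the
- behavior described above, not of QTAwk's storage: the dimension
- is inferred from the number of subscripts given, the empty
- string stands in for "zero and the null string", and the
- function names are hypothetical.
-
```python
def delete_element(array, *subscripts):
    """Model 'delete A[i]...[k]' on nested dicts.

    One and two subscripts remove the element; three or more reset it to
    the empty value instead, as the text describes.
    """
    parent = array
    for key in subscripts[:-1]:
        parent = parent[key]
    if len(subscripts) <= 2:
        del parent[subscripts[-1]]
    else:
        parent[subscripts[-1]] = ""     # stands in for zero and the null string

def delete_array(array):
    """Model 'deletea A': every element goes, whatever the dimension."""
    array.clear()

table = {1: {1: "a", 2: "b"}}
delete_element(table, 1, 1)

cube = {2: {2: {2: "x", 3: "y"}}}
delete_element(cube, 2, 2, 2)
cube_after_delete = {2: {2: dict(cube[2][2])}}
delete_array(cube)
```
-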
- 8.4 'if'/'else'
-
- The 'if' and 'else' keywords provide for executing one of
- possibly two statements conditioned upon the TRUE or FALSE value
- of an expr_list. The form of the 'if'/'else' statement is:
-
- if ( expr_list ) statement1
-
- or
-
- if ( expr_list ) statement1 else statement2
-
- If expr_list when evaluated, produces a TRUE value then
- statement1 is executed. If the expr_list produces a FALSE value,
- then for the second form, statement2 is executed.
-
- 8.5 'in'
-
- The 'in' keyword allows the user to test membership in arrays in
- expressions. The form of an expression containing the 'in'
- keyword is:
-
- expression in A
-
- if the value of 'expression' is a current subscript value of
- the array A, the expression yields a TRUE value, otherwise FALSE.
- For multidimensional arrays, the statement:
-
- expression in A[i]
-
- would test if 'expression' is a valid column subscript in the
- ith row of array A. Note that A may have more than two dimensions
- for this statement to be correct. The next higher dimension than
- stated in the expression is always tested.
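-
- Python's own 'in' operator on dictionaries behaves much like the
- test described here, so a nested dictionary gives a convenient
- sketch. The array contents below are hypothetical, and the
- comparison with QTAwk is an analogy only.
-
```python
# A two-dimensional array held as a dictionary of rows.
A = {
    1: {"state": "MD", "zip": "20878"},
    2: {"state": "VA"},
}

row_exists = 2 in A                 # tests the first dimension
has_zip_row1 = "zip" in A[1]        # tests the second dimension of row 1
has_zip_row2 = "zip" in A[2]        # row 2 has no "zip" element
```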
-
- 8.6 'switch', 'case', 'default'
-
- QTAwk includes an expanded form of the C 'switch'/'case'
- statements. In C, the 'switch'/'case' statements must be of the
- form:
-
- switch ( expr_list ) {
- case constant1: statement
- case constant2: statement
- case constant3: statement
- case constant4: statement
- default: statement
- }
-
- The expr_list of the 'switch' statement must evaluate to an
- integral value and 'constant1', 'constant2', 'constant3', and
- 'constant4', must be compile-time integral constant values. In
- QTAwk, the 'case' statement may contain any valid QTAwk
- expression or expr_list:
-
- switch ( expr_list ) {
- case expr_list1: statement
- case expr_list2: statement
- case expr_list3: statement
- case expr_list4: statement
- default: statement
- }
-
- The expr_lists of the case statements are evaluated in turn at
- execution time. The resultant value is checked against the value
- of the expr_list of the 'switch' statement using the following
- logic.
-
- if ( cexpr is a regular expression ) sexpr ~~ cexpr;
- else sexpr == cexpr;
-
- where cexpr is the value of the case expr_list and sexpr is the
- value of the 'switch' statement expr_list. Thus if cexpr is a
- regular expression, a match operation is performed. If cexpr is a
- string, a string comparison is performed. If cexpr is numeric,
- a numerical comparison is performed. It is possible to have case
- statements with differing types of expr_list values in the same
- 'switch' statement and the proper comparison is made.
-
- Once a TRUE value is returned by a case statement comparison,
- the execution falls through from 'case' to 'case' with no further
- comparisons made. The fall through of execution is broken by the
- use of the 'break' statement as in C.
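-
- The selection and fall-through rules above can be modelled
- roughly as follows. This is a Python sketch, not QTAwk code: the
- helper names case_matches and run_switch are hypothetical, each
- case is reduced to a label plus a flag saying whether it ends
- with 'break', and QTAwk's own string/numeric conversion rules are
- not reproduced.
-
```python
import re

def case_matches(sexpr, cexpr):
    # A regular-expression case value triggers a match operation;
    # any other value is compared for equality.
    if isinstance(cexpr, re.Pattern):
        return cexpr.search(str(sexpr)) is not None
    return sexpr == cexpr

def run_switch(sexpr, cases):
    """cases: ordered list of (cexpr, label, ends_with_break)."""
    executed = []
    matched = False
    for cexpr, label, ends_with_break in cases:
        # Once one case has matched, later case values are not evaluated.
        if matched or case_matches(sexpr, cexpr):
            matched = True
            executed.append(label)
            if ends_with_break:
                break
    return executed

cases = [
    (re.compile(r"^[0-9]+$"), "digits", False),  # falls through
    ("stop", "stop word", True),                 # ends with 'break'
    (3.5, "number", True),
]
print(run_switch("123", cases))
print(run_switch("stop", cases))
print(run_switch(3.5, cases))
print(run_switch("other", cases))
```
-
- Because the first case has no 'break', a switch value of "123"
- runs both the "digits" and "stop word" actions in this model.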
-
- Note that the expr_list of a 'case' statement is evaluated at
- execution time and it is possible for some 'case' expr_lists to
- never be evaluated. Thus side effects from the evaluation of
- 'case' expr_lists should not be relied upon. This is particularly
- true where execution falls through from one 'case' statement to
- the next.
-
- If the expr_list of a 'case' statement evaluates to a regular
- expression, then two built-in variables are set when the match
- operation is performed: CLENGTH and CSTART. CLENGTH is set to the
- length of the matching string found (or zero) and CSTART is set
- to the starting position of the matching string found (or zero).
- CLENGTH and CSTART are completely analogous to RLENGTH and RSTART
- set for the 'match' function and MLENGTH and MSTART for the match
- operators, '~~' and '!~'.
-
- The 'default' keyword is provided in analogy to C. The
- statements following the 'default' statement are executed if the
- 'switch' expr_list matches no 'case' expr_list. The 'default'
- statement may be combined with other 'case' statements. It need
- not be the last statement as shown.
-
- 8.7 Loops
-
- QTAwk has four forms of loop control statements:
-
- 1. for ( exp1 ; exp2 ; exp3 ) stmt
- 2. for ( var in array ) stmt
- 3. while ( exp ) stmt
- 4. do stmt while ( exp );
-
- 8.8 'while'
-
- The 'while' statement has the form:
-
-
- while ( expr_list ) stmt
-
- The expr_list is evaluated and, if TRUE, 'stmt' is executed and
- expr_list is re-evaluated. This cycle continues until expr_list
- evaluates to FALSE, at which point the cycle is terminated and
- execution of the utility resumes after 'stmt'.
-
- 8.9 'for'
-
- The 'for' statement has two forms:
-
- 1. for ( exp1 ; exp2 ; exp3 ) stmt
- 2. for ( var in array ) stmt
-
- In the first form, where exp1, exp2 and exp3 may each be an
- expr_list, the following sequence of operations is performed:
- 1. The expressions in exp1 are evaluated,
- 2. The expressions in exp2 are evaluated,
- 3. The action taken is dependent upon whether the resultant
- value of exp2 is true or false:
- a) TRUE
- 1: Execute 'stmt', which may be a compound statement.
- 2: Evaluate the expressions in exp3.
- 3: Control returns to item 2. above.
- b) FALSE - terminate loop
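-
- The sequence above can be spelled out with an explicit loop. The
- Python sketch below is only an illustration of the evaluation
- order for a loop equivalent to "for ( i = 0 ; i < 3 ; i++ ) stmt";
- the variable names are hypothetical.
-
```python
visited = []
i = 0                    # step 1: evaluate exp1 once
while i < 3:             # steps 2 and 3: evaluate exp2, stop when FALSE
    visited.append(i)    # 3a.1: execute stmt
    i += 1               # 3a.2: evaluate exp3, then return to step 2
print(visited)
```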
-
-
- The second form may also be used for multi-dimensional arrays:
-
- for ( var in array[s_expr_list]...[s_expr_list] ) stmt
-
- For each subscript in the next higher index level in the array
- reference, var is set to the index value and 'stmt' is executed.
- 'stmt' may be a compound statement. For a multidimensional array,
- the second form may be used to loop sequentially through the
- indices of the next higher index level. Thus for a two
- dimensional array:
-
- for ( i in A )
- for ( j in A[i] )
-
- will loop through the indices in the array in row order.
-
- 8.10 'do'/'while'
-
- The form of the 'do'/'while' statement is:
-
- do stmt while ( expr_list );
-
- 'stmt' is executed and then expr_list is evaluated. If the result
- is TRUE, 'stmt' is executed again; otherwise the loop is
- terminated. Note that 'stmt' is executed at least once.
-
- 8.11 'local'
-
- The 'local' keyword is used to define variables within a
- compound statement that are local to the compound statement and
- that disappear when the statement is exited. The 'local' keyword
- may be used within any compound statement, but is especially
- useful in user-defined functions as described later. Variables
- defined with the 'local' keyword may be assigned an initial value
- in the statement and multiple variables may be defined with a
- single statement. If a variable is not assigned an initial value,
- it is initialized to zero and the null string just as global
- variables are initialized.
-
- Thus:
-
- local i, j = 12, k = substr(str,5);
-
- will define three variables local to the enclosing compound
- statement:
- 1. i initialized to zero/null string,
- 2. j initialized to 12, and
- 3. k initialized to a substring of the variable 'str'
-
- Local variables initialized explicitly in 'local' statements may
- be initialized to constants, the values of global variables,
- values returned by built-in functions, values returned by
- user-defined functions or previously defined local variables. If
- the value is set to that of a previously defined local variable,
- the variable may not be defined in the same 'local' statement.
- Thus:
-
- local k = 5;
- local j = k;
-
- is correct, but
-
- local k = 5, j = k;
-
- is not. In the latter case QTAwk will quietly assume that the k,
- to which j is assigned, is a global variable.
-
- 8.12 'endfile'
-
- The 'endfile' keyword causes the utility to behave as if the end
- of the current input file has been reached. Any 'FINAL' actions
- are executed. If any input files remain to be processed from the
- command line, the next is opened for processing. If no further
- input files remain to be processed, any 'END' actions are
- executed.
-
- 8.13 'break'
-
- This keyword will terminate the execution of the enclosing
- 'while', 'for', 'do'/'while' loop or break execution in cascaded
- 'case' statements.
-
- 8.14 'continue'
-
- This keyword will cause execution to jump to just after the last
- statement in the loop body and execute the next iteration of the
- enclosing loop. The loop may be any 'for', 'while' or
- 'do'/'while'.
-
- 8.15 'exit opt_expr_list'
-
- This statement causes the utility to behave as if the end of the
- current input file had been reached. Any further input files
- specified are ignored. If there are any FINAL or END actions,
- they are executed.
-
- If encountered in a FINAL action, the action is terminated, any
- further input files are ignored and any END actions are executed.
-
- If encountered in an END action, the execution of the action is
- terminated and utility execution is terminated.
-
- The optional expr_list is evaluated and the resultant value
- returned to DOS upon termination by QTAwk as the exit status. If
- no expr_list is present, or no 'exit' statement encountered,
- QTAwk returns a value of zero for the exit status.
-
- 8.16 'return opt_expr_list'
-
- This statement will cause execution to return from a user
- defined function. If the optional expr_list is present, it is
- evaluated and the resultant value returned as the functional
- value.
-
- 9.0 BUILT-IN FUNCTIONS
-
- QTAwk offers a rich set of built-in arithmetic, string, I/O,
- array and system functions. The array of built-in functions
- available has been extended over that available with Awk. The I/O
- functions have been changed to match the functional syntax of all
- other built-in and user defined functions.
-
- 9.1 Arithmetic Functions
-
- QTAwk offers the following built-in arithmetic functions. Those
- marked with an asterisk, '*', are new to QTAwk:
-
- 1. acos(x) ==> return arc-cosine of x (refer to the DEGREES
- built-in variable).
-
- 2. asin(x) ==> return arc-sine of x (refer to the DEGREES
- built-in variable).
-
- 3. atan2(y,x) ==> return arc-tangent of y/x, -π to π (refer to
- the DEGREES built-in variable).
-
- 4. cos(x) ==> return cosine of x (refer to the DEGREES built-in
- variable).
-
- 5. * cosh(x) ==> return hyperbolic cosine of x
-
- 6. exp(x) ==> return e^x
-
- 7. * fract(x) ==> return fractional portion of x
-
- 8. int(x) ==> return integer portion of x
-
- 9. log(x) ==> return natural (base e) logarithm of x
-
- 10. * log10(x) ==> return base 10 logarithm of x
-
- 11. * pi() ==> return pi
-
- 12. * pi ==> return pi
-
- 13. rand() ==> return random number r, 0 <= r < 1
-
- 14. sin(x) ==> return sine of x (refer to the DEGREES built-in
- variable).
-
- 15. * sinh(x) ==> return hyperbolic sine of x
-
- 16. sqrt(x) ==> return square root of x
-
- 17. srand(x) ==> set x as new seed for rand()
-
- 18. srand() ==> set current system time as new seed for rand()
-
- 9.2 String Functions
-
- QTAwk offers the following built-in string handling functions.
- Those marked with an asterisk, '*', are new to QTAwk:
-
- 1. * center(s,w) ==> return string s centered in w blank
- characters.
-
- 2. * center(s,w,c) ==> return string s centered in w 'c'
- characters.
-
- 3. * copies(s,n) ==> return n copies of string s.
-
- 4. * deletec(s,p,n) ==> return string s with n characters
- deleted starting at position p.
-
- 5. gsub(r,s) ==> substitute s for strings matched by regular
- expression, r, globally in $0, return number of substitutions
- made.
-
- 6. gsub(r,s,t) ==> substitute s for strings matched by regular
- expression, r, globally in string t, return number of
- substitutions made.
-
- 7. index(s1,s2) ==> return position of string s2 in string s1.
-
- 8. * insert(s1,s2,p) ==> return string formed by inserting
- string s2 into string s1 starting at position p.
-
- 9. * justify(a,n,w) ==> return string w characters long formed
- by justifying n elements of array a padded with blanks. If n
- elements of array a with at least one blank between elements
- would exceed width w, then the number of elements justified
- is reduced to fit in the length w.
-
- 10. * justify(a,n,w,c) ==> return string w characters long
- formed by justifying n elements of array a padded with
- character 'c'. If n elements of array a with at least one 'c'
- character between elements would exceed width w, then the
- number of elements justified is reduced to fit in the length
- w.
-
- 11. length ==> return number of characters in $0.
-
- 12. length() ==> return number of characters in $0.
-
- 13. length(s) ==> return number of characters in string s.
-
- 14. match(s,r) ==> return true/false if string s contains a
- substring matched by r. Set RLENGTH to length of substring
- matched (or zero) and RSTART to start position of substring
- matched (or zero).
-
- 15. * overlay(s1,s2,p) ==> return string formed by overlaying
- string s2 on string s1 starting at position p. May extend
- length of s1. If p > length(s1), s1 padded with blanks to
- appropriate length.
-
- 16. * remove(s,c) ==> return string formed by removing all 'c'
- characters from string s
-
- 17. * replace(s) ==> return string formed by replacing all
- repeated expressions, {n1,n2}, and named expressions, {name},
- in string s. Same operation performed for strings used as
- regular expressions.
-
- 18. * sdate(fmt) ==> return current system date formatted
- according to integer value of fmt.
- mname == full month name
- amname == abbreviated month name (3 characters)
- wkday == full day name
- aday == abbreviated day name (3 characters)
- integer value of fmt:
- 0 - mm/dd/yy
- 1 - mm/dd/yyyy
- 2 - dd/mm/yy
- 3 - dd/mm/yyyy
- 4 - amname dd, yyyy
- 5 - mname dd, yyyy
- 6 - aday mm/dd/yyyy
- 7 - wkday mm/dd/yyyy
- 8 - aday, amname dd, yyyy
- 9 - wkday, mname dd, yyyy
- 10 - return amname
- 11 - return month name
- 12 - return aday
- 13 - return wkday
- 14 - return current system date in form yymmdd for sorting
- 15 - return number of days this century
- 16 - return number of days this year
- a value greater than 16, gives a run-time error and QTAwk
- halts execution.
-
- 19. split(s,a) ==> split string s into array a on field
- separator FS. Return number of fields. The same rules applied
- to FS for splitting the current input record apply to
- splitting s into a.
-
- 20. split(s,a,fs) ==> split string s into array a on field
- separator fs. Return number of fields. The same rules applied
- to FS for splitting the current input record apply to the use
- of fs in splitting s into a.
-
- 21. * srange(c1,c2) ==> return string formed by concatenating
- characters from c1 to c2 inclusive. If c2 < c1, the null string
- is returned. Thus,
-
- srange('a','k') == "abcdefghijk".
-
- 22. * srev(s) ==> return string formed by reversing string s.
-
- srev(srange('a','k')) == "kjihgfedcba".
-
- 23. * stime(fmt) ==> return current system time formatted
- according to the integer value of fmt.
- 0 - hh:mm:ss:00, 0 <= hh <= 24
- 1 - hh:mm:ss, 0 <= hh <= 24
- 2 - hh:mm, 0 <= hh <= 24
- 3 - hh:mm:ss am/pm
- 4 - hh:mm am/pm
- a value greater than 4, gives a run-time error and QTAwk
- halts execution.
-
- 24. * stran(s) ==> return string formed by translating
- characters in string s matching characters in string value of
- built-in variable, TRANS_FROM, to corresponding character in
- string value of built-in variable, TRANS_TO. If there is no
- corresponding character in TRANS_TO, then replace with blank.
- TRANS_FROM and TRANS_TO are initially set to:
-
- TRANS_FROM = srange('A','Z');
-
- TRANS_TO = srange('a','z');
-
- 25. * stran(s,st) ==> return string formed by translating
- characters in string s matching characters in string value of
- built-in variable, TRANS_FROM, to corresponding character in
- st. If there is no corresponding character in st, then replace
- with blank.
-
- 26. * stran(s,st,sf) ==> return string formed by translating
- characters in string s matching characters in sf to
- corresponding character in st. If there is no corresponding
- character in st, then replace with blank.
-
- 27. * strim(s) ==> return string formed by trimming leading and
- tailing white space from string s. Leading white space
- matches the regular expression /^[\s\t]+/. Tailing white
- space matches the regular expression /[\s\t]+$/.
-
- 28. * strim(s,le) ==> return string formed by trimming string
- matching le and tailing white space from string s. Differing
- actions are taken depending on the type of le:
-
- ────────────┬────────────────────────────────────────────────────────
- le type │ action
- ────────────┼────────────────────────────────────────────────────────
- regular │
- expression │ delete first string matching regular expression
- │
- string │ convert to regular expression and delete first
- │ matching string
- │
- single │
- character │ delete all leading characters equal to 'le'
- │
- non-zero │
- numeric │ delete leading white space matching /^[\s\t]+/
- │
- zero │
- numeric │ ignore
- ────────────┴────────────────────────────────────────────────────────
-
- strim(s,TRUE) is equivalent to the form strim(s)
-
- The following all delete the leading dashes from the given
- string:
-
- strim("------ remove leading -------",/^-+/);
- strim("------ remove leading -------",/-+/);
- strim("------ remove leading -------",'-');
- ==> "remove leading -------"
-
- 29. * strim(s,le,te) ==> return string formed by trimming
- string matching le and string matching te from s. 'le' and
- 'te' may be a regular expression, a string, a single
- character or a numeric. Differing actions are taken depending
- on the type of le and te:
-
- ────────────┬────────────────────────────────────────────────────────
- le/te type │ action
- ────────────┼────────────────────────────────────────────────────────
- regular │
- expression │ delete first string matching regular expression
- │
- string │ convert to regular expression and delete first
- │ matching string
- │
- single │
- character │ delete all leading/tailing characters equal to
- │ 'le'/'te' respectively
- │
- non-zero │
- numeric │ delete leading/tailing white space matching /^[\s\t]+/
- │ or /[\s\t]+$/ respectively
- │
- zero │
- numeric │ ignore
- ────────────┴────────────────────────────────────────────────────────
-
- strim(s,TRUE,TRUE) is equivalent to the form strim(s)
-
-
- strim("======remove leading and tailing-------",'=','-')
- or
- strim("======remove leading and tailing-------",/^=+/,'-')
- or
- strim("======remove leading and tailing-------",'=',/-+$/)
- or
- strim("======remove leading and tailing-------",/^=+/,/-+$/)
- ==> "remove leading and tailing"
-
- strim("======remove leading-------",'=',FALSE)
- ==> "remove leading-------"
-
- strim("======remove tailing-------",FALSE,'-')
- ==> "======remove tailing"
-
- 30. * strlwr(s) ==> return string s translated to lower-case.
-
- 31. * strupr(s) ==> return string s translated to upper-case.
-
- 32. sub(r,s) ==> substitute s for leftmost string matched by
- regular expression, r, in $0, return number of substitutions
- made (0/1).
-
- 33. sub(r,s,t) ==> substitute s for leftmost string matched by
- regular expression, r, in t, return number of substitutions
- made (0/1).
-
- 34. substr(s,p) ==> return string formed from suffix of string
- s starting at position p.
-
- 35. substr(s,p,n) ==> return string formed from n characters of
- string s starting at position p. If n == 1, a character
- constant is returned.
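-
- To make a few of the descriptions above concrete, the following
- Python sketch models srange, srev, stran (three-argument form)
- and overlay as they are described in the list. It is not QTAwk
- code. Treating the position p as 1-based and leaving unmatched
- characters unchanged in stran are assumptions of this sketch.
-
```python
def srange(c1, c2):
    # Characters from c1 to c2 inclusive; empty string if c2 < c1.
    return "".join(chr(c) for c in range(ord(c1), ord(c2) + 1))

def srev(s):
    # String s reversed.
    return s[::-1]

def stran(s, st, sf):
    # Characters of s found in sf map to the character at the same
    # position in st; a blank is used when st has no such position.
    out = []
    for ch in s:
        pos = sf.find(ch)
        if pos < 0:
            out.append(ch)          # assumption: left unchanged
        elif pos < len(st):
            out.append(st[pos])
        else:
            out.append(" ")
    return "".join(out)

def overlay(s1, s2, p):
    # p is assumed 1-based; s1 is padded with blanks if p lies
    # beyond its end, and the result may be longer than s1.
    start = p - 1
    padded = s1.ljust(start)
    return padded[:start] + s2 + padded[start + len(s2):]

print(srange("a", "k"))
print(srev(srange("a", "k")))
print(stran("Hello", srange("a", "z"), srange("A", "Z")))
print(repr(overlay("abc", "XY", 6)))
```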
-
- 9.3 I/O Functions
-
- QTAwk offers the following built-in I/O functions. Those marked
- with an asterisk, '*', differ from those in AWK.
-
- 1. * getline();
- or
- * getline ; ==> reads next record from current input file
- into $0. Sets fields, NF, NR and FNR. Returns the number of
- characters read, 0 if end-of-file was encountered or -1 if an
- error occurred.
-
- 2. * getline(v);
- or
- * getline v ; ==> reads next record from current input file
- into variable v. Sets NR and FNR. Returns the number of
- characters read, 0 if end-of-file was encountered or -1 if an
- error occurred.
-
- 3. * fgetline(F) ==> reads next record from file F into $0.
- Sets fields and NF. Returns the number of characters read, 0
- if end-of-file was encountered or -1 if an error occurred.
-
- 4. * fgetline(F,v) ==> reads next record from file F into
- variable v. Returns the number of characters read, 0 if
- end-of-file was encountered or -1 if an error occurred.
-
- 5. * fprint(F);
- or
- * fprint F ; ==> prints $0 to file 'F' followed by ORS.
- Returns number of characters printed.
-
- 6. * fprint(F,...);
- or
- * fprint F,... ; ==> prints expressions in the expr_list,
- '...', to the file 'F', each separated by OFS. The last
- expression is followed by ORS. Returns number of characters
- printed.
-
- 7. fprintf(F,fmt,...) ==> print expr_list, ..., to file 'F'
- according to format 'fmt'. Returns the number of characters
- printed.
-
- 8. print();
- or
- print ; ==> prints $0 to standard output file followed by
- ORS. Returns number of characters printed.
-
- 9. print(...);
- or
- print ... ; ==> prints expressions in the expr_list, ..., to
- the standard output file, each separated by OFS. The last
- expression is followed by ORS. Returns number of characters
- printed.
-
- 10. printf(fmt,...) ==> print expr_list, ..., to standard
- output file according to format, fmt. Returns number of
- characters printed.
-
- 11. sprintf(fmt,...) ==> return string formed by formatting
- expr_list, ... , according to format, fmt.
-
- 12. close(F) ==> close file F.
-
- The use of the re-direction and pipeline operators, '<', '>',
- '>>' and '|', have been discontinued as error prone. The use of
- the syntax:
-
- { print $1, $2 > $3 }
-
- has been replaced by the 'fprint' function:
-
- { fprint($3,$1,$2); }
-
- or
-
- { fprint $3,$1,$2; }
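-
- As a rough model of what the 'fprint' call above does, the
- Python sketch below writes the first two fields of a record to
- the file named by the third field. It is not QTAwk code. The
- sample record, the default OFS and ORS values, appending on each
- call, and counting ORS in the returned character count are all
- assumptions of this sketch.
-
```python
import os
import tempfile

OFS = " "    # output field separator (assumed default)
ORS = "\n"   # output record separator (assumed default)

def fprint(filename, *exprs):
    """Append the expressions to filename, OFS between, ORS after."""
    text = OFS.join(str(e) for e in exprs) + ORS
    with open(filename, "a") as handle:
        handle.write(text)
    return len(text)   # number of characters printed

# Hypothetical input record with three fields.
fields = "alice 30 out.txt".split()
target = os.path.join(tempfile.mkdtemp(), fields[2])
count = fprint(target, fields[0], fields[1])

with open(target) as handle:
    written = handle.read()
print(count, repr(written))
```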
-
- 9.4 Miscellaneous Functions
-
- 9.4.1 Expression Type
-
- *e_type(expr) ==> returns the type of 'expr'. The function
- evaluates the expression 'expr' and returns the type of the
- final result. The return is an integer defining the type:
-
- Return Type
- 0 Un-initialized (returned when 'expr' is a variable which
- has not had a value assigned to it, or which has not been
- assigned a value since being acted on by a "deletea"
- statement)
- 1 Regular Expression Value
- 2 String Value
- 3 Single Character Value
- 4 Integral Value
- 5 Floating Point Value
-
- local lvar;
- e_type(lvar) ==> 0
- e_type(/string test/) ==> 1
- e_type("string test") ==> 2
- e_type('a') ==> 3
- e_type(45) ==> 4
- e_type(45.6) ==> 5
- e_type(45.6 ∩ "") ==> 2
- e_type("45.6" + 0.0) ==> 5
- e_type("45" + 0) ==> 4
-
-
-
- 9.4.2 Execute String
-
- QTAwk offers two forms of a function to execute QTAwk dynamic
- expressions or statements. The first form will execute strings as
- QTAwk expressions or statements. The second will execute array
- elements as QTAwk expressions or statements.
-
- *execute(s[,se[,rf]]) ==> execute string s as a QTAwk statement
- or expression. If se == TRUE, then string s is executed as an
- expression and the resultant value is returned by the 'execute'
- function. If se == FALSE, then string s is executed as a
- statement and the constant value of one, 1, is returned. The se
- parameter is optional and defaults to FALSE. Any built-in or
- user-defined function may be executed in the 'execute' function
- except the 'execute' function itself. New variables may be
- defined as well as new constant strings and regular expressions.
-
- The optional rf parameter is the error recovery flag. If rf ==
- FALSE (the default value), an error encountered in parsing or
- executing the string s will cause QTAwk to issue the appropriate
- error message and halt execution. If rf == TRUE, an error
- encountered in parsing or executing the string s will cause QTAwk
- to issue the appropriate error message, discontinue parsing or
- execution of the string and continue executing the current QTAwk
- utility. Attempting to execute the 'execute' function from within
- the 'execute' function is a fatal error and will always cause
- QTAwk to halt execution.
-
- The following string can be executed as either an expression or
- statement:
-
- nvar = "power2 = 2 ^ 31;";
-
- If executed as an expression:
-
- print execute(nvar,1);
-
- the output will be: 2147483648
-
- If executed as a statement:
-
- print execute(nvar,0);
-
- or
-
- print execute(nvar);
-
- the output will be: 1
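-
- The difference between the expression and statement modes, and
- the effect of the rf flag, can be modelled loosely with Python's
- own eval and exec. This is a sketch, not QTAwk code: Python
- syntax replaces QTAwk syntax in the executed strings, the
- namespace argument is hypothetical, and returning None after a
- recovered error is an assumption (the text above does not say
- what 'execute' returns in that case).
-
```python
def execute(s, se=False, rf=False, namespace=None):
    """Run string s as an expression (se true) or as a statement."""
    ns = namespace if namespace is not None else {}
    try:
        if se:
            return eval(s, ns)      # value of the expression
        exec(s, ns)                 # statement: its value is discarded
        return 1
    except Exception as err:
        if not rf:
            raise                   # rf FALSE: the error is fatal
        print("error:", err)        # rf TRUE: report and carry on
        return None

variables = {}
as_expression = execute("2 ** 31", se=True, namespace=variables)
as_statement = execute("power2 = 2 ** 31", namespace=variables)
recovered = execute("1 +", se=True, rf=True, namespace=variables)
print(as_expression, as_statement, variables["power2"], recovered)
```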
-
- Multiple statements/expressions may be executed with a compound
- statement of the form:
-
- pvar = "{ pow8 = 2 ^ 8; pow16 = 2 ^ 16; pow31 = 2 ^ 31; }";
-
- Then
-
- execute(pvar,0);
-
- or
-
- execute(pvar);
-
- will set the three variables:
-
- 1. pow8
- 2. pow16
- 3. pow31
-
- even if the variables were not previously defined. If the
- variables were not previously defined, they will be added to the
- list of the utility variables.
-
- Note that attempting to execute pvar as an expression:
-
- execute(pvar,1);
-
- will result in the error message "Undefined Symbol". All three
- expressions may be executed, as an expression, by the use of the
- sequence operator in the following manner:
-
- pvar = "pow8 = 2 ^ 8 , pow16 = 2 ^ 16 , pow31 = 2 ^ 31;";
-
- *execute(a[,se[,rf]]) ==> execute the elements of array a as a
- QTAwk statement or expression. The se and rf parameters have the
- same function and default values as above. For example, the
- compound statement contained in pvar above may be split among
- the elements of an array:
-
- avar[1] = "{";
- avar[2] = "pow8 = 2 ^ 8;";
- avar[3] = "pow16 = 2 ^ 16;";
- avar[4] = "pow31 = 2 ^ 31;";
- avar[5] = "}";
-
- and executed as:
-
- execute(avar);
-
- or
-
- execute(avar,0);
-
-
-
- 9.4.3 Array Function
-
- QTAwk offers the following built-in array function.
-
- rotate(a) - the values of the array are rotated.
- The value of the first element goes to the last element, the
- second to the first, third to the second, etc. If the array has
- the following elements:
-
- 1. a[1] = 1
- 2. a[2] = 2
- 3. a[3] = 3
- 4. a[4] = 4
-
- then rotate(a) will have the result:
-
- 1. a[1] = 2
- 2. a[2] = 3
- 3. a[3] = 4
- 4. a[4] = 1
-
- The argument need not be a one-dimensional array; a row of a
- multidimensional array may be given instead. If:
-
- 1. a[1][1] = 1
- 2. a[1][2] = 2
- 3. a[1][3] = 3
- 4. a[1][4] = 4
-
- Then rotate(a[1]) will produce the result:
-
- 1. a[1][1] = 2
- 2. a[1][2] = 3
- 3. a[1][3] = 4
- 4. a[1][4] = 1
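-
- The rotation shown above can be sketched in Python with a
- dictionary standing in for the array. This is an illustration
- only, not QTAwk code; it assumes the elements are visited in the
- order of the subscripts 1 through 4 as listed.
-
```python
def rotate(a):
    """Rotate the values of dict a in place, keeping its subscripts."""
    keys = list(a)                          # subscripts in stored order
    values = [a[k] for k in keys]
    if values:
        values = values[1:] + values[:1]    # first value moves to the end
    for k, v in zip(keys, values):
        a[k] = v

a = {1: 1, 2: 2, 3: 3, 4: 4}
rotate(a)
print(a)

# A row of a two-dimensional array is rotated the same way.
grid = {1: {1: 1, 2: 2, 3: 3, 4: 4}}
rotate(grid[1])
print(grid)
```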
-
-
-
- 9.4.4 System Control Function
-
- 1. system(e) ==> executes the system command specified by the
- string value of the expression e.
-
-
-
- 9.4.5 Variable Access
-
- There are two built-in functions available for access to
- variables. The first, "pd_sym", accesses pre-defined variables
- and the second, ud_sym, accesses user-defined variables. Each has
- two forms:
-
- pd_sym(name_str)
-
- or
-
- pd_sym(name_num,name_str)
-
- 1. To access pre-defined variables, the function "pd_sym" may
- be used. This function has been supplied to provide a
- pre-defined variable access function similar to the function
- "ud_sym" for accessing user-defined variables. The forms and
- returns are similar.
-
- 2. To access user-defined variables where the variable name may
- not be known in advance, the function "ud_sym" has been
- supplied. The first form:
-
- ud_sym(name_expr)
-
- is useful in situations where the variable name is not known
- until the statement is to be executed. In these cases,
- name_expr may be any expression or variable with the string
- value of the unknown variable. In this form, the string value
- of name_expr is used to access the variable. ud_sym returns
- the variable in question, if one exists with the string value
- passed.
-
- The functional return value may be used in any expression
- just as the variable itself would. This includes operating on
- the return value with the array index operators, "[]".
-
- Note: This form may be used to access both local and global
- variables. If both a local and global variable have been
- defined with the desired name and the local variable is
- within scope, then the local variable is returned.
-
- The second form:
-
- ud_sym(name_expr,name_str)
-
- is useful in those situations where it may be impractical to
- use string values to access the variables, e.g., in a "for",
- "while" or "do" loop, but a numeric value can be used to
- access the variables.
-
- The user variables are accessed in the order defined in the
- user utility starting with one (1). If the integer value of
- name_expr exceeds the number of user-defined variables, then
- a non-variable is returned. The second parameter must be a
- variable. Upon return, this variable will have a string value
- equal to the name of the variable found or the null string if
- name_expr exceeds the number of user-defined variables. The
- return value of this variable may be tested to assure that a
- variable was found.
-
- The functional return value may be used in any expression
- just as the variable itself would. This includes operating on
- the return value with the array index operators, "[]".
-
- Note: This form may be used to access global variables ONLY.
- Local variables cannot be accessed with this form of the
- function.
-
- The following short function will return the number of
- user-defined global variables:
-
- # function to return the current number of
- # GLOBAL variables defined in utility
- function var_number(display) {
- local cnt, j, jj;
-
- for ( cnt = 1, j = ud_sym(cnt,jj) ; jj ; j = ud_sym(++cnt,jj) )
- if ( display ) print cnt ∩ " : " ∩ jj ∩ " ==>" ∩ j ∩ "<==";
- return cnt - 1;
- }
-
- The following function may be called with the name of the
- variable desired. The value of the variable will be returned.
- Note that the appropriate variables have been defined in the
- "BEGIN" action.
-
- BEGIN {
- #define the conversion variables
- _kilometers_to_statute_miles_ = 1/1.609344; # miles / kilometer
- _statute_miles_to_kilometers_ = 1.609344; # kilometers / mile (exact)
- _inches_to_centimeters_ = 2.54;
- _centimeters_to_inches_ = 1/2.54;
- _radians_to_degrees_ = 180/pi;
- _degrees_to_radians_ = pi/180;
- }
- # function to return the appropriate conversion
- function conversion_factor(conversion_name) {
- local name = '_' ∩ conversion_name ∩ '_';
- return ud_sym(name);
- }
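-
- The two forms of "ud_sym" can be pictured with the Python sketch
- below, in which a dictionary plays the part of the utility's
- table of global variables. It is not QTAwk code: the table
- user_vars and the helper names ud_sym_by_name and
- ud_sym_by_number are hypothetical, and the second form returns
- the name alongside the value instead of storing it in a
- variable argument.
-
```python
# Hypothetical table of user-defined global variables, in the order
# in which they were defined.
user_vars = {
    "_inches_to_centimeters_": 2.54,
    "_centimeters_to_inches_": 1 / 2.54,
}

def ud_sym_by_name(name):
    # First form: look a variable up by the string value of its name.
    return user_vars.get(name)

def ud_sym_by_number(n):
    # Second form: variables are numbered from one in definition order.
    # Returns (value, name); name is "" when n is out of range.
    names = list(user_vars)
    if 1 <= n <= len(names):
        return user_vars[names[n - 1]], names[n - 1]
    return None, ""

def conversion_factor(conversion_name):
    return ud_sym_by_name("_" + conversion_name + "_")

def var_number():
    # Count variables by stepping until the returned name is empty.
    cnt = 1
    while ud_sym_by_number(cnt)[1]:
        cnt += 1
    return cnt - 1

print(conversion_factor("inches_to_centimeters"), var_number())
```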
-
- 10.0 FORMAT SPECIFICATION
-
- QTAwk follows the Draft ANSI C language standard for the format
- string in the 'printf' and 'fprintf' functions except for the 'P'
- and 'n' types, which are not supported and will give
- unpredictable results.
-
- A format specification has the form:
-
- %[flags][width][.precision][h | l | L]type
-
- which is matched by the following regular expression:
-
- /%{flags}?{width}?{precision}?[hlL]?{type}/
-
- with:
-
- flags = /[-+\s#0]/;
- width = /([0-9]+|\*)/;
- precision = /(\.([0-9]+|\*))/;
- type = /[diouxXfeEgGcs]/;
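-
- For readers who want to experiment with the pattern, the same
- pieces can be assembled in Python's regular expression syntax.
- This is a sketch, not QTAwk code. It assumes that "\s" in the
- flags pattern denotes a single blank (the manual lists "\t"
- separately when describing white space), and the sample string
- is hypothetical.
-
```python
import re

flags = r"[-+ #0]"                      # blank assumed for the manual's \s
width = r"(?:[0-9]+|\*)"
precision = r"(?:\.(?:[0-9]+|\*))"
type_ = r"[diouxXfeEgGcs]"

# Same shape as /%{flags}?{width}?{precision}?[hlL]?{type}/
spec = re.compile(f"%{flags}?{width}?{precision}?[hlL]?{type_}")

found = spec.findall("%-10.3f and %s, %ld, %5d%%")
print(found)
```
-
- Note that "%%" is not matched, since it carries no type character.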
-
- Each field of the format specification is a single character or
- a number signifying a particular format option. Optional fields
- are shown enclosed in brackets, '[..]'. The type character, which
- appears after the last optional format field, determines whether
- the associated argument is interpreted as a character, a string,
- or a number.
- The simplest format specification contains only the percent sign
- and a type character (for example, %s). The optional fields
- control other aspects of the formatting, as follows:
-
- 1. flags ==> Control justification of output and printing of
- signs, blanks, decimal points, octal and hexadecimal
- prefixes.
-
- 2. width ==> Control minimum number of characters output.
-
- 3. precision ==> Controls maximum number of characters printed
- for all or part of the output field, or minimum number of
- digits printed for integer values.
-
- 4. h, l, L ==> Prefixes that determine size of argument
- expected (this field is retained only for compatibility to C
- format strings).
-
- a) h ==> Used as a prefix with the integer types d, i, o,
- x, and X to specify that the argument is short int, or
- with u to specify a short unsigned int
-
- b) l ==> Used as a prefix with d, i, o, x, and X types to
- specify that the argument is long int, or with u to
- specify a long unsigned int; also used as a prefix with
- e, E, f, g, and G types to specify a double, rather than
- a float
-
- c) L ==> Used as a prefix with e, E, f, g, and G types to
- specify a long double
-
- If a percent sign, '%', is followed by a character that has no
- meaning as a format field, the character is simply copied to the
- output. For example, to print a percent-sign character, use "%%".
-
- 10.1 Output Types
-
- Type characters:
-
- 1. d ==> integer, Signed decimal integer
- 2. i ==> integer, Signed decimal integer
- 3. u ==> integer, Unsigned decimal integer
- 4. o ==> integer, Unsigned octal integer
- 5. x ==> integer, Unsigned hexadecimal integer, using "abcdef"
- 6. X ==> integer, Unsigned hexadecimal integer, using "ABCDEF"
- 7. f ==> float, Signed value having the form
-
- [-]dddd.dddd
-
- where dddd is one or more decimal digits. The number of
- digits before the decimal point depends on the magnitude of
- the number, and the number of digits after the decimal point
- depends on the requested precision.
-
- 8. e ==> float, Signed value having the form
-
- [-]d.dddde[sign]ddd,
-
- where d is a single decimal digit, dddd is one or more
- decimal digits, ddd is exactly three decimal digits, and sign
- is + or -.
-
- 9. E ==> float, Identical to the e format, except that E
- introduces the exponent instead of e.
- 10. g ==> float, Signed value printed in f or e format,
- whichever is more compact for the given value and precision.
- The e format is used only when the exponent of the value is
- less than -4 or greater than the precision argument. Trailing
- zeros are truncated and the decimal point appears only if one
- or more digits follow it.
-
- 11. G ==> float, Identical to the g format, except that E
- introduces the exponent (where appropriate) instead of e.
-
- 12. c ==> character, Single character
-
- 13. s ==> string, Characters printed up to the first null
- character ('\0') or until the precision value is reached.
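-
- The type characters can be seen in action with a short sketch.
- It is written in Python, whose '%' string operator accepts the
- same type characters; it is a model of the behaviour described
- above, not QTAwk output. One known difference: Python prints at
- least two exponent digits, where the e format described above
- prints three.
-

```python
# Each line applies one type character to a sample value.
print("%d" % -42)       # signed decimal integer
print("%o" % 8)         # unsigned octal integer
print("%x" % 255)       # hexadecimal, using "abcdef"
print("%X" % 255)       # hexadecimal, using "ABCDEF"
print("%f" % 3.5)       # [-]dddd.dddd with the default precision
print("%E" % 1234.5)    # exponent introduced by 'E'
print("%g" % 0.0001)    # exponent is -4, so the f form is used
print("%g" % 0.00001)   # exponent is -5, so the e form is used
print("%c" % "A")       # single character
print("%s" % "QTAwk")   # string
```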
-
- 10.2 Output Flags
-
- Flag Characters
-
- 1. '-' ==> Left justify the result within the given field
- width. Default: Right justify.
-
- 2. '+' ==> Prefix the output value with a sign (+ or -) if the
- output value is of a signed type. Default: Sign appears only
- for negative signed values (-).
-
- 3. blank (' ') ==> Prefix the output value with a blank if the
- output value is signed and positive. The blank is ignored if
- both the blank and + flags appear. Default: No blank.
-
- 4. '#' ==> When used with the o, x, or X format, the # flag
- prefixes any nonzero output value with 0, 0x, or 0X,
- respectively. Default: No prefix.
-
- 5. '#' ==> When used with the e, E, or f format, the # flag
- forces the output value to contain a decimal point in all
- cases. Default: Decimal point appears only if digits follow
- it.
-
- 6. '#' ==> When used with the g or G format, the # flag forces
- the output value to contain a decimal point in all cases and
- prevents the truncation of trailing zeros. Default: Decimal
- point appears only if digits follow it. Trailing zeros are
- truncated.
-
- 7. '#' ==> Ignored when used with c, d, i, u or s
-
- 8. '0' ==> For d, i, o, u, x, X, e, E, f, g, and G conversions,
- leading zeros (following any indication of sign or base) are
- used to pad to the field width; no space padding is
- performed. If the 0 and - flags both appear, the 0 flag will
- be ignored. For d, i, o, u, x, and X conversions, if a
- precision is specified, the 0 flag will be ignored. For other
- conversions the behavior is undefined. Default: Use blank
- padding
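-
- The effect of each flag can be sketched as follows. The example
- is a Python model (Python's '%' operator follows the same flag
- rules for these cases), not QTAwk output. The '#' flag with the
- o format is left out because Python prints a different octal
- prefix from the one described above.
-

```python
# Brackets make the padding visible.
print("[%5d]" % 42)      # default: right justified
print("[%-5d]" % 42)     # '-' left justifies
print("[%+d]" % 42)      # '+' forces a sign on positive values
print("[% d]" % 42)      # blank prefixes positive values with a blank
print("[%#x]" % 255)     # '#' prefixes a nonzero value with 0x
print("[%#X]" % 255)     # '#' prefixes a nonzero value with 0X
print("[%#.0f]" % 3.0)   # '#' forces a decimal point
print("[%#g]" % 1.5)     # '#' keeps trailing zeros
print("[%05d]" % 42)     # '0' pads with leading zeros
print("[%+05d]" % 42)    # zeros follow the sign
print("[%-05d]" % 42)    # '0' is ignored when '-' is present
```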
-
- If the argument corresponding to a floating-point specifier is
- infinite or indefinite, the following output is produced:
-
- + infinity ==> 1.#INFrandom-digits
- - infinity ==> -1.#INFrandom-digits
- Indefinite ==> digit.#INDrandom-digits
-
- 10.3 Output Width
-
- The width argument is a non-negative decimal integer controlling
- the minimum number of characters printed. If the number of
- characters in the output value is less than the specified width,
- blanks are added to the left or the right of the values
- (depending on whether the - flag is specified) until the minimum
- width is reached. If width is prefixed with a 0 flag, zeros are
- added until the minimum width is reached (not useful for
- left-justified numbers).
-
- The width specification never causes a value to be truncated; if
- the number of characters in the output value is greater than the
- specified width, or width is not given, all characters of the
- value are printed (subject to the precision specification).
-
- The width specification may be an asterisk (*), in which case an
- integer argument from the argument list supplies the value. The
- width argument must precede the value being formatted in the
- argument list. A nonexistent or small field width does not cause
- a truncation of a field; if the result of a conversion is wider
- than the field width, the field expands to contain the conversion
- result.
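-
- A brief sketch of the width rules, again as a Python model
- rather than QTAwk output. With '*', the width is taken from the
- argument that precedes the value.
-

```python
# Brackets make the padding visible.
print("[%6d]" % 42)        # padded on the left to six characters
print("[%-6d]" % 42)       # padded on the right
print("[%06d]" % 42)       # padded with zeros
print("[%2d]" % 12345)     # wider than the width: not truncated
print("[%*d]" % (6, 42))   # width of 6 supplied as an argument
```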
-
- 10.4 Output Precision
-
- The precision specification is a non-negative decimal integer
- preceded by a period, '.', which specifies the number of
- characters to be printed, the number of decimal places, or the
- number of significant digits. Unlike the width specification, the
- precision can cause truncation of the output value, or rounding
- in the case of a floating-point value.
-
- The precision specification may be an asterisk, '*', in which
- case an integer argument from the argument list supplies the
- value. The precision argument must precede the value being
- formatted in the argument list.
-
- The interpretation of the precision value, and the default when
- precision is omitted, depend on the type, as shown below:
-
- 1. d,i,u,o,x,X ==> The precision specifies the minimum number
- of digits to be printed. If the number of digits in the
- argument is less than precision, the output value is padded
- on the left with zeros. The value is not truncated when the
- number of digits exceeds precision. Default: If precision is
- 0 or omitted entirely, or if the period (.) appears without a
- number following it, the precision is set to 1.
-
- 2. e, E ==> The precision specifies the number of digits to be
- printed after the decimal point. The last printed digit is
- rounded. Default: Default precision is 6; if precision is 0
- or the period (.) appears without a number following it, no
- decimal point is printed.
-
- 3. f ==> The precision value specifies the number of digits
- after the decimal point. If a decimal point appears, at least
- one digit appears before it. The value is rounded to the
- appropriate number of digits. Default: Default precision is
- 6; if precision is 0, or if the period (.) appears without a
- number following it, no decimal point appears.
-
- 4. g, G ==> The precision specifies the maximum number of
- significant digits printed. Default: Six significant digits
- are printed, with any trailing zeros truncated.
-
- 5. c ==> No effect. Default: The character is printed.
-
- 6. s ==> The precision specifies the maximum number of
- characters to be printed. Characters in excess of precision
- are not printed. Default: All characters of the string are
- printed.
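-
- The way precision is interpreted for each type can be sketched
- as follows. This is a Python model of the rules listed above, not
- QTAwk output; the sample values are chosen for illustration.
-

```python
print("%.3d" % 7)                 # integers: minimum digits, zero padded
print("%.2f" % 3.14159)           # f: digits after the decimal point
print("%.0f" % 3.7)               # f: precision 0, no decimal point
print("%.2e" % 12345.678)         # e: digits after the decimal point
print("%.3g" % 1234.5678)         # g: maximum significant digits
print("%.4s" % "QTAwk utility")   # s: maximum characters printed
print("%.*f" % (1, 2.71))         # precision of 1 supplied as an argument
print("[%8.3f]" % 3.14159)        # width and precision combined
```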
-
- 11.0 USER-DEFINED FUNCTIONS
-
- QTAwk supports user-defined functions and has enhanced them over
- Awk in several important aspects.
-
- 11.1 Local Variables
-
- In QTAwk it is no longer necessary to declare local variables as
- excess arguments in the function definition. QTAwk has included
- the 'local' keyword. This keyword may be used in any compound
- statement, but was invented specifically for user-defined
- functions. Consider the simple function to accumulate words from
- the current input record in the formatting utility:
-
- # accumulate words for line
- function addword(w) {
- local lw = length(w); # length of added word
-
- # check new line length
- if ( cnt + size + lw >= width ) printline(yes);
- line[++cnt] = w; # add word to line array
- size += lw;
- }
-
- The definition of lw makes it obvious that lw is local to the
- function and disappears when the function exits. The function
- arguments, 'w' in this case, are now easy to pick out, as is the
- initialization of lw to the length of the argument passed. The
- 'local' keyword thus cleanly separates local variables from the
- function arguments.
-
- 11.2 Argument Checking
-
- By using the '_arg_chk' built-in variable, it is also possible
- to have QTAwk now do some argument checking for the user. If
- _arg_chk is TRUE, then QTAwk will, at run-time, check the number
- of arguments passed against the number of arguments defined. If
- the number passed differs from the number defined, then a
- run-time error is issued and QTAwk halts. When '_arg_chk' is
- FALSE, QTAwk will check at run-time only that the number of
- arguments passed is less than or equal to the number defined.
- This follows the Awk practice and allows for the use of arguments
- defined, but not passed, as local variables. The default value of
- '_arg_chk' is FALSE. It is recommended that '_arg_chk' be set to
- TRUE and the 'local' keyword used to define variables meant to be
- local to a function.
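-
- The two checking rules can be summarized in a short sketch. It
- is a Python model, not QTAwk code; the function name
- check_arg_count and its parameters are assumptions made for
- illustration, and it covers only functions defined with a fixed
- argument list.
-

```python
def check_arg_count(passed, defined, arg_chk):
    """Model the run-time argument count check.

    passed  -- number of arguments in the call
    defined -- number of arguments in the function definition
    arg_chk -- value of the '_arg_chk' built-in variable
    Returns True if the call is accepted, False if it would be a
    run-time error.
    """
    if arg_chk:
        # TRUE: the counts must agree exactly.
        return passed == defined
    # FALSE: unpassed arguments may serve as local variables.
    return passed <= defined


print(check_arg_count(1, 3, arg_chk=False))  # Awk practice: accepted
print(check_arg_count(1, 3, arg_chk=True))   # counts differ: error
print(check_arg_count(4, 3, arg_chk=False))  # too many: error
```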
-
- 11.3 Variable Length Argument Lists
-
- QTAwk allows user-defined functions to be defined with a
- variable number of arguments. The actual number of arguments will
- be determined from the call at run-time. QTAwk follows the C
- syntax for defining a function with a variable number of
- arguments:
-
- # function to determine maximum value
- function max(...) {
- local max = vargv[1];
- local i;
-
- for ( i = 2 ; i <= vargc ; i++ )
- if ( max < vargv[i] ) max = vargv[i];
- return max;
- }
-
- The ellipsis, '...', is used as the last argument in a
- user-defined argument list to indicate that a variable number of
- arguments follow. In the max function shown, no fixed arguments
- are indicated. Within the function, the variable arguments are
- accessed via the built-in singly-dimensioned array, 'vargv'. The
- built-in variable 'vargc' is set equal to the number of elements
- of the array and, hence, to the number of variable arguments
- passed to the function. Since the variable arguments are passed
- in a singly-dimensioned array, the 'for ( i in vargv )' form of
- the 'for' statement may also be used to access each in turn:
-
- # function to determine maximum value
- function max(...) {
- local max = vargv[1];
- local i;
-
- for ( i in vargv )
- if ( max < vargv[i] ) max = vargv[i];
- return max;
- }
-
- A user-defined function may have fixed arguments and a variable
- number of arguments following:
-
- # function with both fixed and variable number of arguments
- function sample(arg1,arg2,...) {
- .
- .
- .
- }
-
- If a user-defined function is to have a variable number of
- arguments, then the 'local' keyword must be used to define local
- variables. The ellipsis denoting the variable arguments must be
- last in the function definition argument list.
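-
- For readers more familiar with other languages, the roles of
- 'vargv' and 'vargc' can be modelled as follows. The sketch is
- Python, not QTAwk; the function name qt_max is an assumption, and
- 'vargv' is modelled as a dictionary indexed from one.
-

```python
def qt_max(*args):
    # Model 'vargv' as an array indexed from 1 and 'vargc' as its size.
    vargv = {i: value for i, value in enumerate(args, start=1)}
    vargc = len(vargv)

    largest = vargv[1]
    # Same loop as the first max function: i runs from 2 to vargc.
    for i in range(2, vargc + 1):
        if largest < vargv[i]:
            largest = vargv[i]
    return largest


print(qt_max(3, 9, 4))
```

-
- Like the max function above, the sketch assumes that at least one
- argument is passed.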
-
- 11.4 Null Argument List
-
- A user-defined function may be defined with no arguments.
- Consider the function to accumulate words from input records for
- the text formatter:
-
- # function to add current line to parsed text
- function add_line() {
- for ( i = 1 ; i <= NF ; i++ ) if ( length($i) ) addword($i);
- }
-
- In the case of a user-defined function with no arguments to be
- passed, the function may be invoked with no parenthesized
- parameter list. Consider the invocation of the add_line function
- in the text formatter. The action executed for input records
- which do not start with a format control word is:
-
- {
- if ( format ) add_line;
- else if ( table_f ) format_table($0);
- else output_line($0,FALSE);
- }
-
- In QTAwk, the add_line function may be invoked as "add_line" as
- above or as "add_line()", with a null length parameter list.
-
- QTAwk has also relaxed the Awk rule that the left parenthesis of
- the parameter list must immediately follow a user-defined
- function invocation. QTAwk allows blanks between the name and the
- left parenthesis. The blanks are ignored.
-
- 11.5 Arrays and User-Defined Functions
-
- Just as arrays are integrated into QTAwk expressions, arrays are
- also integrated into the passing of arguments to, and the return
- value from, user-defined functions. User-defined functions may
- return arrays as well as scalars. This will be illustrated in a
- sample utility later.
-
- QTAwk passes scalar arguments to user-defined functions by
- value, i.e., if a scalar variable is specified as an argument to
- a function, a copy of the variable is passed to the function and
- not the variable itself. This is called pass by value. Thus, if
- the function alters the value of the argument, the variable is
- not altered, only the copy. When the function terminates, the
- copy is discarded, and the variable still retains its original
- value.
-
- In contrast, QTAwk passes array variables by reference. This
- means that the local variable represented by the function
- argument is the referenced variable itself and not a copy. Any
- changes to the local variable are actually made to the referenced
- variable.
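-
- The visible difference between the two passing methods can be
- sketched as follows. The example is a Python analogy, not QTAwk
- code: Python's own mechanism differs, but a number behaves like
- a scalar passed by value and a dictionary like an array passed by
- reference. The names modify, count and values are assumptions
- made for illustration.
-

```python
def modify(scalar, array):
    scalar = scalar + 1   # changes only the function's own copy
    array[1] = 99         # changes the caller's array


count = 10
values = {1: 1, 2: 2}
modify(count, values)

print(count)    # the scalar keeps its original value
print(values)   # the array element was changed by the function
```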
-
- In QTAwk, function arguments may also be constant arrays and not
- variable arrays, i.e., the argument may be the result of an
- arithmetic operation on an array. For example, if A is an array,
- then the result of the expression
-
- "A + 10"
-
- is an array and would be passed as a constant array as a
- function argument. Such arrays are discarded at function
- termination.
-
- QTAwk passes by reference under three conditions:
-
- 1. The argument is a global or local variable and an array,
-
- 2. The argument is a global or local variable and used as an
- array, i.e., indexed or referenced by an 'in' statement, in
- the called function. This is true whether the referenced
- variable is a scalar or array when the function is called. If
- the referenced variable was a scalar when the function is
- called, then at function termination, if the statement(s) in
- which the argument was indexed WERE EXECUTED, the referenced
- variable will be an array with the index values referenced.
- This behaviour is identical to creating array elements in
- global variables by referencing the elements.
-
- 3. The argument is a global or local scalar variable and at
- function termination the argument is an array. In this case,
- the argument may not have been directly referenced as an
- array, but may be the result of an operation involving an
- array. Alternatively the argument may have been passed to
- another function which referenced it as an array or set it to
- the result of operations on arrays.
-
- The following QTAwk utility with a user-defined function will
- illustrate the use of arrays and scalars as function arguments
- and the return of arrays by user-defined functions.
-
- BEGIN {
- # create arrays 'a' and 'b'
- for ( i = 1 ; i < 6 ; i++ ) a[i] = b[i] = i;
- # create scalars 'c' and 'f'
- c = f = 10;
- # pass scalar variables/values and return scalar value
- print "scalar : "set_var(c,c,c)" and c == "c;
- # pass two arrays, 'a' & 'b',
- # and one scalar constant, 'c+0'
- # function will return an array "== a + b + (c+0)"
- d = set_var(a,b,c+0);
- # print returned array 'd' (== a + b + (c+0))
- for ( i in d ) print "d["i"] = "d[i];
- #print scalar 'c' to show unchanged
- print c;
- # pass two arrays, 'a' & 'b',
- # and one scalar variable, 'c'
- # function will return an array "== a + b + c"
- e = set_var(a,b,c);
- # print returned array
- for ( i in e ) print "e["i"] = "e[i];
- # print former scalar, 'c',
- # converted to array by operation c = b + 2;
- for ( i in c ) print "c["i"] = "c[i];
- # pass two arrays, 'a' & 'b', and constant array, 'b+0'
- h = set_var(a,b,b+0);
- # print returned array
- for ( i in h ) print "h["i"] = "h[i];
- # print array 'b' to assure that unchanged
- for ( i in b ) print "b["i"] = "b[i];
- # attempt illegal operation in function: w = f + b
- # adding array, 'b', to scalar, 'f'.
- # error message will be issued and execution halted
- g = set_var(f,b,f);
- }
-
- function set_var(x,y,z) {
- # create local variable
- local w = x + y + z;
- # alter third argument
- # if first & second arguments arrays,
- # this will convert third to an array
- # (if not already passed as an array).
- z = y + 2;
- return w;
- }
-
- This QTAwk utility illustrates several ideas in using arrays and
- user-defined functions in QTAwk. The line:
-
- print "scalar : "set_var(c,c,c)" and c == "c;
-
- calls the function 'set_var' with three scalar variables, all
- 'c'. Three copies of 'c' are actually passed. The local variable,
- 'w', is computed using scalar quantities and is a scalar
- quantity. Since argument 'y' is a scalar quantity, the result of
- the expression:
-
- z = y + 2;
-
- is a scalar and the third argument, 'c', is unchanged. A
- functional value of 30 (== c + c + c) is returned.
-
- The line:
-
- d = set_var(a,b,c+0);
-
- passes arrays as the first and second arguments. The third
- argument is a constant scalar value, and thus cannot be changed
- by the function called. The return value of the function:
-
- w = x + y + z; (== a + b + 10;)
-
- is an array. The line:
-
- for ( i in d ) print "d["i"] = "d[i];
-
- prints the values of the array:
-
- d[1] = 12
- d[2] = 14
- d[3] = 16
- d[4] = 18
- d[5] = 20
-
- The line:
-
- e = set_var(a,b,c);
-
- passes arrays as the first and second arguments. The third
- argument is a variable scalar value, and thus can be changed by
- the function called if the third argument at function termination
- is an array. The return value of the function:
-
- w = x + y + z; (== a + b + c;)
-
- is an array as above. Note that at function termination, the
- third argument is now an array since it was set to the result of
- an operation on an array:
-
- z = y + 2;
-
- which is now equivalent to:
-
- z = b + 2;
-
- Thus, at function termination the scalar variable 'c' has been
- converted to an array. The line:
-
- for ( i in c ) print "c["i"] = "c[i];
-
- will print the values of the array elements:
-
- c[1] = 3 ( == b[1] + 2)
- c[2] = 4 ( == b[2] + 2)
- c[3] = 5 ( == b[3] + 2)
- c[4] = 6 ( == b[4] + 2)
- c[5] = 7 ( == b[5] + 2)
-
- The line:
-
- h = set_var(a,b,b+0);
-
- passes arrays as the first and second arguments. The third
- argument is a constant array value, and thus cannot be changed by
- the function called. The return value of the function is an array
- as above. Note that at function termination, the third argument
- is again an array as above. However, the third argument has been
- passed as a constant array and thus no variable is changed as 'c'
- was above. The third argument is discarded at function
- termination. The line:
-
- for ( i in b ) print "b["i"] = "b[i];
-
- prints the array 'b' to assure that it was not changed.
-
- The line:
-
- g = set_var(f,b,f);
-
- will result in an illegal operation in the function:
-
- local w = x + y + z; (== f + b + f;)
-
- This operation attempts to add an array to a scalar:
-
- f + b
-
- This operation will result in an error message and halt
- execution.
-
- The above sample QTAwk utility illustrates the power of
- user-defined functions in automatically handling scalars and
- arrays as both arguments and return values and adjusting
- accordingly. The same function may be used interchangeably for
- both arrays and scalars with natural and predictable results.
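-
- The arithmetic in the walk-through can be checked with a small
- model. The sketch is Python, not QTAwk: arrays are modelled as
- dictionaries, the helper name add is an assumption, and arrays are
- assumed to share the same index values. How QTAwk treats arrays
- with differing indices is not covered here.
-

```python
def add(x, y):
    """Model 'x + y' for scalars and arrays as in the walk-through."""
    x_is_array = isinstance(x, dict)
    y_is_array = isinstance(y, dict)
    if x_is_array and y_is_array:
        return {i: x[i] + y[i] for i in x}   # array + array
    if x_is_array:
        return {i: x[i] + y for i in x}      # array + scalar
    if y_is_array:
        # scalar + array, the case reported as an error
        raise TypeError("cannot add an array to a scalar")
    return x + y                             # scalar + scalar


a = {i: i for i in range(1, 6)}
b = dict(a)
c = 10

scalar = add(add(c, c), c)   # set_var(c,c,c)   returns c + c + c
d = add(add(a, b), c)        # set_var(a,b,c+0) returns a + b + 10
new_c = add(b, 2)            # set_var(a,b,c)   leaves c == b + 2

print(scalar)
print(d)
print(new_c)
```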
-
- 12.0 TRACE STATEMENTS
-
- QTAwk has added a facility for debugging utilities. This
- facility is activated through the built-in variable 'TRACE'.
- QTAwk can trace the flow control statements, 'if', 'while', 'do',
- 'for' (both forms), and 'switch'. In addition, built-in functions
- and user-defined functions are traced.
-
- By default, TRACE is set to FALSE and no tracing is done. The
- variable may be set to any value, numeric, string or regular
- expression and the value will determine the statements traced. If
- TRACE has a nonzero numeric value then QTAwk will trace all
- statements of the type listed.
-
- 12.1 Selective Statement Tracing
-
- If TRACE has a string value, then the string is compared against
- the keywords:
-
- 1. if
- 2. while
- 3. do
- 4. for
- 5. switch
- 6. function_b (built-in functions)
- 7. function_u (user-defined functions)
-
- If an exact match (case is important) is found, then the
- statement is traced. If TRACE is set to a regular expression,
- then the keywords are matched against the regular expression. If
- a match is found, then the statement is traced.
-
- 12.2 Trace Output
-
- In tracing a statement, QTAwk issues a message to the standard
- output file. The message issued will have the form:
-
- Stmt Trace: stmt_str value_str
- Action File line: xxxx
- Scanning File: FILENAME
- Line: xxxxx
- Record: xxxxxx
-
- where stmt_str is the appropriate keyword listed above for the
- statement traced and value_str is a value dependent upon the
- statement traced as listed below:
-
- keyword        value string
- if         ==> 0/1, conditional expression FALSE/TRUE
- while      ==> 0/1, conditional expression FALSE/TRUE
- do         ==> 0/1, conditional expression FALSE/TRUE
- for        ==> 0/1, conditional expression FALSE/TRUE
- for        ==> subscript value ('for ( i in array )' form)
- switch     ==> switch expression value
- function_b ==> function name
- function_u ==> function name
-
- When a statement that can be traced is encountered, the value
- of the statement is determined first, e.g., for an 'if' statement,
- the conditional is evaluated before the trace message is issued.
-
- The following TRACE values will trace the statements indicated:
-
- 1. TRACE = "if";
- This value will trace all 'if' statements, indicating the
- TRUE/FALSE value of the conditional.
-
- 2. TRACE = /^[iwd]/;
- This value will trace all 'if', 'while' and 'do' statements,
- indicating the TRUE/FALSE value of the conditional.
-
- 3. TRACE = /_u$/;
- This value will trace all user-defined functions, indicating
- the function name in the trace message.
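-
- The selection rules for TRACE can be modelled in a few lines.
- The sketch is Python, not QTAwk: the names KEYWORDS and
- traced_keywords are assumptions, and a compiled Python regular
- expression stands in for a QTAwk regular expression.
-

```python
import re

KEYWORDS = ["if", "while", "do", "for", "switch", "function_b", "function_u"]


def traced_keywords(trace):
    """Return the statement keywords selected by a TRACE value."""
    if isinstance(trace, re.Pattern):
        # Regular expression: trace every keyword it matches.
        return [k for k in KEYWORDS if trace.search(k)]
    if isinstance(trace, str):
        # String: exact, case-sensitive comparison.
        return [k for k in KEYWORDS if k == trace]
    # Numeric: nonzero traces everything, zero (FALSE) traces nothing.
    return list(KEYWORDS) if trace != 0 else []


print(traced_keywords("if"))
print(traced_keywords(re.compile(r"^[iwd]")))
print(traced_keywords(re.compile(r"_u$")))
```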
-
- 13.0 BUILT-IN VARIABLES
-
- QTAwk offers the following built-in variables. The variables may
- be set by the user. Those marked with an asterisk, '*', are new
- to QTAwk:
-
- 1. * _arg_chk ==> TRUE/FALSE. Default value = FALSE. If FALSE,
- the number of arguments passed to a user-defined function is
- checked only to ensure that it does not exceed the number
- defined. Arguments defined but not passed are initialized
- and passed for use as local variables, as in Awk. If TRUE, the
- number of arguments passed to a user-defined function is
- checked against the number defined for the function, unless the
- function was defined with a variable number of arguments. If the
- number passed is not exactly equal to the number defined, an
- error message is issued and execution halts. In this case, any
- local variables must be defined with the 'local' keyword.
-
- 2. ARGC ==> set equal to the number of arguments passed to
- QTAwk as in Awk.
-
- 3. * ARGI ==> equal to the index value in ARGV of the next
- command line argument to be processed. This value may be
- changed and will change the array element of ARGV processed
- next. When the last element of ARGV is the current input
- file, ARGI is set to one of two integer values:
-
- a) the integer value of the index of the last element of
- ARGV plus one, or
- b) if the last element of ARGV has a string index, ARGI is
- set to zero.
-
- Setting ARGI to zero, ARGC or a value for which there is no
- element of ARGV with a corresponding index value, will cause
- the current input file to be the last command line argument
- processed.
-
- 4. ARGV ==> one-dimensional array with elements equal to the
- arguments passed to QTAwk as in Awk. The index values are
- integers ranging in value from zero to ARGC - 1. ARGV[0] ==
- filename by which QTAwk was invoked, including full path
- information.
-
- 5. * CYCLE_COUNT ==> value for the current cycle through the
- outer pattern match loop for the current record. Value
- incremented by the 'cycle' statement.
-
- 6. * DEGREES ==> TRUE/FALSE. Default value = FALSE. If FALSE
- trigonometric functions assume radian values are passed and
- return radian values. If TRUE trigonometric functions assume
- degree values are passed and return degree values.
-
- 7. ENVIRON ==> one-dimensional array with elements equal to the
- environment strings passed to QTAwk. The index values are
- integers ranging in value from zero to the number of
- environment strings defined less one.
-
- 8. * FALSE ==> predefined with zero, 0, constant value.
-
- 9. FILENAME ==> equal to string value of current filename,
- including any path specified. If assigned a new value, the
- file with a name equal to the new string value is opened (or
- an error message displayed if the filename is illegal). The
- new file becomes the current input file. The former input
- file is not closed and may continue to be input by
- re-assigning FILENAME, putting the name in ARGV for future
- use or read with the 'fgetline' function.
-
- 10. FNR ==> equal to current record number of current input
- file.
-
- 11. FS ==> value of current input field separator. The default
- value for FS is /[\t-\r\s]+/, i.e., any consecutive white
- space characters. If FS is set on the command line or in the
- user utility then the following rules apply (see also RS
- below):
- a) setting to a single blank, ' ' or " ", will set FS to the
- default value of /[\t-\r\s]+/,
- b) setting to a single character or a value which when
- converted to a string yields a string of a single
- character in length, 'x' or "x", will cause the single
- character to be used as the input record field separator,
- c) setting to a regular expression, a multiple character
- string or a value which when converted to a string yields
- a multiple character string, the string will be
- considered a regular expression and converted to the
- regular expression internal form when the assignment is
- made. Input records are scanned for a string matching the
- regular expression and matching strings become field
- separators. The length of matching strings is governed by
- the LONGEST_EXP built-in variable.
-
- 12. * LONGEST_EXP ==> TRUE/FALSE. default == TRUE. If TRUE
- longest string matching a regular expression is found in:
- a) patterns
- b) match operators, '~~' and '!~'
- c) 'match' function
- d) 'gsub' function
- e) 'sub' function
- f) 'strim' function
- g) input record separator strings when RS is considered a
- regular expression,
- h) input record field separator strings when FS is
- considered a regular expression,
- i) field separator matching in 'split' function when field
- separator is a regular expression
-
- If FALSE then the first string matching the regular
- expression is found.
-
- 13. * MAX_CYCLE ==> Maximum value for CYCLE_COUNT for cycling
- through outer pattern match loop with current input record.
- Default value == 100.
-
- 14. NF ==> equal to the number of fields in the current input
- record
-
- 15. * NG ==> set to the number of the expression matching the
- current input record for GROUP patterns.
-
- 16. NR ==> total number of records read so far across all input
- files.
-
- 17. OFMT ==> output and string conversion format for numbers.
- Default value of "%.6g".
-
- 18. OFS ==> output field separator. Default value of a single
- blank, ' '.
-
- 19. ORS ==> output record separator. Default value of a single
- newline character, '\n'.
-
- 20. * RETAIN_FS ==> TRUE/FALSE. Default value = FALSE. If FALSE
- then OFS is used between fields to reconstruct $0 whenever a
- field value is altered. If TRUE the original field separator
- characters are retained in reconstructing $0 whenever a field
- value is altered.
-
- 21. RS ==> input record separator. The default value for RS is
- a single newline character, '\n'. If RS is set on the command
- line or in the user utility then the following rules apply
- (see also FS above):
- a) setting to the null string, "", will set RS to the
- regular expression /\n\n/. Thus, blank lines, i.e., two
- consecutive newline characters, bound input records.
- b) setting to a single character or a value which when
- converted to a string yields a string of a single
- character in length, 'x' or "x", will cause the single
- character to be used as the input record separator,
- c) setting to a regular expression, a multiple character
- string or a value which when converted to a string yields
- a multiple character string, the string will be
- considered a regular expression and converted to the
- regular expression internal form when the assignment is
- made. The input file character stream is scanned for a
- string matching the regular expression and matching
- strings become record separators. The length of matching
- strings is governed by the LONGEST_EXP built-in
- variable.
-
- 22. * TRACE ==> control statement tracing. Default value =
- FALSE. Determines whether statements are traced during
- execution.
-
- 23. * TRANS_FROM ==> string used by 'stran' function for
- translating from. Default value is
- "ABCDEFGHIJKLMNOPQRSTUVWXYZ".
-
- 24. * TRANS_TO ==> string used by 'stran' function for
- translating to. Default value is
- "abcdefghijklmnopqrstuvwxyz".
-
- 25. * TRUE ==> predefined with one, 1, constant value.
-
- 26. * CLENGTH ==> length of string matched in 'case' statement
-
- 27. * CSTART ==> start of string matched in 'case' statement
-
- 28. * MLENGTH ==> length of string matched in match operators,
- '~~' and '!~'
-
- 29. * MSTART ==> start of string matched in match operators,
- '~~' and '!~'
-
- 30. RLENGTH ==> length of string matched in 'match' function
-
- 31. RSTART ==> start of string matched in 'match' function
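-
- The three rules for setting FS, given in item 11 above, can be
- modelled as follows. The sketch is Python, not QTAwk: the names
- DEFAULT_FS, field_pattern and split_fields are assumptions, Python
- regular expression syntax differs from QTAwk's, and the effect of
- LONGEST_EXP is not modelled.
-

```python
import re

# The default /[\t-\r\s]+/, with the QTAwk blank '\s' written as a space.
DEFAULT_FS = r"[\t-\r ]+"


def field_pattern(fs):
    """Return the pattern that an FS value selects."""
    if fs == " ":
        return re.compile(DEFAULT_FS)       # a) single blank: default
    if len(fs) == 1:
        return re.compile(re.escape(fs))    # b) single literal character
    return re.compile(fs)                   # c) regular expression


def split_fields(record, fs=" "):
    """Split a record into fields using the FS value given."""
    return field_pattern(fs).split(record)


print(split_fields("a  b\tc"))
print(split_fields("a:b:c", ":"))
print(split_fields("a1b22c", "[0-9]+"))
```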
-
- 13.1 User Function Variable Argument Lists
-
- 1. * vargc ==> count of variable arguments passed to the current
- function invocation.
-
- 2. * vargv ==> singly-dimensioned array of variable arguments
- passed to the current function invocation. Indices are numeric,
- starting at one, 1.
-
- 14.0 COMMAND LINE INVOCATION
-
- There are two ways of specifying utilities to QTAwk:
- 1. Specifying the utility on the command line, e.g.,
-
- QTAwk "/^$/{if(!bc++)print;next;}{bc=FALSE;print;}" file1
-
- This short command line utility will read file1, printing
- only the first blank line in a series of blank lines. All
- other lines are printed.
-
- Note that the "utility" has been enclosed in double quotes,
- ". This is necessary to keep PC/MS-DOS from interpreting the
- utility as a file. In addition, if the utility contains
- symbols recognized by PC/MS-DOS, e.g., the re-direction
- operators, '<' or '>', the quotes keep PC/MS-DOS from
- recognizing the symbols. If the utility contains quotes,
- e.g., a constant string definition, then the embedded quotes
- should be preceded by a back-slash, '\'.
-
- For example, the short utility:
-
- QTAwk "{print FNR\" : \"$0;}" file1
-
- prints each line of file1 preceded by the line number. The
- constant string,
-
- " : "
-
- separates the line number from the line. Back-slashes must
- precede the quotes surrounding the constant string.
-
- 2. -futilityfile
- or
- -f utilityfile
-
- When a utility may be used frequently or grows too long to
- include on the command line as above, it becomes necessary to
- store it in a file. The utility may then be specified to
- QTAwk with this option. The blank between the 'f' command
- line option and the utility file name is optional.
-
- 14.1 Multiple QTAwk Utilities
-
- More than one utility file may be specified to QTAwk in this
- manner. Each utility file specified is read in the order
- specified and combined into a single QTAwk utility. In this
- manner it is possible to keep constantly used pattern-actions or
- user-defined functions in separate files and combine them into
- QTAwk utilities as necessary. The order of the utility files is
- not important except for the order in which predefined patterns
- are executed and the order in which pattern-action pairs are
- executed. Thus, if a utility file contains only common
- user-defined functions, it may be specified in any order in
- relation to other utility files.
-
- Scanning of the command line for arguments may be stopped with
- the double hyphen command line argument, "--". This argument is
- not passed to the QTAwk utility.
-
- This method of specifying utilities to QTAwk cannot be combined
- with the command line utility definition method.
-
- The command line is scanned for all utility files specified with
- the 'f' option prior to reading the utility files or any input
- files. The utility files are then "removed" from the command line
- and the command line argument count.
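-
- The scan for 'f' options can be sketched as follows. This is a
- hypothetical Python model of the rules stated above (optional blank
- after 'f', scanning stopped by "--", utility files removed from the
- argument list); the function name extract_utility_files is an
- assumption and the code is not taken from QTAwk itself.

```python
def extract_utility_files(args):
    """Return (utility_files, remaining_args) for a command line."""
    utility_files = []
    remaining = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # Stop scanning; "--" itself is not passed on.
            remaining.extend(args[i + 1:])
            break
        if arg == "-f" and i + 1 < len(args):
            # Form "-f utilityfile": the name is the next argument.
            utility_files.append(args[i + 1])
            i += 2
        elif arg.startswith("-f") and len(arg) > 2:
            # Form "-futilityfile": the name follows the option letter.
            utility_files.append(arg[2:])
            i += 1
        else:
            remaining.append(arg)
            i += 1
    return utility_files, remaining

files, rest = extract_utility_files(
    ["-f", "a.exp", "-fb.exp", "data.txt", "--", "-fnot"])
print(files, rest)
```

- The utility files are collected in the order given, which is the
- order in which they would be combined into a single utility.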
-
- 14.2 Setting the Field Separator
-
- The QTAwk input field separator, FS, may be set on the command
- line with the 'F' option.
-
- QTAwk -F "/:/"
-
- or
-
- QTAwk -F/:/
-
- The blank between the 'F' and the string or regular expression
- defining the new input field separator is optional. This option
- may only be specified once on the command line. The command line
- is scanned for all 'F' options prior to reading any utility files
- or input files. The option and the new value for FS are then
- "removed" from the command line and the command line argument
- count.
-
- Another method is available for setting FS prior to reading the
- input files. This method is more general, may be used multiple
- times on the command line and may be used to set any utility
- variable and not just FS.
-
- 14.3 Setting Variables on the Command Line
-
- Including the following on the command line:
-
- var = value
-
- or
-
- var=value
-
- will set the variable 'var' to the value specified. 'var' may
- be any built-in or user-defined variable in the QTAwk utility.
- 'var' must be a variable defined in the current QTAwk utility or
- a run-time error will occur and QTAwk will stop processing.
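-
- A rough Python sketch of this rule follows. It assumes the whole
- setting arrives as one argument string, and the names ASSIGNMENT and
- apply_assignment are hypothetical; NameError stands in for the
- QTAwk run-time error raised for an undefined variable.

```python
import re

# "var=value" or "var = value", with var a plain identifier.
ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$")

def apply_assignment(arg, variables):
    """Set a variable from a command line argument.

    Returns True if arg was a variable setting, False if it should be
    treated as something else (for example a file name)."""
    m = ASSIGNMENT.match(arg)
    if m is None:
        return False
    name, value = m.group(1), m.group(2)
    if name not in variables:
        # Only variables already defined in the utility may be set.
        raise NameError("undefined variable: " + name)
    variables[name] = value
    return True

variables = {"FS": " ", "limit": "0"}
print(apply_assignment("limit=10", variables), variables["limit"])
print(apply_assignment("FS = :", variables), variables["FS"])
print(apply_assignment("file1", variables))
```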
-
- 14.4 QTAwk Execution Sequence
-
- QTAwk execution proceeds in the following sequence:
-
- 1. The command line is scanned for any 'f' and 'F' options. If
- any such options are found, they are removed from the command
- line.
-
- 2. The QTAwk utility is read and processed. If any 'f' options
- were found in the preceding step, the associated utility
- files are opened, read and processed in the order specified.
- If no 'f' options were specified, the first command line
- argument is processed as the QTAwk utility and then removed
- from the command line arguments.
-
- 3. The ARGC and ARGV built-in variables are set according to
- the command line parameters. The ARGI built-in variable is
- set to 1.
-
- 4. Any "BEGIN" actions in the QTAwk utility are executed. This
- is done prior to any further interpretation of the command
- line arguments.
-
- 5. The command line argument, ARGV[ARGI], is examined. One of
- two actions is taken depending on the form of the argument:
-
- a) An argument of the form:
-
- var = value
-
- or
-
- var=value
-
- is interpreted as setting the variable specified, to
- the value specified.
-
- b) Any other argument is interpreted as a file name. The
- file specified is opened for input. If the file does not
- exist, an error message is issued and execution halted.
- If a single hyphen, '-', is specified, it is interpreted
- as representing the standard input file. If no command
- line arguments are specified beyond the QTAwk utility or
- variable setting commands, the standard input file is
- read for input.
-
- 6. Any "INITIAL" actions in the QTAwk utility are executed.
-
- 7. The input file is read record by record and matched against
- the patterns present in the QTAwk utility. If no
- pattern/action pairs are given in the QTAwk utility, each
- record is read, the NF, FNR, NR and field variables are set
- and the record is then discarded. If an 'exit' or 'endfile'
- statement is executed, action passes to the next step below.
-
- 8. When the end of the input file is reached or an "exit" or
- "endfile" statement is executed, any 'FINAL' actions are
- executed.
-
- 9. The input file is closed.
-
- 10. If an "exit" statement was executed, processing passes to
- step 11) below, else the following steps are executed:
-
- a) The element of ARGV corresponding to the current index
- value of ARGI is sought. If none is found, processing
- proceeds as if the "exit" statement was executed.
- b) ARGI is set to the index value of the next element of
- ARGV. If there is no next element of ARGV, processing
- proceeds as if the "exit" statement was executed.
- c) processing continues with step 5) above.
-
- 11. Any "END" actions in the QTAwk utility are executed.
-
- 12. QTAwk execution halts.
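-
- Steps 4 through 11 can be summarised in a small Python model. It is
- a sketch under simplifying assumptions, not QTAwk's implementation:
- the function run, the in-memory files dictionary and the on_record
- callback (which returns "exit", "endfile" or None) are all
- hypothetical, and reading the standard input when no file is named
- is left out.

```python
def run(args, files, on_record):
    """Return the order in which the predefined actions would fire."""
    log = ["BEGIN"]                                   # step 4
    variables = {}
    for arg in args:                                  # steps 5 and 10
        if "=" in arg:                                # step 5a: var=value
            name, value = arg.split("=", 1)
            variables[name.strip()] = value.strip()
            log.append("set " + name.strip())
            continue
        if arg not in files:                          # step 5b: file name
            raise FileNotFoundError(arg)
        log.append("INITIAL " + arg)                  # step 6
        action = None
        for record in files[arg]:                     # step 7
            action = on_record(record, variables)
            if action in ("exit", "endfile"):
                break
        log.append("FINAL " + arg)                    # steps 8 and 9
        if action == "exit":                          # step 10
            break
    log.append("END")                                 # step 11
    return log

files = {"a": ["1", "2"], "b": ["3"]}
normal = run(["a", "x=5", "b"], files, lambda rec, v: None)
early = run(["a", "b"], files, lambda rec, v: "exit")
print(normal)
print(early)
```

- In the second call the simulated "exit" still runs the FINAL action
- for the current file and then the END action, but the second file
- is never opened.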
-
- Section 15.0 QTAwk Limits
-
-
- 15.0 LIMITS
-
- QTAwk has the following limitations:
-
- 1024 fields
-
- 4096 characters per input record
-
- 4096 characters per formatted output record
-
- 256 characters in character class (with character ranges
- expanded)
-
- 256 user-defined functions
-
- 256 local variables
-
- 256 global variables
-
- 1024 characters in constant strings
-
- 1024 characters in regular expressions on input
-
- 4096 characters in regular expressions after expansion of named
- expressions and repetition operators.
-
- 4096 characters in strings used as regular expressions after
- expansion of named expressions and repetition operators.
-
- 4096 characters in strings returned by 'replace' functions
-
- 4096 characters in input strings read by 'getline' and 'fgetline'
- functions
-
- 4096 characters in strings after substitution for 'gsub' and
- 'sub' functions
-
- 4096 characters maximum in strings returned by following
- functions:
- 1. copies
- 2. deletec
- 3. insert
- 4. overlay
- 5. remove
-
- Section 16.0 Appendix I
-
-
- 16.0 Appendix I
-
- ASCII character set
- ( escape sequences shown for non-printable characters )
-
- dec hex char     dec hex char  dec hex char  dec hex char
-                ╤             ╤             ╤
-   0 00     NUL │  32 20  \s  │  64 40  @   │  96 60  `
-   1 01     SOH │  33 21  !   │  65 41  A   │  97 61  a
-   2 02     STX │  34 22  "   │  66 42  B   │  98 62  b
-   3 03     ETX │  35 23  #   │  67 43  C   │  99 63  c
-   4 04     EOT │  36 24  $   │  68 44  D   │ 100 64  d
-   5 05     ENQ │  37 25  %   │  69 45  E   │ 101 65  e
-   6 06     ACK │  38 26  &   │  70 46  F   │ 102 66  f
-   7 07  \a BEL │  39 27  '   │  71 47  G   │ 103 67  g
-   8 08  \b BS  │  40 28  (   │  72 48  H   │ 104 68  h
-   9 09  \t HT  │  41 29  )   │  73 49  I   │ 105 69  i
-  10 0A  \n LF  │  42 2A  *   │  74 4A  J   │ 106 6A  j
-  11 0B  \v VT  │  43 2B  +   │  75 4B  K   │ 107 6B  k
-  12 0C  \f FF  │  44 2C  ,   │  76 4C  L   │ 108 6C  l
-  13 0D  \r CR  │  45 2D  -   │  77 4D  M   │ 109 6D  m
-  14 0E     SO  │  46 2E  .   │  78 4E  N   │ 110 6E  n
-  15 0F     SI  │  47 2F  /   │  79 4F  O   │ 111 6F  o
-  16 10     DLE │  48 30  0   │  80 50  P   │ 112 70  p
-  17 11     DC1 │  49 31  1   │  81 51  Q   │ 113 71  q
-  18 12     DC2 │  50 32  2   │  82 52  R   │ 114 72  r
-  19 13     DC3 │  51 33  3   │  83 53  S   │ 115 73  s
-  20 14     DC4 │  52 34  4   │  84 54  T   │ 116 74  t
-  21 15     NAK │  53 35  5   │  85 55  U   │ 117 75  u
-  22 16     SYN │  54 36  6   │  86 56  V   │ 118 76  v
-  23 17     ETB │  55 37  7   │  87 57  W   │ 119 77  w
-  24 18     CAN │  56 38  8   │  88 58  X   │ 120 78  x
-  25 19     EM  │  57 39  9   │  89 59  Y   │ 121 79  y
-  26 1A     SUB │  58 3A  :   │  90 5A  Z   │ 122 7A  z
-  27 1B     ESC │  59 3B  ;   │  91 5B  [   │ 123 7B  {
-  28 1C     FS  │  60 3C  <   │  92 5C  \   │ 124 7C  |
-  29 1D     GS  │  61 3D  =   │  93 5D  ]   │ 125 7D  }
-  30 1E     RS  │  62 3E  >   │  94 5E  ^   │ 126 7E  ~
-  31 1F     US  │  63 3F  ?   │  95 5F  _   │ 127 7F  DEL
-
- ASCII character set (continued)
-
- dec hex char dec hex char dec hex char dec hex char
- ╤ ╤ ╤
- 128 80  Ç   │ 160 A0  á   │ 192 C0  └   │ 224 E0  α
- 129 81  ü   │ 161 A1  í   │ 193 C1  ┴   │ 225 E1  ß
- 130 82  é   │ 162 A2  ó   │ 194 C2  ┬   │ 226 E2  Γ
- 131 83  â   │ 163 A3  ú   │ 195 C3  ├   │ 227 E3  π
- 132 84  ä   │ 164 A4  ñ   │ 196 C4  ─   │ 228 E4  Σ
- 133 85  à   │ 165 A5  Ñ   │ 197 C5  ┼   │ 229 E5  σ
- 134 86  å   │ 166 A6  ª   │ 198 C6  ╞   │ 230 E6  µ
- 135 87  ç   │ 167 A7  º   │ 199 C7  ╟   │ 231 E7  τ
- 136 88  ê   │ 168 A8  ¿   │ 200 C8  ╚   │ 232 E8  Φ
- 137 89  ë   │ 169 A9  ⌐   │ 201 C9  ╔   │ 233 E9  Θ
- 138 8A  è   │ 170 AA  ¬   │ 202 CA  ╩   │ 234 EA  Ω
- 139 8B  ï   │ 171 AB  ½   │ 203 CB  ╦   │ 235 EB  δ
- 140 8C  î   │ 172 AC  ¼   │ 204 CC  ╠   │ 236 EC  ∞
- 141 8D  ì   │ 173 AD  ¡   │ 205 CD  ═   │ 237 ED  φ
- 142 8E  Ä   │ 174 AE  «   │ 206 CE  ╬   │ 238 EE  ε
- 143 8F  Å   │ 175 AF  »   │ 207 CF  ╧   │ 239 EF  ∩
- 144 90  É   │ 176 B0  ░   │ 208 D0  ╨   │ 240 F0  ≡
- 145 91  æ   │ 177 B1  ▒   │ 209 D1  ╤   │ 241 F1  ±
- 146 92  Æ   │ 178 B2  ▓   │ 210 D2  ╥   │ 242 F2  ≥
- 147 93  ô   │ 179 B3  │   │ 211 D3  ╙   │ 243 F3  ≤
- 148 94  ö   │ 180 B4  ┤   │ 212 D4  ╘   │ 244 F4  ⌠
- 149 95  ò   │ 181 B5  ╡   │ 213 D5  ╒   │ 245 F5  ⌡
- 150 96  û   │ 182 B6  ╢   │ 214 D6  ╓   │ 246 F6  ÷
- 151 97  ù   │ 183 B7  ╖   │ 215 D7  ╫   │ 247 F7  ≈
- 152 98  ÿ   │ 184 B8  ╕   │ 216 D8  ╪   │ 248 F8  °
- 153 99  Ö   │ 185 B9  ╣   │ 217 D9  ┘   │ 249 F9  ∙
- 154 9A  Ü   │ 186 BA  ║   │ 218 DA  ┌   │ 250 FA  ·
- 155 9B  ¢   │ 187 BB  ╗   │ 219 DB  █   │ 251 FB  √
- 156 9C  £   │ 188 BC  ╝   │ 220 DC  ▄   │ 252 FC  ⁿ
- 157 9D  ¥   │ 189 BD  ╜   │ 221 DD  ▌   │ 253 FD  ²
- 158 9E  ₧   │ 190 BE  ╛   │ 222 DE  ▐   │ 254 FE  ■
- 159 9F  ƒ   │ 191 BF  ┐   │ 223 DF  ▀   │ 255 FF
-
- Section 17.0 Appendix II
-
-
- 17.0 Appendix II
-
- Major differences between QTAwk and Awk.
-
- 1. Expanded Regular Expressions
- All of the Awk regular expression operators are allowed plus
- the following:
- a) complemented character class using the Awk notation,
- '[^...]', as well as the Awk/QTAwk and C logical negation
- operator, '[!...]'.
-
- b) Matched character classes, '[#...]'. These classes are
- used in pairs. The position of the character matched in
- the first class of the pair, determines the character
- which must match in the position occupied by the second
- class of the pair.
-
- c) Look-ahead Operator. r@t regular expression r is matched
- only when followed by regular expression t.
-
- d) Repetition Operator. r{n1,n2} at least n1 and up to n2
- repetitions of regular expression r. 1 <= n1 <= n2
-
- e) Named Expressions.
- {named_expr} is replaced by the string value of the
- corresponding variable.
-
- 2. Consistent statement termination syntax. The QTAwk Utility
- Creation Tool utilizes the semi-colon, ';', to terminate all
- statements. The practice in Awk of using newlines to
- "sometimes" terminate statements is no longer allowed.
-
- 3. Expanded Operator Set
- The Awk set of operators has been changed to more closely
- match those of C. The Awk match operator, '~', has been
- changed to '~~' so that the similarity between the match
- operators, '~~' and '!~', and the equality operators, '==' and
- '!=', is complete. The single tilde symbol, '~', reverts to
- the C one's complement operator, an addition to the operator
- set over Awk. An explicit string concatenation operator has
- also been introduced. The "new" operators in QTAwk are:
-
- Operation Operator
- tag $$
- one's complement ~
- concatenation ∩
- shift left/right << >>
- matching ~~ !~
- bit-wise AND &
- bit-wise XOR @
- bit-wise OR |
- sequence ,
-
- The caret, '^', remains as the exponentiation operator. The
- symbol '@' is used for the exclusive OR operator.
-
- 4. Expanded set of recognized constants in QTAwk utilities:
- a) decimal integers,
- b) octal integers,
- c) hexadecimal integers,
- d) character constants, and
- e) floating point constants.
-
- 5. Expanded Pre-defined patterns giving more control:
- a) INITIAL - similar to BEGIN. Actions executed after
- opening each input file and before reading first record.
- b) FINAL - similar to END. Actions executed after reading
- last record of each input file and before closing file.
- c) NOMATCH - actions executed for each input record for
- which no pattern was matched.
- d) GROUP - used to group multiple regular expressions for
- search optimization. Can speed search by a factor of six.
-
- 6. True multi-dimensional arrays
- The use of the comma in index expressions to simulate
- multiple array indices is no longer supported. True multiple
- indices are supported. Indexing is in the C manner,
- 'a[i1][i2]'. The SUBSEP built-in variable of Awk has been
- dropped since it is no longer necessary.
-
- 7. Integer array indices as well as string indices
- Array indices have been expanded to include integers as well
- as the string indices of Awk. Indices are not automatically
- converted to strings as in Awk. Thus, for true integer
- indices, the index ordering follows the numeric sequence with
- an integer index value of '10' following a value of '2'
- instead of preceding it.
-
- 8. Arrays integrated into QTAwk
- QTAwk integrates arrays with arithmetic operators so that
- the operations are carried out on the entire array. QTAwk
- also integrates arrays into user-defined functions so that
- they can be passed to and returned from such functions in a
- natural and intuitive manner. Awk does not allow returning
- arrays from user-defined functions or allow arithmetic
- operators to operate on whole arrays.
-
- 9. NEW keywords:
-
- a) cycle
- similar to 'next' except that the current record may be
- used in restarting the outer pattern matching loop.
- b) deletea
- similar to 'delete' except that ALL array values are
- deleted.
- c) switch, case, default
- similar to C syntax with the allowed 'switch' and 'case'
- values expanded to include any legal QTAwk expression,
- evaluated at run-time. The expressions may evaluate to
- any value including any numeric value, string or regular
- expression.
- d) local
- new keyword to allow the declaration and use of local
- variables within compound statements, including
- user-defined functions. Its use in user defined functions
- instead of the Awk practice of defining excess formal
- parameters, leads to easier to read and maintain
- functions. The C 'practice' of allowing initialization in
- the 'local' statement is followed.
- e) endfile
- similar to 'exit'. Simulates end of current input file
- only, any remaining input files are still processed.
-
- 10. Expanded arithmetic functions
- QTAwk includes 18 built-in arithmetic functions. All of the
- functions supported by Awk plus the following:
- a) acos(x)
- b) asin(x)
- c) cosh(x)
- d) fract(x)
- e) log10(x)
- f) pi() or pi
- g) sinh(x)
-
- 11. Expanded string functions
- QTAwk includes 33 built-in string functions. All of the
- functions supported by Awk plus the following:
- a) center(s,w) or center(s,w,c)
- b) copies(s,n)
- c) deletec(s,p,n)
- d) insert(s1,s2,p)
- e) justify(a,n,w) or justify(a,n,w,c)
- f) overlay(s1,s2,p)
- g) remove(s,c)
- h) replace(s)
- i) sdate(fmt)
- j) srange(c1,c2)
- k) srev(s)
- l) stime(fmt)
- m) stran(s) or stran(s,st) or stran(s,st,sf)
- n) strim(s) or strim(s,c) or strim(s,c,d)
- o) strlwr(s)
- p) strupr(s)
-
- 12. New Miscellaneous functions
- a) The function 'rotate(a)' is provided to rotate the
- elements of the array a.
- b) execute(s) or execute(s,se) or execute(s,se,rf) - execute
- string s
- c) execute(a) or execute(a,se) or execute(a,se,rf) - execute
- array a
- d) pd_sym - access pre-defined symbols
- e) ud_sym - access user defined symbols
-
- 13. New I/O functions
- I/O function syntax has been made consistent with syntax of
- other functions. The redirection operators, '<', '>' and
- '>>', and pipeline operator, '|', have been deleted as
- excessively error prone in expressions. The functional syntax
- of the 'getline' function has been made identical to that of
- the other built-in functions. The new functions 'fgetline',
- 'fprint' and 'fprintf' have been introduced for reading and
- writing to files other than the current input file. The new
- functions 'getc()' and 'fgetc()' have been introduced for
- single character input.
-
- 14. Expanded capability of formatted Output.
- The limited output formatting available with the Awk 'printf'
- function has been expanded by adopting the complete output
- format specification of the draft ANSI C standard.
-
- 15. Use of 'local' keyword
- The 'local' keyword has been introduced to allow for
- variables local to user-defined functions (and any compound
- statement). This expansion makes the Awk practice of defining
- 'extra' formal parameters no longer necessary.
-
- 16. Expanded user-defined functions
- With the 'local' keyword, QTAwk allows the user to define
- functions that may accept a variable number of arguments.
- Functions, such as finding the minimum/maximum of a variable
- number of variables, are possible with one function rather
- than defining separate functions for each possible
- combination of arguments.
-
- 17. User controlled trace capability
- A user controlled statement trace capability has been added.
- This gives the user a simple to use mechanism to trace
- utility execution. Rather than adding 'print' statements,
- merely re-defining the value of a built-in variable will give
- utility execution trace information, including utility line
- number.
-
- 18. Expanded built-in variable list
- With 30 built-in variables, QTAwk includes all (with the
- exception of SUBSEP) of the built-in variables of Awk plus
- the following:
- a) _arg_chk - used to determine whether to check number of
- arguments passed to user-defined functions.
- b) ARGI - index value in ARGV of next command line argument.
- Gives more control of command line argument processing.
- c) CYCLE_COUNT - count number of outer loop cycles with
- current input record.
- d) DEGREES - if TRUE, trigonometric functions assume degree
- values, radians if FALSE.
- e) ENVIRON - array of environment strings passed to QTAwk
- f) FALSE - pre-defined with constant value, 0.
- g) TRUE - predefined with constant value, 1
- h) LONGEST_EXP - used to control whether the longest or the
- first string matching a regular expression is found.
- i) MAX_CYCLE - maximum number of outer loop cycles permitted
- with current input record.
- j) NG - equal to the number of the regular expression in a
- group matching a string in the current input record.
- k) RETAIN_FS - if TRUE the original characters separating
- the fields of the current input record are retained
- whenever a field is changed, causing the input record to
- be re-constructed. If FALSE the output field separator,
- OFS, is used to separate fields in the current input
- record during reconstruction. The latter practice is the
- only method available in Awk.
- l) TRACE - value used to determine utility statement
- tracing.
- m) TRANS_FROM/TRANS_TO - strings used by 'stran' function if
- second and/or third arguments not specified.
- n) CLENGTH - similar to 'RLENGTH' of Awk. Set whenever a
- 'case' value evaluates to a regular expression.
- o) CSTART - similar to 'RSTART' of Awk. Set whenever a
- 'case' value evaluates to a regular expression.
- p) MLENGTH - similar to 'RLENGTH' of Awk. Set whenever a
- stand-alone regular expression is encountered in
- evaluating a pattern.
- q) MSTART - similar to 'RSTART' of Awk. Set whenever a
- stand-alone regular expression is encountered in
- evaluating a pattern.
- r) vargc - used only in user-defined functions defined with
- a variable number of arguments. At runtime, set equal to
- the actual number of variable arguments passed.
- s) vargv - used only in user-defined functions defined with
- a variable number of arguments. At runtime, a singly
- dimensioned array with each element set to the argument
- actually passed.
-
- 19. Definition of built-in variable, RS, expanded to include
- string form. If RS is set to a string longer than one character,
- then the string is interpreted as a regular expression and any
- string matching the regular expression becomes a record separator.
-
- 20. In QTAwk, setting built-in variable, "FILENAME", to another
- value will change the current input file. Setting the
- variable in Awk, has no effect on current input file.
-
- 21. Corrected admitted problems with Awk. The problems
- mentioned on page 182 of "The Awk Programming Language" have
- been corrected. Specifically: 1) true multi-dimensional
- arrays have been implemented, 2) the 'getline' syntax has
- been made to match that of other functions, 3) declaring
- local variables in user-defined functions has been corrected,
- 4) intervening blanks are allowed between the function call
- name and the opening parenthesis (in fact, under QTAwk it is
- permissible to have no opening parenthesis or argument list
- for user-defined functions that have been defined with no
- formal arguments).
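-
- The ordering difference described in item 7 can be seen in any
- language that distinguishes string and integer comparison. The
- Python lines below are only an illustration of that comparison
- rule, with hypothetical index values; they are not QTAwk code and
- say nothing about how QTAwk stores its arrays.

```python
# The same three index values, once as strings and once as integers.
string_indices = ["1", "2", "10"]
integer_indices = [1, 2, 10]

by_string = sorted(string_indices)    # compared character by character
by_integer = sorted(integer_indices)  # compared numerically

print(by_string)
print(by_integer)
```

- Compared as strings, "10" sorts before "2" because '1' precedes
- '2'; compared as integers, 10 follows 2.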
-
- Section 18.0 Appendix III
-
-
- 18.0 Appendix III
-
-
- The following QTAwk utility is designed to search C source code
- files for keywords defined in the ANSI C standard. It is included
- here to illustrate the use of the 'GROUP' keyword.
-
- # QTAwk utility to scan C source files for keywords
- # defined in the ANSI C standard keywords:
- # macro or function names defined in the standard
- # types or constants defined in the standard
- #
- # program to illustrate GROUP pattern keyword
- #
- # input: C source file
- # output: all lines containing ANSI C standard defined keywords
- #
- # use 'GROUP' pattern keyword to form one large GROUP of
- # patterns to speed search. Only two actions defined:
- # 1) action to print macro or function names
- # 2) action to print types or constants
- #
- #
- BEGIN {
- #
- # ANSI C key words
- #
- # expression for leader
- ldr = /(^|[\s\t])/;
- # opening function parenthesis - look-ahead to find
- o_p = /@[\s\t]*\(/;
- #
- # define strings for formatted output
- #
- tls = "Total Lines Scanned: %lu\n";
- tlc = "Total Lines containing macro/function names: %lu\n";
- tlt = "Total Lines containing type/constant names: %lu\n";
- }
- #
- #
- # Following are macro or functions names as defined
- # by ANSI C standard
- #
- # 1
- GROUP /{ldr}assert{o_p}/
- # 2
- # Following regular expression split across 2 lines
- # for documentation only.
- GROUP /{ldr}is(al(num|pha)|cntrl|x?digit|graph|
- p(rint|unct)|space|(low|upp)er){o_p}/
- # 3
- GROUP /{ldr}to(low|upp)er{o_p}/
- # 4
- GROUP /{ldr}set(locale|v?buf){o_p}/
- # 5
- GROUP /{ldr}a(cos|sin|tan2?|bort){o_p}/
- # 6
- GROUP /{ldr}(cos|sin|tan)h?{o_p}/
- # 7
- GROUP /{ldr}(fr|ld)?exp{o_p}/
- # 8
- GROUP /{ldr}log(10)?{o_p}/
- # 9
- GROUP /{ldr}modf{o_p}/
- # 10
- GROUP /{ldr}pow{o_p}/
- # 11
- GROUP /{ldr}sqrt{o_p}/
- # 12
- GROUP /{ldr}ceil{o_p}/
- # 13
- GROUP /{ldr}(f|l)?abs{o_p}/
- # 14
- GROUP /{ldr}f(loor|mod){o_p}/
- # 15
- GROUP /{ldr}jmp_buf{o_p}/
- # 16
- GROUP /{ldr}(set|long)jmp{o_p}/
- # 17
- GROUP /{ldr}signal{o_p}/
- # 18
- GROUP /{ldr}raise{o_p}/
- # 19
- GROUP /{ldr}va_(arg|end|list|start){o_p}/
- # 20
- GROUP /{ldr}re(move|name|wind){o_p}/
- # 21
- GROUP /{ldr}tmp(file|nam){o_p}/
- # 22
- GROUP /{ldr}(v?[fs])?printf{o_p}/
- # 23
- GROUP /{ldr}[fs]?scanf{o_p}/
- # 24
- GROUP /{ldr}f?get(c(har)?|s|env){o_p}/
- # 25
- GROUP /{ldr}f?put(c(har)?|s){o_p}/
- # 26
- GROUP /{ldr}ungetc{o_p}/
- # 27
- # Following regular expression split across 2 lines
- # for documentation only.
- GROUP /{ldr}f(close|flush|(re)?open|read|write|
- [gs]etpos|seek|tell|eof|ree|pos_t){o_p}/
- # 28
- GROUP /{ldr}clearerr{o_p}/
- # 29
- GROUP /{ldr}[fp]error{o_p}/
- # 30
- GROUP /{ldr}ato[fil]{o_p}/
- # 31
- # Following regular expression split across 2 lines
- # for documentation only.
- GROUP /{ldr}str(to(d|k|u?l)|n?c(py|at|mp)|
- coll|r?chr|c?spn|pbrk|str|error|len){o_p}/
- # 32
- GROUP /{ldr}s?rand{o_p}/
- # 33
- GROUP /{ldr}(c|m|re)?alloc{o_p}/
- # 34
- GROUP /{ldr}_?exit{o_p}/
- # 35
- GROUP /{ldr}(f|mk|asc|c|gm|local|strf)?time{o_p}/ {
- printf("Macro/function\n%uE - %luR: %s\n%s\n",NG,FNR,$0,$$0);
- mf_count++;
- }
- #
- # following are types or constants
- #
- # 36
- GROUP /errno/
- # 37
- GROUP /NULL/
- # 38
- GROUP /offsetof/
- # 39
- GROUP /(fpos|ptrdiff|size|wchar)_t/
- # 41
- GROUP /NDEBUG/
- # 42
- GROUP /LC_(ALL|COLLATE|CTYPE|NUMERIC|TIME)/
- # 43
- GROUP /E(DOM|RANGE|OF)/
- # 44
- GROUP /HUGE_VAL/
- # 45
- GROUP /sig_atomic_t/
- # 46
- GROUP /SIG(_(DFL|ERR|IGN)|ABRT|FPE|ILL|INT|SEGV|TERM)/
- # 47
- GROUP /FILE/
- # 48
- GROUP /_IO[FLN]BF/
- # 49
- GROUP /BUFSIZ/
- # 50
- GROUP /L_tmpnam/
- # 51
- GROUP /(OPEN|RAND|TMP|U(CHAR|INT|LONG|SHRT))_MAX/
- # 52
- GROUP /SEEK_(CUR|END|SET)/
- # 53
- GROUP /std(err|in|out)/
- # 54
- GROUP /l?div_t/
- # 55
- GROUP /CLK_TCK/
- # 56
- GROUP /(clock|time)_t/
- # 57
- GROUP /tm_(sec|min|hour|[mwy]day|mon|year|isdst)/
- # 58
- GROUP /CHAR_(BIT|M(AX|IN))/
- # 59
- GROUP /(INT|LONG|S(CHAR|HRT))_(M(IN|AX))/
- # 60
- GROUP /(L?DBL|FLT)_((MANT_)?DIG|EPSILON|M(AX|IN)(_(10_)?EXP)?)/
- # 61
- GROUP /FLT_R(ADIX|OUNDS)/ {
- printf("type/constant\n%uE - %luR: %s\n%s\n",NG,FNR,$0,$$0);
- tc_count++;
- }
-
- FINAL {
- printf(tls,FNR);
- printf(tlc,mf_count);
- printf(tlt,tc_count);
- }
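-
- The idea behind GROUP, searching for many expressions in one pass
- and reporting which one matched, can be approximated in Python by
- joining the expressions into a single alternation of named groups.
- This is a sketch of the general technique only: how QTAwk builds
- and searches a group internally is not described here, the names
- LDR, O_P, NAMES and group_number are hypothetical, and only three
- of the expressions from the utility are reproduced.

```python
import re

LDR = r"(?:^|[ \t])"    # leader: start of line, blank or tab
O_P = r"(?=[ \t]*\()"   # look-ahead for the opening parenthesis

NAMES = [
    r"assert",            # 1
    r"to(?:low|upp)er",   # 2
    r"s?rand",            # 3
]

# One combined expression; group g<n> wraps the n-th alternative.
GROUP = re.compile("|".join(
    "(?P<g%d>%s%s%s)" % (n, LDR, name, O_P)
    for n, name in enumerate(NAMES, start=1)))

def group_number(line):
    """Return the number of the alternative that matched, or 0."""
    m = GROUP.search(line)
    if m is None:
        return 0
    return int(m.lastgroup[1:])

print(group_number("  x = toupper(c);"))
print(group_number("srand (1);"))
print(group_number("int assertion;"))
```

- The value returned plays the part that NG plays in the utility
- above: it identifies which expression of the group matched.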
-
- Section 19.0 Appendix IV
-
-
- 19.0 Appendix IV
-
- This is a complete copy of the data file, states.dta, used to
- illustrate QTAwk. The fields of a sample record for the default
- field separator FS = /{_z}+/ are shown below, followed by the
- fields for the field separator FS = /{_w}+[\#()]({_w}+|$)/
-
-
- Fields for Default FS = /{_z}+/
- 1. US -- country/continent name
- 2. # -- separator
- 3. 47750 -- area, square miles
- 4. # -- separator
- 5. 4515 -- population in thousands
- 6. # -- separator
- 7. LA -- abbreviation (US & Canada only)
- 8. # -- separator
- 9. Baton -- first half capital city name
- 10. Rouge -- second half capital city name
- 11. ( -- separator
- 12. Louisiana -- state/country name
- 13. ) -- Terminator
-
- Fields for FS = /[\s\t]+[\#()]([\s\t]+|$)/:
- 1. US -- country/continent name
- 2. 47750 -- area, square miles
- 3. 4515 -- population in thousands
- 4. LA -- abbreviation (US & Canada only)
- 5. Baton Rouge -- full capital city name
- 6. Louisiana -- state/country name
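-
- The two field lists above can be reproduced with a short Python
- sketch. It is an approximation, not QTAwk: blanks and tabs are
- assumed for the named expressions {_z} and {_w}, Python's re module
- replaces the QTAwk regular expression engine, and the helper name
- split_fields is hypothetical.

```python
import re

record = "US # 47750 # 4515 # LA # Baton Rouge ( Louisiana )"

# Default separator: runs of white space.
default_fields = record.split()

def split_fields(rec, fs):
    """Split rec on the regular expression fs, ignoring the empty
    piece left when the record ends with a separator."""
    fields = re.split(fs, rec)
    if fields and fields[-1] == "":
        fields.pop()
    return fields

# Separator: '#', '(' or ')' with surrounding blanks or end of record.
alt_fields = split_fields(record, r"[ \t]+[#()](?:[ \t]+|$)")

print(len(default_fields), default_fields)
print(len(alt_fields), alt_fields)
```

- With the second separator the capital city name stays in a single
- field, since a blank alone no longer ends a field.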
-
- US # 10461 # 4375 # MD # Annapolis ( Maryland )
- US # 40763 # 5630 # VA # Richmond ( Virgina )
- US # 2045 # 620 # DE # Dover ( Delaware )
- US # 24236 # 1995 # WV # Charleston ( West Virginia )
- US # 46047 # 12025 # PA # Harrisburg ( Pennsylvania )
- US # 7787 # 7555 # NJ # Trenton ( New Jersey )
- US # 52737 # 17895 # NY # Albany ( New York )
- US # 9614 # 535 # VT # Montpelier ( Vermont )
- US # 9278 # 975 # NH # Concord ( New Hampshire )
- US # 33265 # 1165 # ME # Augusta ( Maine )
- US # 8286 # 5820 # MA # Boston ( Massachusetts )
- US # 5019 # 3160 # CT # Hartford ( Conneticut )
- US # 1212 # 975 # RI # Providence ( Rhode Island )
- US # 52669 # 6180 # NC # Raleigh ( North Carolina )
- US # 31116 # 3325 # SC # Columbia ( South Carolina )
- US # 58914 # 5820 # GA # Atlanta ( Georgia )
- US # 51704 # 4015 # AL # Montgomery ( Alabama )
- US # 42143 # 4755 # TN # Nashville ( Tennessee )
- US # 40414 # 3780 # KY # Frankfort ( Kentucky )
- US # 58668 # 10925 # FL # Tallahassee ( Florida )
- US # 68139 # 4395 # WA # Olympia ( Washington )
- US # 412582 # 8985 # OR # Salem ( Oregon )
- US # 147045 # 830 # MT # Helena ( Montana )
- US # 83566 # 1020 # ID # Boise ( Idaho )
- US # 110562 # 945 # NV # Carson City ( Nevada )
- US # 84902 # 1690 # UT # Salt Lake City ( Utah )
- US # 97808 # 525 # WY # Cheyenne ( Wyoming )
- US # 104094 # 3210 # CO # Denver ( Colorado )
- US # 158704 # 25620 # CA # Sacramento ( California )
- US # 121594 # 1425 # NM # Sante Fe ( New Mexico )
- US # 114002 # 3040 # AZ # Phoenix ( Arizona )
- US # 70702 # 690 # ND # Bismark ( North Dakota )
- US # 77120 # 715 # SD # Pierre ( South Dakota )
- US # 77350 # 1615 # NE # Lincoln ( Nebraska )
- US # 82282 # 2450 # KS # Topeka ( Kansas )
- US # 69697 # 5040 # MO # Jefferson City ( Missouri )
- US # 69957 # 3375 # OK # Oklahoma City ( Oklahoma )
- US # 266805 # 16090 # TX # Austin ( Texas )
- US # 86614 # 4205 # MN # St Paul ( Minnesota )
- US # 56275 # 2970 # IA # Des Moines ( Iowa )
- US # 53191 # 2375 # AR # Little Rock ( Arkansas )
- US # 47750 # 4515 # LA # Baton Rouge ( Louisiana )
- US # 47691 # 2640 # MS # Jackson ( Mississippi )
- US # 57872 # 11620 # IL # Springfield ( Illinois )
- US # 66213 # 4800 # WI # Madison ( Wisconsin )
- US # 97107 # 9090 # MI # Lansing ( Michigan )
- US # 36417 # 5585 # IN # Indianapolis ( Indiana )
- US # 44786 # 10760 # OH # Columbus ( Ohio )
- US # 591004 # 515 # AK # Juneau ( Alaska )
- US # 6473 # 1045 # HI # Honolulu ( Hawaii )
- Canada # 255285 # 2370 # AB # Edmonton ( Alberta )
- Canada # 366255 # 2885 # BC # Victoria ( British Columbia )
- Canada # 251000 # 1060 # MB # Winnipeg ( Manitoba )
- Canada # 251700 # 1010 # SK # Regina ( Saskatchewan )
- Canada # 21425 # 875 # NS # Halifax ( Nova Scotia )
- Canada # 594860 # 6585 # PQ # Quebec ( Quebec )
- Canada # 2184 # 126 # PE # Charlottetown ( Prince Edward Island )
- Canada # 156185 # 585 # NF # St John's ( New Foundland )
- Canada # 28354 # 715 # NB # Fredericton ( New Brunswick )
- Canada # 412582 # 8985 # ON # Toronto ( Ontario )
- Canada # 1304903 # 51 # NW # Yellowknife ( Northwest Territories )
- Canada # 186300 # 23 # YU # Whitehorse ( Yukon Territory )
- Europe # 92100 # 14030 # Bonn ( West Germany )
- Europe # 211208 # 55020 # Paris ( France )
- Europe # 94092 # 56040 # London ( United Kingdom )
- Europe # 27136 # 3595 # Dublin ( Ireland )
- Europe # 194882 # 38515 # Madrid ( Spain )
- Europe # 116319 # 56940 # Rome ( Italy )
- Europe # 8600383 # 275590 # Moscow ( Russia )
- Europe # 120728 # 37055 # Warsaw ( Poland )
- Europe # 32377 # 7580 # Vienna ( Austria )
- Europe # 35921 # 10675 # Budapest ( Hungary )
-
-
- Section 20.0 Appendix V
-
-
- 20.0 Appendix V
-
- QTAwk error returns. When QTAwk encounters an error which it
- cannot correct, it generates and displays an error message in
- the format:
-
- 1: Error (xxxx): Error Message Text
- 2: From 'execute' Function.
- 3: Action File line: llll
- 4: Scanning File: utility filename
- 5: Line: llll
- 6: Record: rrrr
-
- Line 2 is generated only if the error occurred during execution
- of the 'execute' function. Lines 4 to 6 are displayed only if an
- input file is currently being scanned.
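A program that captures QTAwk's output can split such a report
into its parts. The following is a minimal Python sketch, not
part of QTAwk. The name parse_error_report is hypothetical, and
because it is not certain whether the labels "1:" to "6:" shown
above appear in the real output, the sketch accepts lines with
or without them.

```python
import re

# Hypothetical helper: turn a QTAwk error report into a dictionary.
# An optional leading "n:" label on each line is tolerated.
_LABEL = r"^\s*(?:\d+:\s*)?"
_PATTERNS = {
    "error": re.compile(_LABEL + r"Error \((\d+)\):\s*(.*)$"),
    "action_line": re.compile(_LABEL + r"Action File line:\s*(\d+)$"),
    "scan_file": re.compile(_LABEL + r"Scanning File:\s*(.+)$"),
    "line": re.compile(_LABEL + r"Line:\s*(\d+)$"),
    "record": re.compile(_LABEL + r"Record:\s*(\d+)$"),
}


def parse_error_report(text):
    """Collect the fields of one error report."""
    report = {"from_execute": False}
    for raw in text.splitlines():
        line = raw.rstrip()
        # Line 2 of the format carries no value, only its presence.
        if "From 'execute' Function" in line:
            report["from_execute"] = True
            continue
        for key, pattern in _PATTERNS.items():
            m = pattern.match(line)
            if not m:
                continue
            if key == "error":
                report["number"] = int(m.group(1))
                report["message"] = m.group(2)
            elif key == "scan_file":
                report[key] = m.group(1).strip()
            else:
                report[key] = int(m.group(1))
            break
    return report
```

Fields that QTAwk did not display (for example the scanning
lines when no input file is open) are simply absent from the
returned dictionary.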
-
- On a normal exit QTAwk returns a value of zero, 0, to PC/MS-DOS.
- This value may be set with the 'exit' statement. On encountering
- an error which generates an error message, QTAwk exits with a
- non-zero value between 1 and 6. The warning messages below will
- exit with a value of zero. The exit values generated on detecting
- an error are:
-
- 1. Warning Errors ==> 0 , error value < 1000
- 2. File Errors ==> 2 , 2000 <= error value < 3000
- 3. Regular Expression Errors ==> 3 , 3000 <= error value < 4000
- 4. Run-Time Errors ==> 4 , 4000 <= error value < 5000
- 5. Interpretation Errors ==> 5 , 5000 <= error value < 6000
- 6. Memory Error ==> 6 , 6000 <= error value < 7000
-
- The 'error value' range in the list above gives the range
- of the numeric value shown in the error message for that type of
- error.
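The correspondence between error values and exit values in the
list above can be written as a small function. This Python
sketch is illustrative only. The name exit_code_for is
hypothetical, and values the list does not cover (for example
1000 to 1999) are rejected rather than guessed.

```python
def exit_code_for(error_value):
    """Return the exit value the list assigns to an error number."""
    if 0 <= error_value < 1000:
        return 0                      # warnings exit with zero
    if 2000 <= error_value < 7000:
        return error_value // 1000    # 2 = file ... 6 = memory
    raise ValueError(
        "error value not covered by the list: %d" % error_value)
```

A batch file or calling program can use the same mapping in
reverse: an exit value of 3, for instance, means the error
number displayed was in the range 3000 to 3999.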
-
- The error number displayed on line 1 may be used to find the
- error diagnostic from the following listing.
-
- 1. Warning Errors
- 0
- Invalid Option.
- The only valid command line options are:
- -- -> to stop command line option scanning
- -f -> to specify a utility filename
- -F -> to specify the input record field separator.
-
- 10
- Warning, Attempt To Close File Not Open.
- An attempt has been made to close a file with the 'close'
- function, that is not currently open.
-
- 2. File Errors
- 2000
- 2010
- File Not Found: {filename}
- The filename given in the error message was
- specified on the command line. The file named does not
- exist. QTAwk displays this error message and terminates
- processing.
-
- 3. Regular Expression Errors
- 3000
- Stop pattern without a start
- The range pattern has the form:
-
- expression , expression
-
- The comma, ',', is used to separate the expressions of
- the pattern. The associated action is first executed when the
- first, or start, expression is TRUE. Execution continues for
- every input record up to and including the record for which
- the second, or stop, expression is TRUE. A comma, ',', has
- been found in a pattern without the first expression. This is
- usually caused by unbalanced braces, "{}". Check all
- prior braces to ensure that every left brace, '{', has an
- associated terminating right brace, '}'.
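The behavior of a range pattern described above can be modelled
outside QTAwk. The Python sketch below is an illustration, not
QTAwk's implementation. The name range_select is hypothetical,
and testing the stop expression on the same record that starts
the range is an assumption borrowed from ordinary awk.

```python
def range_select(records, start, stop):
    """Yield the records on which a 'start , stop' pattern fires."""
    active = False
    for rec in records:
        # A range opens on the first record satisfying 'start'.
        if not active and start(rec):
            active = True
        if active:
            yield rec
            # The record satisfying 'stop' is still included,
            # after which the range closes and may open again.
            if stop(rec):
                active = False
```

With records numbered 1 to 10, a start test of "equals 3" and a
stop test of "equals 5", the records selected are 3, 4 and 5.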
-
- 3010
- Already have a stop pattern
- The range pattern has the form: expression , expression
- The comma, ',', is used to separate the expressions of
- the pattern. The associated action is first executed when the
- first, or start, expression is TRUE. Execution continues for
- every input record up to and including the record for which
- the second, or stop, expression is TRUE. A second comma, ',',
- has been found in a pattern. This may be caused by unbalanced
- braces, as for error number 3000 above. A second cause may
- stem from the fact that new patterns for pattern/action
- pairs must be separated from previous patterns by a
- new-line if no action, i.e., the default action, is
- associated with the previous pattern.
-
- 4. Run-Time Errors
- 4000
- Command Line Variable Set - Not Used.
- Only variables defined in the QTAwk utility may be set on
- the command line with the form "variable = value".
-
- 4010
- Missing Opening Parenthesis On 'while'.
- The proper syntax for the 'while' statement is:
-
- while ( conditional_expression ) statement
-
- The left parenthesis, '(', starting the conditional
- expression was not found following the 'while' keyword.
- Check that the syntax conforms to the form above.
-
- 4020
- Missing Opening Parenthesis On 'switch'.
- The form of the 'switch' construct is:
-
- switch ( switch_expression ) statement
-
- The left parenthesis, '(', was not found following the
- 'switch' keyword.
-
- 4030
- Unable to compile regular Expression
- QTAwk was unable to convert a regular expression to
- internal form. Please contact the QTAwk author with
- information on the circumstances of this error message.
-
- 4040
- Internal Array Error.
- Internal error. Please contact the QTAwk author with
- information on the circumstances of this error message.
-
- 4050
- pre/post '++' or '--' Needs a Variable.
- The pre/post ++/-- operators operate on variables only.
- This error is usually generated because of an incorrect
- understanding of the precedence rules. The operator was
- associated differently by QTAwk when the utility line was
- parsed than the user expected. Check the precedence rules and the
- syntax of the line cited.
-
- 4060
- '$$' will accept '0' argument only.
- The '$$' operator assumes the value of the string matched
- by the last explicit or implicit match operator, '~~' or
- '!~'. Implicit matching is done in patterns. The only
- value which is permissible for the '$$' operator is zero.
-
- 4070
- Undefined Symbol.
- A symbol has been found which QTAwk does not recognize.
- This error should not occur and represents an internal
- error. Please contact the QTAwk author with information
- on the circumstances of this error message.
-
- 4080
- Internal Error #200
- Internal error. Please contact the QTAwk author with
- information on the circumstances of this error message.
-
- 4090
- Attempt To Delete Non-existent Array Element.
- The 'delete' statement was followed with a reference to
- an array element that does not exist.
-
- 4100
- Internal GROUP parse error 1001.
- Internal error. Please contact the QTAwk author with
- information on the circumstances of this error message.
-
- 4100
- Warning, Attempt To Close File Not Successful.
- An attempt has been made to close a file with the 'close'
- function. The close action has not been successful,
- usually because the file named does not exist. Check the
- name specified.
-
- 4110
- 'strim' Function Result Exceeds Limits.
- The built-in function 'strim' has been called with a
- string to trim which exceeds the maximum limit of 4096
- characters.
-
- 4120
- Cannot Nest 'execute' Function.
- The 'execute' function cannot be called from within a
- string/array being executed by this function. An attempt has
- been made to do this. Check the string/array which was
- executed.
-
- 4130
- '(g)sub' Function Result Exceeds Limits.
- The function 'sub' or 'gsub' has been called to replace
- matching strings and the resultant string after
- replacement would exceed the limit of 4096 characters.
-
- 4140
- Missing ')' for Function Call.
- A built-in function has been called with a left
- parenthesis starting the argument list, but no right
- parenthesis terminating the argument list. Check the line
- in question.
-
- 4150
- [sf]printf functions take a minimum of 1 argument.
- The first argument of the 'fprintf' and 'sprintf'
- functions is necessary to specify the file or string,
- respectively, as the target for the output string.
-
- 4160
- [sf]printf needs format string as first argument
- The 'fprintf' and 'sprintf' functions need a format
- string which specifies the output. The format string is
- the second argument and must be specified for these
- functions.
-
- 4170
- 4180
- 4190
- Format Specifications Exceed Arguments To Print.
- The 'printf', 'fprintf' and 'sprintf' functions use a
- format string to control the output. Certain character
- strings in the format control the output of numerics and
- embedded strings. There must be exactly one extra
- argument for each of these character control strings.
- This error occurs when there are more control strings
- than extra arguments.
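The comparison behind this error can be pictured as counting
the control strings in the format and comparing the count with
the extra arguments supplied. The Python below is a rough model
only. The names count_specs and specs_exceed_args are
hypothetical, and the C-style set of flags the pattern accepts
is an assumption, not a statement of what QTAwk recognizes.

```python
import re

# Matches "%%" (a literal percent sign) or one C-style conversion
# such as "%d", "%-8s" or "%6.2f". The accepted flags are assumed.
_SPEC = re.compile(r"%%|%[-+ #0]*\d*(?:\.\d+)?[A-Za-z]")


def count_specs(fmt):
    """Count the conversions in fmt that consume an argument."""
    return sum(1 for m in _SPEC.finditer(fmt) if m.group() != "%%")


def specs_exceed_args(fmt, *args):
    """True when the format asks for more arguments than given."""
    return count_specs(fmt) > len(args)
```

For example, the format "%s = %d" contains two control strings,
so passing it with a single extra argument is the situation the
error describes.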
-
- 4220
- Third Argument For '(g)sub' Function Must Be A Variable.
- The optional third argument of the 'sub' and 'gsub'
- functions must be a variable. The string value of this
- variable is replaced after string substitution has been
- accomplished.
-
- 4230
- Excessive Length Specified 'substr' Function.
- The form of the 'substr' function is: substr(s,p[,n]).
- The third argument is optional, but if specified cannot
- exceed 4096.
-
- 4240
- Start Position Specified Too Great, 'substr' Function.
- The form of the 'substr' function is: substr(s,p[,n]).
- The second argument cannot exceed 4096.
-
- 4250
- Incorrect Time Format.
- The form of the time function is: stime(fmt) where fmt is
- converted to an integer and must be in the range:
-
- 0 <= fmt <= 4
-
- 4260
- Incorrect Date Format.
- The form of the date function is: sdate(fmt) where fmt is
- converted to an integer and must be in the range:
-
- 0 <= fmt <= 16
-
- 4270
- 'rotate' Function Needs Array Member As Argument.
- The argument for the 'rotate' function must be an array.
- If a variable is used, make sure that it is an array when
- the function is called.
-
- 4280
- Excessive Width Specified 'center' Function.
- The second argument specifies the width of the line in
- which to center the string value of the first argument.
- The width specified cannot exceed 4096.
-
- 4290
- Excessive Copies Specified 'copies' Function.
- The second argument of the 'copies' function specifies
- the number of copies of the string value of the first
- argument to return. The number of copies specified cannot
- exceed 65,536. See error number 4300 below also.
-
- 4300
- 'copies' Function Return Exceeds Limits.
- The 'copies' function returns the string value of the
- first argument, copied the number of times specified by
- the second argument. The total length of the returned
- string:
-
- arg2 * length(arg1)
-
- cannot exceed 4096 characters.
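The two limits on 'copies' (errors 4290 and 4300) can be seen
together in a small model. This Python sketch only mirrors the
limits stated in this appendix. The exception class
QTAwkLimitError is hypothetical, and the order in which the two
checks are made is an assumption.

```python
MAX_STRING = 4096    # longest string the manual allows
MAX_COUNT = 65536    # largest copy count the manual allows


class QTAwkLimitError(Exception):
    """Hypothetical exception carrying the manual's error number."""

    def __init__(self, number, message):
        super().__init__("Error (%d): %s" % (number, message))
        self.number = number


def copies(text, count):
    """Model of 'copies': text repeated count times, with limits."""
    if count > MAX_COUNT:
        raise QTAwkLimitError(
            4290, "Excessive Copies Specified 'copies' Function.")
    if count * len(text) > MAX_STRING:
        raise QTAwkLimitError(
            4300, "'copies' Function Return Exceeds Limits.")
    return text * count
```

A one-character string may therefore be copied at most 4096
times, while a 64-character string may be copied at most 64
times.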
-
- 4310
- Excessive Characters Specified 'deletec' Function.
- The 'deletec' function deletes the number of characters
- specified by the third argument starting at the position
- specified by the second argument from the string value of
- the first argument. The form of the function is:
-
- deletec(string,start,num)
-
- The number of characters specified to delete, 'num',
- cannot exceed 65,536. If 'num' is zero or exceeds the
- number of characters remaining in the string from the
- start position, then the remainder of the string is
- deleted. See also error 4320 below.
-
- 4320
- Excessive Characters Specified 'deletec' Function.
- The 'deletec' function deletes the number of characters
- specified by the third argument starting at the position
- specified by the second argument from the string value of
- the first argument. The form of the function is:
-
- deletec(string,start,num)
-
- If the start is negative or greater than the length of the
- string value of the first argument, then no characters
- are deleted.
-
- 4330
- 'deletec' Intermediate Result Exceeds Limits.
- The 'deletec' function deletes the number of characters
- specified by the third argument starting at the position
- specified by the second argument from the string value of
- the first argument. The form of the function is:
-
- deletec(string,start,num)
-
- If the length of the string value of the first argument
- exceeds 4096 then this error is triggered.
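The rules quoted for 'deletec' under errors 4310, 4320 and 4330
can be gathered into one model. The Python sketch below is not
QTAwk's code and rests on assumptions: positions count from 1,
a start of zero is treated like a negative start, and a negative
'num', which this appendix does not describe, is rejected. The
appendix does not state exactly what triggers error 4320, so
the sketch does not raise it.

```python
def deletec(string, start, num):
    """Model of 'deletec' with 1-based positions (an assumption)."""
    if num < 0:
        raise ValueError("negative 'num' is not described")
    if num > 65536:
        raise ValueError("Error (4310): too many characters")
    if len(string) > 4096:
        raise ValueError("Error (4330): string exceeds limits")
    if start < 1 or start > len(string):
        return string              # out-of-range start: no deletion
    begin = start - 1              # convert to a 0-based index
    remaining = len(string) - begin
    if num == 0 or num > remaining:
        return string[:begin]      # delete the rest of the string
    return string[:begin] + string[begin + num:]
```

Under these assumptions deletec("abcdef",2,3) gives "aef", and
deletec("abcdef",4,0) gives "abc".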
-
- 4340
- Start Position Specified Too Great, 'insert' Function.
- The 'insert' function inserts the string value of the
- second argument into the string value of the first
- argument, starting at the position specified by the third
- argument. The form of the function is:
-
- insert(string1,string2,start)
-
- The third argument cannot exceed 65,536. If start
- exceeds the length of the string value of 'string1', then
- the string value of 'string2' is concatenated onto the
- string value of 'string1'.
-
- 4350
- 'insert' Function Intermediate Result Exceeds Limits.
- The 'insert' function inserts the string value of the
- second argument into the string value of the first
- argument, starting at the position specified by the third
- argument. The form of the function is:
-
- insert(string1,string2,start)
-
- The length of the string value of 'string1' cannot
- exceed 4096. The result of inserting 'string2' into
- 'string1' also cannot exceed 4096. See error number
- 4360 below.
-
- 4360
- 'insert' Function Return Exceeds Limits.
- The 'insert' function inserts the string value of the
- second argument into the string value of the first
- argument, starting at the position specified by the third
- argument. The form of the function is:
-
- insert(string1,string2,start)
-
- The length of the string value of 'string1' cannot
- exceed 4096. The result of inserting 'string2' into
- 'string1' also cannot exceed 4096. See error number
- 4350 above.
-
- 4370
- Start Position Specified Too Great, 'overlay' Function.
- The 'overlay' function overlays the string value of the
- second argument into the string value of the first
- argument, starting at the position specified by the third
- argument. The form of the function is:
-
- overlay(string1,string2,start)
-
- The third argument cannot exceed 65,536. If start
- exceeds the length of the string value of 'string1', then
- blanks are appended to 'string1' to create a string of
- length 'start'. The second string is then concatenated to
- this string. See also error numbers 4380, 4390, and 4400
- below.
-
- 4380
- 'overlay' Function Result Exceeds Limits.
- The 'overlay' function overlays the string value of the
- second argument into the string value of the first
- argument, starting at the position specified by the third
- argument. The form of the function is:
-
- overlay(string1,string2,start)
-
- The third argument cannot exceed 4096. If start exceeds
- the length of the string value of 'string1', then blanks
- are appended to 'string1' to create a string of length
- 'start'. The second string is then concatenated to this
- string. See also error number 4370 above and 4390 and
- 4400 below.
-
- 4390
- 'overlay' Function Intermediate Result Exceeds Limits.
- The 'overlay' function overlays the string value of the
- second argument into the string value of the first
- argument, starting at the position specified by the third
- argument. The form of the function is:
-
- overlay(string1,string2,start)
-
- The length of the string value of 'string1' cannot
- exceed 4096 characters. See also error number 4370 and
- 4380 above and 4400 below.
-
- 4400
- 'overlay' Function Result Exceeds Limits.
- The 'overlay' function overlays the string value of the
- second argument into the string value of the first
- argument, starting at the position specified by the third
- argument. The form of the function is:
-
- overlay(string1,string2,start)
-
- The length of the resultant string after overlaying
- 'string2' onto 'string1' cannot exceed 4096. See also
- error numbers 4370, 4380, and 4390 above.
-
- 4410
- 'remove' Function Intermediate Result Exceeds Limits.
- The 'remove' function removes all characters specified by
- the second argument from the string value of the first
- argument. The form of the function is:
-
- remove(string,char)
-
- The length of 'string' before any characters are removed
- cannot exceed 4096.
-
- 4420
- Excessive Width Specified 'justify' Function.
- The 'justify' function forms a string from the elements
- of the array specified by the first argument. The string
- will have a length specified by the integer value of the
- third argument and will be formed from the number of
- array elements specified by the second argument. Any
- padding characters necessary between array elements can
- be specified by the optional fourth argument. The form of
- the function is:
-
- justify(array_var,count,width [,pad_char] );
-
- The width specified cannot exceed 65,536. See also error
- number 4430 below.
-
- 4430
- Excessive Number Of Array Elements Specified 'justify'
- Function.
- The 'justify' function forms a string from the elements
- of the array specified by the first argument. The string
- will have a length specified by the integer value of the
- third argument and will be formed from the number of
- array elements specified by the second argument. Any
- padding characters necessary between array elements can
- be specified by the optional fourth argument. The form of
- the function is:
-
- justify(array_var,count,width [,pad_char] );
-
- The count of array elements to use cannot exceed 65,536.
- See also error number 4420 above.
-
- 4440
- Bad Function Call - Internal Error.
- An internal error has occurred in calling a built-in
- function. Please contact the QTAwk author with
- information on the circumstances of this error.
-
- 4450
- Missing ')' for Function Call.
- A user-defined function has been called with an argument
- list and no right parenthesis, ')', terminating the
- argument list.
-
- 4460
- More Arguments For Function Than Defined. Function:
- {User_Function_Name}.
- More arguments are passed to the user defined function
- named in the error message than were defined for the
- function. Check the user function name or the definition
- of the function for necessary extra arguments.
-
- 4470
- Less Arguments For User Function Than Defined. Function:
- '{User_Function_Name}'.
- Fewer arguments are passed to the user defined function
- named in the error message than were defined for the
- function. This error message is generated ONLY if the
- built-in variable '_arg_chk' has a TRUE value. Variables
- local to a user-defined function should be defined with
- the 'local' keyword.
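Errors 4460 and 4470 together describe a simple rule for
checking the argument count of a call to a user-defined
function. The Python sketch below restates that rule and is not
QTAwk's code. The name check_call is hypothetical, and the
parameter arg_chk stands in for the built-in variable
'_arg_chk'.

```python
def check_call(defined, passed, arg_chk=False):
    """Return the error number for a call, or None if it is valid.

    defined -- number of parameters in the function definition
    passed  -- number of arguments supplied in the call
    arg_chk -- stands in for the built-in variable '_arg_chk'
    """
    if passed > defined:
        return 4460          # too many arguments: always an error
    if passed < defined and arg_chk:
        return 4470          # too few: only when '_arg_chk' is TRUE
    return None
```

With '_arg_chk' left FALSE, a call that passes fewer arguments
than were defined is accepted. Setting it TRUE reports such
calls.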
-
- 4480
- Constant Passed For Function Array Parameter.
- A parameter to a user defined function used as an array
- within the function cannot be passed a constant value.
- Only a variable can be passed for this parameter. If the
- statement where the variable is indexed as an array is
- executed, the variable will be an array upon return from
- the function.
-
- 4490
- Internal Error - Misalignment Of Local List ( ).
- This is an internal QTAwk error. It should ideally never
- happen. If this error message is generated, please
- contact the QTAwk author with information on the
- circumstances.
-
- 4500
- Cannot Assign Array To Array Element.
- Arrays can be assigned to variables, however, it is an
- error to attempt to assign an array to a single element
- of another array.
-
- 4510
- Array Cannot Operate on Scalar.
- A scalar may operate on an array, but the reverse is not
- true.
-
- 4520
- Assignment Operator needs a Variable on left.
- The assignment operator, '=', or any of the
- operator/assignment operators, 'op=', only operate on a
- variable to the left of the operator.
-
- 4530
- Stack Underflow
- Internal stack error. Please contact the QTAwk author
- with information on the circumstances of this error
- message.
-
- 5. Interpretation Errors
- 5000
- Expecting Filename After 'f' Option.
- QTAwk utility files are specified on the command line
- with the 'f' option. The filename of the utility
- immediately follows the 'f' flag. A blank between the
- flag and the filename is optional. This message is
- generated when no arguments follow the 'f' flag.
-
- 5010
- Unable To Compile Regular Expression
- The input record field separator may be specified on the
- command line with the 'F' option. If the string specified
- for FS is longer than a single character, a regular
- expression is assumed. This error message is generated
- when QTAwk is unable to convert the string into a regular
- expression internal form for whatever reason.
-
- 5020
- '-F' command line option specified more than once.
- The command line 'F' option to specify the input record
- field separator, FS, may be specified only once.
-
- 5030
- Internal Error #3.
- Internal error. Please contact the QTAwk author with
- information on the circumstances of this error message.
-
- 5040
- Internal Error #2.
- Internal error. Please contact the QTAwk author with
- information on the circumstances of this error message.
-
- 5050
- BEGIN/END/NOMATCH/INITIAL/FINAL Patterns or User Function
- Require An Action.
- The pre-defined patterns:
-
- BEGIN
- INITIAL
- NOMATCH
- FINAL
- END
-
- must have actions associated with them. The brace
- opening the action must be on the same line as the
- pre-defined pattern.
-
- 5060
- Exceeded Internal Stack Size on Scan.
- The internal stack for containing parsed tokens has been
- exceeded. Attempt to simplify the utility in the area
- where this error occurred.
-
- 5070
- Underflow Internal Stack on Scan.
- This is an internal error. If this error occurs please
- contact the QTAwk author with information on the
- circumstances of this error message.
-
- 5080
- Missing ')' For Function Call.
- A user-defined function argument list must be terminated
- with a right parenthesis, ')'. A symbol has been found
- which cannot be part of the argument list and is not a
- right parenthesis.
-
- 5090
- Function Call Without Parenthisized Argument List.
- A user defined function definition must include an
- argument list. The argument list may be empty, e.g.,
- "()", if there are no formal arguments.
-
- 5100
- 'fprint' Function Takes A minimum Of 1 Argument.
- The 'fprint' built-in function must have at least the
- name of the output file specified.
-
- 5110
- printf and 'sprintf' Functions Take A Minimum Of 1
- Argument.
- These functions must have at least a format string
- defined.
-
- 5120
- 'fprintf' Function Needs A Minimum of Two Arguments.
- This function needs an output file name and a format
- string.
-
- 5130
- Second Argument Of 'fgetline' Has To Be A Variable.
- If two arguments are specified for the 'fgetline'
- built-in function, the second must be a variable.
-
- 5140
- Argument Of 'getline' Has To Be A Variable.
- If an argument for the 'getline' function is specified,
- it must be a variable.
-
- 5150
- split Function Needs Variable Name As Second Argument.
- The second argument for the 'split' function must
- be a variable. The pieces into which the first argument
- is split will be returned as array elements of the
- variable specified.
-
- 5160
- 'rotate' Function Needs Variable As Argument.
- The argument of the 'rotate' function has to be an array
- variable.
-
- 5170
- 'justify' Function Needs Variable As First Argument.
- The format of the 'justify' built-in function is:
-
- justify(a,n,w)
-
- or
-
- justify(a,n,w,c)
-
- The first argument, 'a', must be an array variable. The
- first n elements of the array are concatenated to form a
- string 'w' characters long. A single space is used to
- separate the concatenated elements. If the optional fourth
- argument is specified, it is converted to a character
- value and used to separate the elements.
-
- 5180
- '[pu]_sym' Function Needs Variable As Second Argument.
- The second argument must be a variable whose value can be
- changed to equal the string value of the name variable
- specified.
-
- 5190
- Bad Function Call
- Internal QTAwk error. Please contact the QTAwk author
- with information on the circumstances of this error.
-
- 5200
- Improper Number Of Arguments, {Function_Name} Function.
- The built-in function specified has been called with an
- improper number of arguments for the function. Check the
- user manual for the correct use of the intended function.
-
- 5210
- Need Variable On Left Side Of Assignment.
- In an assignment statement of the form:
-
- variable = expression;
-
- a variable must be specified on the left side of the
- assignment operator to receive the value of the
- expression on the right side of the operator.
-
- 5220
- Conditional Expression Error - Missing ':'
- The form of the conditional expression is:
-
- test_expression ? expression_1 : expression_2;
-
- test_expression is evaluated; if the result is TRUE
- (non-zero numeric or non-null string), expression_1 is
- evaluated and the value becomes the value of the
- conditional expression. If the value of test_expression
- is FALSE (zero numeric or null string), expression_2 is
- evaluated and the value becomes the value of the
- conditional expression.
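The evaluation rule just described can be restated in a few
lines. This Python sketch is a model of the rule as worded
here, not of QTAwk's internals. The names is_true and
conditional are hypothetical, and the two branch expressions
are passed as callables so that only the selected one is
evaluated.

```python
def is_true(value):
    """TRUE is a non-zero number or a non-null string."""
    if isinstance(value, str):
        return value != ""
    return value != 0


def conditional(test, expr_1, expr_2):
    """Evaluate and return only the branch selected by test."""
    return expr_1() if is_true(test) else expr_2()
```

By the rule as stated, a zero numeric and a null string both
select expression_2, while any non-null string selects
expression_1.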
-
- 5230
- 'in' Operator Needs Array As Right Operand
- The form of the 'in' operator is:
-
- expression in array_var
-
- The operand to the right of 'in', array_var here, has to
- be a variable. If the variable is not an array, then the
- value of the expression is FALSE.
-
- 5240
- Missing ')' in Expression Grouping.
- An expression has been scanned with unbalanced
- parentheses. Check for a missing terminating right
- parenthesis.
-
- 5250
- Pre-Increment/Decrement Operators Need Variable.
- The increment and decrement operators, '++' and '--',
- only operate on variables. An instance has been found in
- which the operator has been used as a pre-fix operator on
- something other than a variable. Check that grouping has
- not changed a post-fix operator into a pre-fix operator.
-
- 5260
- Undefined Symbol.
- A symbol has been found which matches no defined QTAwk
- syntax. This usually, but not always, occurs when the
- terminating semi-colon, ';', has been left off a
- statement.
-
- 5270
- Need Variable for Array Reference
- A left bracket for indexing an array has been
- encountered. However, the preceding symbol was not a
- variable. Only variables may be arrays and indexed.
-
- 5280
- Missing Index For Array
- A left bracket for indexing an array has been
- encountered. However, the index expression is missing, as in
- 'var[]'. A null index is not allowed in QTAwk.
-
- 5290
- Missing ']' Terminating array index.
- A left bracket and an index expression for indexing an
- array have been encountered. However, the right bracket
- terminating the index expression was not recognized.
- Check that the array index follows the form:
-
- var[index_expression]
-
- 5300
- Post-Increment/Decrement Operators Need Variable.
- The increment and decrement operators, '++' and '--',
- only operate on variables. An instance has been found in
- which the operator has been used as a post-fix operator
- on something other than a variable. Check that grouping
- has not changed a pre-fix operator into a post-fix
- operator.
-
- 5310
- 'if' Keyword - No Expression To Test.
- The proper syntax for the 'if' statement is:
-
- if ( conditional_expression ) statement
-
- The left parenthesis, '(', starting the conditional
- expression was not found following the 'if' keyword.
- Check that the syntax conforms to the form above.
-
- 5320
- 'if' Keyword - No Terminating ')' On Test Expression.
- The proper syntax for the 'if' statement is:
-
- if ( conditional_expression ) statement
-
- The right parenthesis, ')', terminating the conditional
- expression was not found. Check that the syntax conforms
- to the form above.
-
- 5330
- 'while' Keyword - No Terminating ')' On Test Expression.
- The proper syntax for the 'while' statement is:
-
- while ( conditional_expression ) statement
-
- The right parenthesis, ')', terminating the conditional
- expression was not found. Check that the syntax conforms
- to the form above.
-
- 5340
- Missing 'while' Part Of 'do'.
- The proper syntax for the 'do' statement is:
-
- do statement while ( conditional_expression );
-
- The 'while' keyword was not found following the
- statement portion. Check whether a left brace, '{', starting
- a compound statement has been deleted, or whether a keyword
- has been misused as a variable.
-
- 5350
- Missing '(' On 'while' Part Of 'do'.
- The proper syntax for the 'do' statement is:
-
- do statement while ( conditional_expression );
-
- The left parenthesis, '(', starting the conditional
- expression was not found following the 'while' keyword.
- Check that the syntax conforms to the form above.
-
- 5360
- Missing ')' On 'while' Part Of 'do'.
- The proper syntax for the 'do' statement is:
-
- do statement while ( conditional_expression );
-
- The right parenthesis, ')', terminating the conditional
- expression was not found. Check that the syntax conforms
- to the form above.
-
- 5370
- Missing ';' Terminating 'do - while'.
- The proper syntax for the 'do' statement is:
-
- do statement while ( conditional_expression );
-
- Note the semicolon following the right parenthesis
- terminating the conditional expression. The semicolon is
- necessary here.
-
- 5380
- Missing Opening Parenthesis On 'for'.
- The proper syntax for the 'for' statement is:
-
- for ( initial_expression ; conditional_expression ;
- loop_expression )
- statement
-
- or
-
- for ( variable_name in array_name ) statement
-
- The left parenthesis, '(', was not found following the
- 'for' keyword. Check that the syntax conforms to the form
- above.
-
- 5390
- 5400
- 5420
- Improper Syntax - 'for' Conditional.
- The proper syntax for the 'for' statement is:
-
- for ( initial_expression ; conditional_expression ;
- loop_expression )
- statement
-
- One of the semicolons separating the three expressions
- or the terminating right parenthesis was not found. Check
- that the syntax follows the form above.
-
- 5410
- 'in' Operator Needs Variable As Left Operand in 'for'
- Expression.
- The proper syntax for the 'for' statement is:
-
- for ( variable_name in array_name ) statement
-
- The symbol following the left parenthesis and preceding
- the 'in' keyword must be a valid variable name.
-
- 5430
- break/continue Keyword Outside Of Loop.
- Either of these keywords must be used inside of a
- 'while', 'for' or 'do' loop. In addition, the 'break'
- statement may be used inside a 'switch-case' construct to
- terminate execution flow. One of the keywords has been
- found outside of such a construct. Check for an imbalance
- of braces, '{}', enclosing compound statements.
-
- 5440
- 'return' Statement Outside Of User Function.
- The 'return' statement may only be used inside of a
- user-defined function to terminate execution of the
- function and cause execution to return to the place where
- the function was called. The 'return' keyword was
- encountered outside of the definition of such a function.
- Check for the use of the keyword as a variable or for
- unbalanced braces, '{}', enclosing the statements of the
- function.
-
- 5450
- Exceeded Limits on Number of Local Variable Definitions
- (1).
- QTAwk places a limit of 256 local variables within any
- compound statement. An attempt has been made to define
- more local variables than this limit allows.
-
- 5460
- No Variables Defined With 'local' Keyword.
- A local variable definition using the 'local' keyword
- has the form:
-
- local var1, var2 = optional_expression;
-
- The 'local' keyword was encountered followed immediately
- by a semicolon. Check that the syntax follows the above
- form.
-
- 5470
- 'switch' Keyword - No Terminating ')' On Expression.
- The form of the 'switch' construct is:
-
- switch ( switch_expression ) statement
-
- The right parenthesis, ')', terminating the
- switch_expression was not found.
-
- 5480
- 'case/default Statement Without Switch Statement.
- The 'case' keyword is used within the 'switch' statement
- to specify case expressions to which execution should
- transfer after matching the switch expression. A 'case'
- keyword was found outside of the 'switch' statement.
- Check for the use of the keyword as a variable or for
- unbalanced braces enclosing a compound 'switch'
- statement.
-
- 5490
- Multiple 'default' Statements in 'switch'.
- The 'default' keyword is used within a 'switch' statement
- to specify a transfer point at which execution should proceed
- when the switch_expression fails to match any
- case_expression. Only one 'default' transfer point is
- allowed per 'switch' statement. Check for possible
- unbalanced braces, '{}', enclosing a compound statement
- in previous 'case' statements.
-
- 5500
- Missing ':' Following Expression On Case Label.
- The form of the 'case' statement is:
-
- case case_expression:
-
- A colon, ':', must terminate the case expression. QTAwk
- did not find the terminating colon.
-
- 5510
- Need Variable For 'delete' Reference
- The form of the 'delete' statement is:
-
- delete variable_name;
-
- or
-
- delete (variable_name);
-
- or
-
- delete variable_name[index];
-
- or
-
- delete (variable_name[index]);
-
- where variable_name must be a global or local variable.
-
- 5520
- 'deletea' Statement Variable Cannot Be Indexed.
- The form of the 'deletea' statement is:
-
- deletea variable_name;
-
- or
-
- deletea (variable_name);
-
- where variable_name must be a global or local variable and
- cannot be indexed.
-
- 5530
- Need Variable For 'deletea' Reference
- The form of the 'deletea' statement is:
-
- deletea variable_name;
-
- or
-
- deletea (variable_name);
-
- where variable_name must be a global or local variable.
-
- 5540
- No ';' Terminating Statement.
- All statements in QTAwk are terminated by a semicolon.
- The terminating semicolon was not found by QTAwk.
-
- 5550
- Internal Compilation Error - Action Strings.
- This is a QTAwk internal error that should never happen.
- If this error message is encountered, please contact the
- QTAwk author with information on the circumstances of
- this error.
-
- 5560
- Error On Single Line Action. No Termination.
- In parsing/compiling an action entered from the command
- line or by executing the 'execute' built-in function, the
- end of the line was reached without reaching the end of
- the action expression(s). This is typically caused by a
- missing right brace, '}', or by unbalanced braces (more
- left braces than right braces).
-
- 5570
- Too Many User Functions Defined.
- QTAwk currently has a limit of 256 user defined
- functions. The current utility has attempted to define
- more than that limit. Please contact the QTAwk author
- with information on the circumstances of this error
- message.
-
- 5580
- Exceeded Limits on Number of Local Variable Definitions
- (2).
- QTAwk currently has a limit of 256 'local' variables
- defined within any single compound statement. The
- current utility has attempted to define more than that
- limit. Please contact the QTAwk author with information
- on the circumstances of this error message.
-
- 5590
- Expecting Function Name To Follow 'function' Keyword In
- Pattern.
- The 'function' keyword has been encountered in a pattern
- without a function name immediately following. This
- syntax error may be corrected by inserting the missing
- name or by removing the function keyword from the
- pattern.
-
- 5600
- Multi-Defined Function Name.
- The name supplied for a user defined function has been
- used previously. The current usage attempts to redefine
- the name. Change either the first use of the name or the
- present one.
-
- 5610
- Unexpected Symbol - Function Argument List Definition.
- A user defined function has been encountered with the
- accompanying list defining the passed argument names. The
- form of the list is a variable name followed by 1) a
- comma and more names, 2) an ellipsis, '...', followed by a
- right parenthesis, or 3) a right parenthesis ending the
- list. A symbol other than a comma or right parenthesis
- has been found following a variable name.
-
- 5620
- Expecting ')' To Terminate Function Parameter List.
- A user defined function has been encountered with the
- accompanying list defining the passed argument names. The
- form of the list is a variable name followed by 1) a
- comma and more names, 2) an ellipsis, '...', followed by a
- right parenthesis, or 3) a right parenthesis ending the
- list. A symbol other than the right parenthesis has been
- found following the ellipsis.
-
- 5630
- Unexpected Symbol - Function Argument List Definition.
- A user defined function has been encountered with the
- accompanying list defining the passed argument names. The
- form of the list is a variable name followed by 1) a
- comma and more names, 2) an ellipsis, '...', followed by a
- right parenthesis, or 3) a right parenthesis ending the
- list. A symbol other than a comma or right parenthesis
- has been found following a variable name.
-
- 5640
- Expecting Parenthesized Argument Definition List For
- Function.
- A user defined function has the following form:
-
- function function_name ( argument_list )
-
- The left parenthesis of the argument list was not found.
-
- 5650
- Improper Syntax - Improper Ending For Pattern
- A pattern expression must be ended by: 1) a comma (the
- first expression in a range expression only), 2) the left
- brace, '{', starting the associated action, 3) an
- End-of-File, or 4) a new line. A symbol other than those
- above has been encountered.
-
- 5660
- GROUP Pattern Only Accepts a Regular Expression, a String
- or a Variable.
- The GROUP pattern keyword may only be followed by a
- regular expression constant, a string constant or a
- variable. A symbol other than one of these three has been
- encountered.
-
- 5670
- Internal Parse Error: 1001.
- Internal parser error. Please contact the QTAwk author
- with information on the circumstances of this error
- message.
-
- 5680
- Local Variable With Reserved Name.
- An attempt has been made to define a local variable in
- either a user defined function argument list or with the
- 'local' keyword, with a name equal to a reserved word.
-
- 5690
- Improper Use of Keyword.
- A pattern keyword has been encountered in an action
- statement.
-
- 5700
- User Function Variable Argument Keyword Outside Of User
- Function.
- The two predefined local variables: vargc and vargv can
- only be used within user defined functions which have
- been defined with a variable length argument list using
- the ellipsis, '...'. One of these variables has been
- encountered outside of a user defined function.
-
- 5710
- Variable Argument Keyword In User Function Defined
- Without Variable Number Of Arguments.
- The two predefined local variables: vargc and vargv can
- only be used within user defined functions which have
- been defined with a variable length argument list using
- the ellipsis, '...'. One of these variables has been
- encountered inside of a user defined function which was
- not defined with a variable length argument list.
-
- 5720
- Internal Error - Variable Argument List Variable.
- The two predefined local variables: vargc and vargv can
- only be used within user defined functions which have
- been defined with a variable length argument list using
- the ellipsis, '...'. One of these variables has been
- previously defined as a local variable within the current
- compound statement.
-
- 5730
- Internal Error - Variable Argument List Variable.
- The two predefined local variables: vargc and vargv can
- only be used within user defined functions which have
- been defined with a variable length argument list using
- the ellipsis, '...'. One of these variables has been
- previously defined as a global variable.
-
- 5740
- Internal Parse Error: 1002.
- Internal parser error. Please contact the QTAwk author
- with information on the circumstances of this error
- message.
-
- 5750
- Internal Parse Error: 1003.
- Internal parser error. Please contact the QTAwk author
- with information on the circumstances of this error
- message.
-
- 5760
- Empty Regular Expression.
- A regular expression must have some characters between
- the beginning and ending slashes. A regular expression
- has been encountered with none.
-
- 5770
- Regular Expression - No Terminating /.
- A regular expression constant must be contained on one
- line and be terminated by a slash. A regular expression
- has been found with no terminating slash before
- encountering a new line.
-
- 5780
- Internal Parse Error: 1004.
- Internal parser error. Please contact the QTAwk author
- with information on the circumstances of this error
- message.
-
- 5790
- String Constant - No Terminating ".
- A string constant must be contained on one line and be
- terminated by a double quote. A string constant has been
- found with no terminating double quote before
- encountering a new line.
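-
- Because a string constant may not span lines, an open
- double quote at the end of a line is enough to locate the
- problem. The following Python sketch is not part of QTAwk;
- the function name unterminated_string_lines is
- hypothetical, and it assumes that a backslash inside a
- string constant escapes the next character. It does not
- recognize double quotes inside regular expressions or
- character constants, so such lines may be reported
- wrongly.

```python
def unterminated_string_lines(source):
    """Return the 1-based numbers of lines that end inside a string constant."""
    bad = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        in_string = False
        escaped = False
        for ch in line:
            if escaped:
                escaped = False          # character after a backslash
            elif in_string and ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = not in_string
        if in_string:                    # quote still open at end of line
            bad.append(lineno)
    return bad
```

- Each reported line number is a candidate location for the
- missing double quote.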
-
- 5800
- Internal Parse Error: 1005.
- Internal parser error. Please contact the QTAwk author
- with information on the circumstances of this error
- message.
-
- 5810
- Character Constant - No Terminating '.
- A character constant must be contained on one line and be
- terminated by a single quote. A character constant has
- been found with no terminating single quote before
- encountering a new line.
-
- 5820
- Character Constant Longer Than One Character
- A character constant is a single character bounded by
- single quotes as in 'A'. Escape sequences may also be
- used for specifying the character for a character
- constant, e.g., '\f' or '\x0c' or '\014' are three ways
- to specify a single form feed character. This error
- reports that an attempt has been made to use single
- quotes to bound more than a single character.
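-
- The rule can be expressed as a pattern: between the single
- quotes there is either one ordinary character or one
- escape sequence. The following Python sketch is not part
- of QTAwk; the name is_single_char_constant is
- hypothetical, and the set of escapes it accepts (letter
- escapes, a backslash followed by x and hexadecimal digits,
- and a backslash followed by one to three octal digits) is
- an assumption modeled on C rather than the exact set that
- QTAwk recognizes.

```python
import re

ORDINARY = r"[^'\\]"             # one character, not a quote or backslash
LETTER = r"\\[abfnrtv\\'\"]"     # letter escape such as \f (assumed set)
HEX = r"\\x[0-9A-Fa-f]+"         # hexadecimal escape such as \x0c
OCTAL = r"\\[0-7]{1,3}"          # octal escape such as \014

CHAR_CONSTANT = re.compile(f"'(?:{ORDINARY}|{LETTER}|{HEX}|{OCTAL})'")


def is_single_char_constant(text):
    """True if text is a quoted constant that denotes exactly one character."""
    return CHAR_CONSTANT.fullmatch(text) is not None
```

- Under these assumptions 'A' and the escaped forms are
- accepted, while 'AB' is rejected as longer than one
- character.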
-
- 5830
- Lexical Error - Illegal '.'
- Periods are used only in floating point numerics, e.g.,
- 0.88 or .33, or in user defined function definitions to
- indicate a variable number of arguments, e.g.,
-
- function max(...) {
-
- A period has been found which does not match either of
- these uses.
-
- 5840
- Lexical Error
- A character has been read which does not fit any syntax
- for a valid utility.
-
- 5850
- Exceeded Max. Limits On Number Of Variables.
- A maximum of 256 global variables may be defined in any
- single QTAwk utility.
-
- 6. Memory Errors
- 6000
- Out of Memory (n: , s: )
- The QTAwk utility has used all available memory and
- attempted to exceed that limit. It is recommended that
- the utility be made shorter, or split into multiple
- utilities run separately.
-
- 6010
- Insufficient Memory.
- The QTAwk utility has used all available memory and
- attempted to exceed that limit. It is recommended that
- the utility be made shorter, or split into multiple
- utilities run separately.
-
- 6020
- 6030
- Action Too Long
- An action has been defined which exceeds the limits set
- for the internal length. The maximum length for the
- internal form of any action is 409,600 characters.
-
- 6040
- 6050
- 6060
- Out of Memory
- The QTAwk utility has used all available memory and
- attempted to exceed that limit. It is recommended that
- the utility be made shorter, or split into multiple
- utilities run separately.
-
- Table of Contents
-
- QTAwk License ............................................... iii
- == Registration Information ................................. iii
- == Upgrade Information ...................................... iii
- == QTAwk License Agreement ................................... iv
- QTAwk 4.20 Order Form ....................................... vii
- == Order Information ....................................... viii
- == International Orders: ................................... viii
- == Company Purchase Orders: ................................ viii
- == Multi-System Licenses: .................................. viii
- Update History ............................................... ix
- Introduction ................................................. xi
-
- 1.0 TUTORIAL ................................................ 1-1
- 1.1 Data .................................................... 1-1
- 1.2 Running QTAwk ........................................... 1-2
-
- 2.0 REGULAR EXPRESSIONS ..................................... 2-1
- 2.1 'OR' Operator ........................................... 2-2
- 2.2 Character Classes ....................................... 2-2
- 2.3 Closure ................................................. 2-4
- 2.4 Repetition Operator ..................................... 2-6
- 2.5 Escape Sequences ........................................ 2-8
- 2.6 Position Operators ...................................... 2-9
- 2.7 Examples ................................................ 2-9
- 2.8 Look Ahead Operator .................................... 2-11
- 2.9 Match Classes .......................................... 2-11
- 2.10 Named Expressions ..................................... 2-12
- 2.11 Predefined Names ...................................... 2-15
- 2.12 Operator Summary ...................................... 2-17
-
- 3.0 EXPRESSIONS ............................................. 3-1
- 3.1 New/Changed Operators ................................... 3-2
- 3.2 Sequence Operator ....................................... 3-4
- 3.3 Match Operator Variables ................................ 3-5
- 3.4 Constants ............................................... 3-5
-
- 4.0 STRINGS and REGULAR EXPRESSIONS ......................... 4-1
- 4.1 Regular Expression and String Translation ............... 4-1
- 4.2 Regular Expressions in Patterns ......................... 4-1
-
- 5.0 PATTERN-ACTIONS ......................................... 5-1
- 5.1 QTAwk Patterns .......................................... 5-1
- 5.2 QTAwk Predefined Patterns ............................... 5-2
-
- 6.0 VARIABLES and ARRAYS .................................... 6-1
- 6.1 QTAwk Arrays ............................................ 6-2
- 6.2 QTAwk Arrays in Arithmetic Expressions .................. 6-3
-
- 7.0 GROUP PATTERNS .......................................... 7-1
- 7.1 GROUP Pattern Advantage ................................. 7-1
- 7.2 GROUP Pattern Disadvantage .............................. 7-1
- 7.3 GROUP Pattern Regular Expressions ....................... 7-2
-
- 8.0 STATEMENTS .............................................. 8-1
- 8.1 QTAwk Keywords .......................................... 8-1
- 8.2 'cycle' and 'next' ...................................... 8-1
- 8.3 'delete' and 'deletea' .................................. 8-3
- 8.4 'if'/'else' ............................................. 8-5
- 8.5 'in' .................................................... 8-5
- 8.6 'switch', 'case', 'default' ............................. 8-5
- 8.7 Loops ................................................... 8-7
- 8.8 'while' ................................................. 8-7
- 8.9 'for' ................................................... 8-7
- 8.10 'do'/'while' ........................................... 8-8
- 8.11 'local' ................................................ 8-8
- 8.12 'endfile' .............................................. 8-9
- 8.13 'break' ............................................... 8-10
- 8.14 'continue' ............................................ 8-10
- 8.15 'exit opt_expr_list' .................................. 8-10
- 8.16 'return opt_expr_list' ................................ 8-10
-
- 9.0 BUILT-IN FUNCTIONS ...................................... 9-1
- 9.1 Arithmetic Functions .................................... 9-1
- 9.2 String Functions ........................................ 9-3
- 9.3 I/O Functions ........................................... 9-9
- 9.4 Miscellaneous Functions ................................ 9-11
- 9.4.1 Expression Type ...................................... 9-11
- 9.4.2 Execute String ....................................... 9-11
- 9.4.3 Array Function ....................................... 9-13
- 9.4.4 System Control Function .............................. 9-14
- 9.4.5 Variable Access ...................................... 9-14
-
- 10.0 FORMAT SPECIFICATION .................................. 10-1
- 10.1 Output Types .......................................... 10-2
- 10.2 Output Flags .......................................... 10-3
- 10.3 Output Width .......................................... 10-4
- 10.4 Output Precision ...................................... 10-4
-
- 11.0 USER-DEFINED FUNCTIONS ................................ 11-1
- 11.1 Local Variables ....................................... 11-1
- 11.2 Argument Checking ..................................... 11-1
- 11.3 Variable Length Argument Lists ........................ 11-2
- 11.4 Null Argument List .................................... 11-3
- 11.5 Arrays and User-Defined Functions ..................... 11-3
-
- 12.0 TRACE STATEMENTS ...................................... 12-1
- 12.1 Selective Statement Tracing ........................... 12-1
- 12.2 Trace Output .......................................... 12-1
-
- 13.0 BUILT-IN VARIABLES .................................... 13-1
- 13.1 User Function Variable Argument Lists ................. 13-5
-
- 14.0 COMMAND LINE INVOCATION ............................... 14-1
- 14.1 Multiple QTAwk Utilities .............................. 14-1
- 14.2 Setting the Field Separator ........................... 14-2
- 14.3 Setting Variables on the Command Line ................. 14-2
- 14.4 QTAwk Execution Sequence .............................. 14-3
-
- 15.0 LIMITS ................................................ 15-1
-
- 16.0 Appendix I ............................................ 16-1
-
- 17.0 Appendix II ........................................... 17-1
-
- 18.0 Appendix III .......................................... 18-1
-
- 19.0 Appendix IV ........................................... 19-1
-
- 20.0 Appendix V ............................................ 20-1
-