-
- Trex documentation and tutorial
- ------------------------------------------------------------
-
-
- Introduction
- ------------------------------------------------------------
-
- Trex (pronounced T-rex, just like the film star) stands for
- Tiny REgular eXpression filter. It can be useful on its
- own, but is really meant to give you the opportunity of
- playing with regular expressions if you're not yet familiar
- with them. Trex is a VR companion program. You may freely
- distribute it along with the contents of the VR archive,
- and you may use it as long as you wish without ever
- registering VR. You may _not_ distribute TREX and this
- documentation file without the rest of the files in the VR
- distribution archive.
-
-
- What does it do ?
- ------------------------------------------------------------
-
- Trex checks text lines that it gets on input (typically
- via redirection from a file) and outputs only those parts
- that match certain patterns that you specify. It can be
- used to perform searches, or to filter out certain elements
- from a text.
-
-
- Examples and tutorial
- ------------------------------------------------------------
-
- Trex must be run from the DOS command-line prompt. The normal
- way for it to get input is via the DOS redirection
- facility, using the '<' character. Let's have an example.
-
- [NOTE: If you have an HP100LX, you can use MEMO or FILER to
- view this file (TREX.DOC), and start a DOS shell by hitting
- <Ctrl>-<123>. In this way, you can switch back and forth
- between reading this text and actually trying Trex out.]
-
- Say you want to search for the string 'you' in this file.
- You would issue the following command:
-
- trex "you" <trex.doc
-
- You will notice that Trex outputs a number of 'you' lines.
- Now that's not particularly useful - you probably want to
- see some context ! Try this:
-
- trex ".*you.*" <trex.doc
-
- What's the difference ? Remember, Trex always outputs
- exactly what 'matched' your description. If you instructed
- it to search for 'you', and it found a line containing this
- pattern, it will just output the match. The second command
- instructs Trex to match any characters before and after the
- string 'you', too (and consequently to output them).
-
- How does it work ? The '.' character is special to Trex.
- It matches any character. '*' has special meaning too: it
- indicates: match zero or more of the previous 'pattern'.
- In this case the 'previous pattern' was the '.', saying
- 'any character is OK'. So, '.*' means: 'whatever - I don't
- care'. '.*you.*' means: I don't care how the line starts or
- ends, but there should be 'you' in it, and the whole line
- should be printed.
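
For readers who would like to try the same idea outside DOS, here is a rough Python sketch of the difference between the two commands above. It assumes that Trex's '.' and '*' behave like those in Python's `re` module and that only the first match on each line is reported; neither is confirmed by this document. `matched_parts` is a hypothetical helper name, not part of Trex.

```python
import re

def matched_parts(pattern, lines):
    """Return only the matched text of each line that contains a match.

    Assumption: one match per line is reported (the first one found).
    """
    out = []
    for line in lines:
        m = re.search(pattern, line)
        if m:
            out.append(m.group(0))
    return out

sample = ["Say you want to search", "no match here", "thank you"]

# Bare pattern: only the matched word itself comes back.
bare = matched_parts("you", sample)

# '.*' on both sides widens the match to the whole line.
with_context = matched_parts(".*you.*", sample)

print(bare)
print(with_context)
```

The first list holds just the matched word, the second holds the complete lines, which mirrors the 'context' trick described above.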
-
-
- Before we continue with our tutorial, here's an overview of
- the characters that have special meaning for Trex:
-
-
- Trex special characters
- ------------------------------------------------------------
-
- '.' : any character.
- '*' : zero or more occurrences
- '+' : one or more occurrences
- '?' : zero or one occurrence
- '|' : 'or', separates alternatives
- '[' and ']' : used to define ranges
- '(' and ')' : used for grouping
- '^' : beginning of line
- '$' : end of line
- '\' : next character 'as-is'
-
-
- Don't worry if this list looks incomprehensible at first -
- we'll look at them in turn, with more examples. Let's do
- some experimenting: try to figure out what this command
- does before actually trying it:
-
- trex "^ +.trex.*" <trex.doc
-
- That was harder. This matches (and prints) every line that
- starts with at least one space, then any single character,
- then the string 'trex'. It picks out the trex examples we
- saw up to now. Let's analyze how it works. The '^' ensures
- that the match starts at the beginning of the line. The '+'
- following the space character indicates that at least one
- space (possibly more) has to be present. The '.' accepts any
- one character in front of 'trex'. The '.*' behind 'trex' is
- not required to _find_ the line, but to output it entirely.
- (You know that trick by now.)
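
The pattern above can be checked against a few sample lines with a short Python sketch. It assumes Python's `re` semantics, which may differ from Trex's, and the sample lines are made up for illustration. Under these semantics a line with exactly one leading space is not matched, because the '.' needs one more character between the spaces and 'trex'.

```python
import re

pattern = r"^ +.trex.*"

lines = [
    '    trex "you" <trex.doc',   # indented example line
    "Trex checks text lines",     # no leading space
    " trex",                      # one space only, nothing left for '.'
]

# Keep the full matched text of every line the pattern accepts.
matches = [m.group(0) for m in (re.search(pattern, line) for line in lines) if m]

print(matches)
```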
-
- Can you find out what Trex command would print out all lines
- ending in ':' in this file ? Here's the solution:
-
- trex ".*:$" <trex.doc
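
A small Python sketch of the same end-of-line match, again assuming Python's `re` semantics rather than anything this document states about Trex. The sample lines are invented for the illustration.

```python
import re

pattern = r".*:$"

lines = [
    "You would issue the following command:",
    "Say you want to search",
    "could use:",
    "a colon: in the middle",
]

# '$' only succeeds at the end of the line, so the ':' must be last.
ending_in_colon = [line for line in lines if re.search(pattern, line)]

print(ending_in_colon)
```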
-
- Let's have a look at ranges now. Suppose you want to find
- numerical values in a file. You could use this:
-
- trex "[0123456789]+" <trex.doc
-
- The '[' and ']' simply enclose a list of valid characters.
- (Digits, in this case). The trailing '+' indicates that
- Trex should try to match as many of them as possible, but at
- least one.
-
- There's an easier way to do this. You can use a dash in a
- range to specify 'all characters in between'. Like this:
-
- trex "[0-9]+" <trex.doc
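
To see that the long and the short form of the range pick out the same text, here is a minimal Python sketch. It assumes Python's `re` treats '[...]' and '+' the way Trex does, and it collects every match on the line, which may not be how Trex reports them.

```python
import re

text = "The HP95 and the HP100LX"

# Explicit list of digits versus the dash shorthand.
long_form = re.findall(r"[0123456789]+", text)
short_form = re.findall(r"[0-9]+", text)

print(long_form)
print(short_form)
```

Both calls return the same runs of digits, each run as long as possible.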
-
- OK, now suppose we'd like to locate values with a preceding
- dollar sign, like:
-
- $1000 or
- $99.95 etc.
-
- Two problems. First, we can't use the '$' or '.' sign as-is -
- both are special characters, and in this case we don't want
- them to have their special meaning. The '\' sign is the
- solution - it instructs Trex to ignore the special meaning -
- if any - of the following character. So to match the $1000
- line above, you can use:
-
- trex "\$[0-9]+" <trex.doc
-
- OK, so far so good, but what about the $99.95 ? That's the
- second problem: only the '$99' part got output - not what
- we wanted ! We could change our line to:
-
- trex "\$[0-9]+\.[0-9]+" <trex.doc
-
- but now, _only_ '$99.95' will be matched. Not what we want
- either. Here's a possible solution:
-
- trex "\$[0-9]+(\.[0-9]+)?" <trex.doc
-
- This is the first time that we see both the grouping
- parentheses and the '?' character. The question mark says
- 'zero or one occurrence of the previous pattern', and the
- parentheses take care that the 'previous pattern' is what
- we want it to be.
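
The three attempts above can be compared side by side in Python. This is only a sketch under the assumption that '\', '?' and the parentheses work in Trex as they do in Python's `re` module; `first_match` is a hypothetical helper, not a Trex feature.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

whole_only = r"\$[0-9]+"                 # digits after an escaped '$'
with_cents = r"\$[0-9]+\.[0-9]+"         # decimal part is mandatory
optional_cents = r"\$[0-9]+(\.[0-9]+)?"  # decimal part is optional

results = {
    name: (first_match(p, "$1000"), first_match(p, "$99.95"))
    for name, p in [
        ("whole_only", whole_only),
        ("with_cents", with_cents),
        ("optional_cents", optional_cents),
    ]
}

for name, pair in results.items():
    print(name, pair)
```

Only the last pattern returns both amounts in full; the first cuts '$99.95' short and the second misses '$1000' altogether.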
-
- We didn't use the '|' character yet. It is used to separate
- alternatives. To search for HP95 or HP100, for example, you
- could use:
-
- HP95|HP100
-
- or
-
- HP(95|100)
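
A quick Python check that the two spellings accept the same strings, assuming '|' and grouping behave in Trex as they do in Python's `re`. The sample strings and the helper name `found` are made up for the illustration.

```python
import re

flat = r"HP95|HP100"
grouped = r"HP(95|100)"

samples = ["HP95", "HP100LX", "HP200"]

def found(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

flat_hits = [found(flat, s) for s in samples]
grouped_hits = [found(grouped, s) for s in samples]

print(flat_hits)
print(grouped_hits)
```

The grouped form simply factors out the common 'HP' prefix.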
-
- This concludes our little Trex tutorial - the best way
- to learn more about regular expressions is to experiment
- with them using Trex - good luck !
-
-
-