Expression | Matches | Does not match
a | a | b, ...
ABC | ABC | 123, ...
Metacharacters -- escape with backslash
Characters that are used as metacharacters must be preceded by a backslash (\) to be literally matched. These are:
|
|
|
|
|
| |
Escape character -- backslash (\)
The backslash (\) has "escape" functionality. That is, a literal character following a backslash can escape its "literalness" and, in combination with the backslash, attain new functionality (if defined). Conversely, a metacharacter following a backslash escapes its meta-meaning and is literally matched.
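The escape rules above can be tried out with a short sketch. It uses Python's `re` module purely as a stand-in, which is an assumption: SNiFF+ has its own regex engine and may differ in detail (for example, in which backslash sequences are defined). The sample string is made up for illustration.

```python
import re

# Python's `re` is only a stand-in here; SNiFF+ may behave differently in detail.
text = "a.b axb C:\\tmp line1\nline2"

# An unescaped period is a metacharacter: it matches any single character.
unescaped = re.findall(r"a.b", text)

# A backslash removes the meta-meaning, so only a literal period matches.
escaped = re.findall(r"a\.b", text)

# Two backslashes match one literal backslash.
backslashes = re.findall(r"\\", text)

# Backslash plus a literal character can form a defined sequence: \n is a newline.
newlines = re.findall(r"\n", text)

print(unescaped, escaped, backslashes, newlines)
```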
Expression | Matches | Does not match
\ | nothing, because not followed by anything (undefined) | \, ...
\\ | \ (first \ escapes meta-functionality) | \ \, ...
\n | newline (a defined sequence) | n, \, ...
\a | a (because \a not defined) | \, ...
Do not match -- exclamation point (!) (in SNiFF+)
This applies only in filter fields of SNiFF+ tools: An exclamation point at the beginning of a regular expression means "match everything except the following regex".
Note that this is not a metacharacter in the usual sense (does not have to be escaped, unless it is in position one of the regex). Note also that this is a SNiFF+ specific implementation and not usually part of the regular expression syntax.
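Outside SNiFF+, the leading exclamation point can be emulated by hand. The sketch below is hypothetical: `sniff_style_filter` is an illustrative helper, not a SNiFF+ function, and it assumes that a filter keeps an entry when the regex is found anywhere in it.

```python
import re

def sniff_style_filter(expression, names):
    """Hypothetical helper: emulate a filter field that honors a leading '!'."""
    negate = expression.startswith("!")
    pattern = re.compile(expression[1:] if negate else expression)
    # Keep a name when "pattern found" differs from "negation requested".
    return [n for n in names if bool(pattern.search(n)) != negate]

symbols = ["DoCommand", "DoMenuCommand", "Undo", "main"]

matching = sniff_style_filter("^Do", symbols)    # names that start with "Do"
excluded = sniff_style_filter("!^Do", symbols)   # everything except those names

print(matching, excluded)
```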
Single character wild card -- period (.)
Matches any single character, except newline (\n)
Expression | Matches | Does not match
.et | get, Get, set, 2et, ... | got, ...
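The period wild card can be checked with a minimal sketch, again using Python's `re` module as a stand-in for the SNiFF+ engine (an assumption). The word list is illustrative.

```python
import re

pattern = re.compile(r".et")
words = ["get", "Get", "set", "2et", "got"]

# Keep the words in which ".et" is found.
hits = [w for w in words if pattern.search(w)]

# The period does not match a newline, so "\net" yields no match.
newline_hit = pattern.search("\net")

print(hits, newline_hit)
```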
Quantifiers -- how often to match
Zero or more occurrences -- asterisk -- *
Note that the asterisk is not a wild card, but a quantifier. A regex followed by an asterisk (*) matches zero or more occurrences of the regex. A period followed by an asterisk (.*) therefore matches "any character (except newline) occurring any number of times, or not at all".
One or more occurrences -- plus sign -- +
A regex followed by a plus sign (+) matches one or more occurrences of the regex. A period followed by a plus sign (.+) therefore matches "any character (except newline) occurring at least once".
Zero or one occurrence only -- question mark -- ?
A regex followed by a question mark (?) matches zero or one occurrence of the regex. A period followed by a question mark (.?) therefore matches "any character (except newline) occurring once or not at all".
Expression | Matches | Does not match
Do*Command | DCommand, myDoCommand, DoooCommand, ... | DoMenuCommand, abc, ...
Do+Command | DoCommand, myDoooCommands, ... | DCommand, DoMenuCommand, ...
Do?Command | DCommand, DoCommand | (anything else)
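The three quantifiers can be compared side by side with a small sketch. It assumes Python's `re` module as a stand-in for the SNiFF+ engine, and it uses `re.search`, which finds a match anywhere in a name; that is why `myDoCommand` is reported as well. The helper `found` is illustrative only.

```python
import re

names = ["DCommand", "DoCommand", "DoooCommand",
         "myDoCommand", "DoMenuCommand", "abc"]

def found(expression):
    """Return the names in which the expression is found anywhere."""
    return [n for n in names if re.search(expression, n)]

star = found(r"Do*Command")      # zero or more "o"
plus = found(r"Do+Command")      # one or more "o"
optional = found(r"Do?Command")  # zero or one "o"

print(star, plus, optional, sep="\n")
```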
Position -- where to match
Matches can be restricted to their position in words, lines and files.
Beginning or end of word -- \b
\b followed by a regex matches only at the beginning of a word.
\b preceded by a regex matches only at the end of a word.
Not beginning or end of word -- \B
A regex preceded by \B matches everywhere except at the beginning of a word.
A regex followed by \B matches everywhere except at the end of a word.
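The word-position rules can be illustrated by listing where each expression matches in a sample string. This is a sketch under the assumption that Python's `re` module treats \b and \B the same way as the SNiFF+ engine; the sample text is made up.

```python
import re

text = "void avoid voided"

# \b before the regex: match only at the beginning of a word.
word_start = [m.start() for m in re.finditer(r"\bvoid", text)]

# \b after the regex: match only at the end of a word.
word_end = [m.start() for m in re.finditer(r"void\b", text)]

# \B before the regex: match anywhere except at the beginning of a word.
not_word_start = [m.start() for m in re.finditer(r"\Bvoid", text)]

print(word_start, word_end, not_word_start)
```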
Beginning of line -- caret -- (^)
The caret means "match the following regex only if it is at the beginning of a line". Note that the caret has a different meaning (negation) when it is used within Character classes or lists.
End of line -- dollar sign -- ($)
The dollar sign means "match the preceding regex only if it is at the end of a line".
Expression | Matches | Does not match
^void | void (only if void is the first text in the line) | // void, avoid, void preceded by any characters, ...
)$ | foo(a) (only if the ')' is the last character in the text line) | anything where ')' is followed by any characters, e.g. ';'
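Line anchors can be tried on a multi-line string. The sketch assumes Python's `re` module as a stand-in for the SNiFF+ engine, with two differences to keep in mind: Python needs the `re.MULTILINE` flag for ^ and $ to apply to every line, and it requires the closing parenthesis to be escaped as \). The sample source text is illustrative.

```python
import re

source = "void foo(a)\n// void bar()\nint avoid;\nfoo(a);\nfoo(b)"

# ^void: only lines in which "void" is the first text.
lines_starting_with_void = re.findall(r"^void.*", source, flags=re.MULTILINE)

# \)$: only lines in which ")" is the last character.
lines_ending_with_paren = re.findall(r".*\)$", source, flags=re.MULTILINE)

print(lines_starting_with_void)
print(lines_ending_with_paren)
```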
First in file -- \accent grave -- (\`)
The \` means "match the following regex only if it is at the beginning of a file".
Last in file -- \accent acute -- (\')
The \' means "match the preceding regex only if it is at the end of a file".
Expression | Matches
\`.* | the first line in every file (e.g. in the Retriever)
.*\' | the last line in every file (e.g. in the Retriever)
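Python's `re` module does not support the \` and \' operators; its closest analogues are \A (start of the text) and \Z (end of the text). The sketch below uses them as a stand-in, treating a string as the contents of one file. It assumes the text does not end with a newline; otherwise the second expression would match the empty string after the final newline.

```python
import re

file_text = "first line\nmiddle line\nlast line"

# \A plays the role of \` here: anchor at the very beginning of the text.
first_line = re.search(r"\A.*", file_text).group()

# \Z plays the role of \' here: anchor at the very end of the text.
last_line = re.search(r".*\Z", file_text).group()

print(first_line)
print(last_line)
```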
Nonprinting or whitespace characters
Nonprinting characters are represented as follows in SNiFF+ regular expressions:
This is a special character class (see page 628), namely [ \f\n\r\t\v]; the listed items are:
Expression | Matches
[ \t]+$ | all (unnecessary) space and tab characters at the end of lines
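The trailing-whitespace expression can be used both to find and to remove the unnecessary characters. This is a minimal sketch that assumes Python's `re` module as a stand-in for the SNiFF+ engine; `re.MULTILINE` is needed in Python so that $ applies at the end of every line. The sample text is made up.

```python
import re

text = "int a;  \n\tint b;\t \nint c;"

# Find every run of spaces and tabs at the end of a line.
trailing = re.findall(r"[ \t]+$", text, flags=re.MULTILINE)

# Remove those runs; leading tabs and inner spaces are left untouched.
cleaned = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)

print(trailing)
print(repr(cleaned))
```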