Literals and metacharacters

Literals in regular expressions are ordinary characters that are literally matched, that is, they match only themselves. Metacharacters are characters that are not matched literally; they are a kind of shorthand for defined functionalities.

Expression    Matches    Does not match
a             a          b, ...
ABC           ABC        123, ...

Metacharacters -- escape with backslash
Characters that are used as metacharacters must be preceded by a backslash (\) to be literally matched. These are:

Escape character -- backslash (\)
The backslash (\) has "escape" functionality. That is, a literal character following a backslash can escape its "literalness" and, in combination with the backslash, attain new functionality (if defined). Conversely, a metacharacter following a backslash escapes its meta-meaning and is literally matched.
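
The two directions of escaping can be tried out in a small sketch. It uses Python's re module as a stand-in, on the assumption that the period and \b behave there as described in this section; the helper name matches is hypothetical and not part of SNiFF+.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the pattern is found anywhere in text."""
    return re.search(pattern, text) is not None

# Unescaped, the period is a metacharacter and matches any single character.
print(matches(r"a.c", "abc"))     # True
print(matches(r"a.c", "a.c"))     # True

# Escaped, the period loses its meta-meaning and matches only a literal ".".
print(matches(r"a\.c", "abc"))    # False
print(matches(r"a\.c", "a.c"))    # True

# Conversely, the literal "b" gains new functionality after a backslash:
# \b is a word-boundary assertion, not the letter b.
print(matches(r"\bget", "forget"))  # False
print(matches(r"\bget", "get"))     # True
```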

Do not match -- exclamation point (!) (in SNiFF+)
This applies only in filter fields of SNiFF+ tools: An exclamation point at the beginning of a regular expression means "match everything except the following regex".
Note that this is not a metacharacter in the usual sense (does not have to be escaped, unless it is in position one of the regex). Note also that this is a SNiFF+ specific implementation and not usually part of the regular expression syntax.
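
Because the leading exclamation point is specific to SNiFF+ filter fields, other regex engines do not understand it. The following is only a rough emulation in Python: the function filter_names is hypothetical, and it assumes that a filter matches when the regex is found anywhere in a name, which may differ from what the SNiFF+ tools actually do.

```python
import re

def filter_names(filter_expr, names):
    """Hypothetical emulation of a filter field with leading-'!' negation."""
    # A '!' in position one inverts the filter; elsewhere it is an ordinary character.
    negate = filter_expr.startswith("!")
    pattern = re.compile(filter_expr[1:] if negate else filter_expr)
    # Keep a name when "regex found" differs from "negated".
    return [n for n in names if (pattern.search(n) is not None) != negate]

names = ["DoCommand", "DoMenuCommand", "main"]
print(filter_names("Command", names))    # ['DoCommand', 'DoMenuCommand']
print(filter_names("!Command", names))   # ['main']
```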
Single character wild card -- period (.)
Matches any single character, except newline (\n)

Expression    Matches                   Does not match
.et           get, Get, set, 2et ...    got ...
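
The table row can be checked with a short sketch, again using Python's re module as a stand-in for the SNiFF+ engine. The word list is illustrative, and fullmatch is used so that each word is tested as a whole.

```python
import re

words = ["get", "Get", "set", "2et", "got", "et"]

# ".et" needs exactly one character (any except newline) before "et".
hits = [w for w in words if re.fullmatch(r".et", w)]
print(hits)  # ['get', 'Get', 'set', '2et']

# The period does not match a newline, so "\net" is rejected.
newline_case = re.fullmatch(r".et", "\net")
print(newline_case)  # None
```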

Quantifiers -- how often to match
Zero or more occurrences -- asterisk -- *
Note that the asterisk is not a wild card, but a quantifier. A regex followed by an asterisk (*) matches zero or more occurrences of the regex. A period followed by an asterisk (.*) therefore matches "any character (except newline) occurring any number of times, or not at all".
One or more occurrences -- plus sign -- +
A regex followed by a plus sign (+) matches one or more occurrences of the regex. A period followed by a plus sign (.+) therefore matches "any character (except newline) occurring at least once".
Zero or one occurrence only -- question mark -- ?
A regex followed by a question mark (?) matches zero or one occurrence only of the regular expression. A period followed by a question mark (.?) therefore matches "any character (except newline) occurring only once or not at all".

Expression    Matches                                   Does not match
Do*Command    DCommand, myDoCommand, DoooCommand ...    DoMenuCommand, abc ...
Do+Command    DoCommand, myDoooCommands ...             DCommand, DoMenuCommand ...
Do?Command    DCommand, DoCommand                       (anything else)
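
The three quantifiers can be compared side by side in a small sketch. It assumes Python's re module as a stand-in and searches for the pattern anywhere in each sample; the helper hits and the sample list are illustrative only.

```python
import re

samples = ["DCommand", "DoCommand", "DoooCommand", "DoMenuCommand"]

def hits(pattern):
    """Hypothetical helper: the samples in which the pattern is found."""
    return [s for s in samples if re.search(pattern, s)]

print(hits(r"Do*Command"))  # zero or more "o": ['DCommand', 'DoCommand', 'DoooCommand']
print(hits(r"Do+Command"))  # one or more "o":  ['DoCommand', 'DoooCommand']
print(hits(r"Do?Command"))  # zero or one "o":  ['DCommand', 'DoCommand']
```

In each case the quantifier applies only to the "o" directly in front of it, not to the whole of "Do".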

Position -- where to match
Matches can be restricted by their position in words, lines, and files.
Beginning or end of word -- \b
\b followed by a regex matches only at the beginning of a word.
\b preceded by a regex matches only at the end of a word.
Not beginning or end of word -- \B
A regex preceded by \B matches everywhere except at the beginning of a word.
A regex followed by \B matches everywhere except at the end of a word.

Expression     Matches               Does not match
\bCommand      Command, Commander    DoCommand ...
Command\b      Command, DoCommand    Commander ...
\bCommand\b    Command (only)        (anything else)
get\B          getDate, forgetful    get, forget ...
\Bget          forget, forgetful     get, getDate ...
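
The rows above can be reproduced with a short sketch, assuming that \b and \B behave in Python's re module as they are described here. The helper found is hypothetical.

```python
import re

def found(pattern, text):
    """Hypothetical helper: True if the pattern occurs anywhere in text."""
    return re.search(pattern, text) is not None

# \b before the regex: match only at the beginning of a word.
print(found(r"\bCommand", "Commander"))   # True
print(found(r"\bCommand", "DoCommand"))   # False

# \b after the regex: match only at the end of a word.
print(found(r"Command\b", "DoCommand"))   # True
print(found(r"Command\b", "Commander"))   # False

# \B after the regex: match anywhere except at the end of a word.
print(found(r"get\B", "getDate"))         # True
print(found(r"get\B", "forget"))          # False

# \B before the regex: match anywhere except at the beginning of a word.
print(found(r"\Bget", "forgetful"))       # True
print(found(r"\Bget", "getDate"))         # False
```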

Beginning of line -- caret -- (^)
The caret means "match the following regex only if it is at the beginning of a line". Note that the caret has a different meaning (negation) when it is used within Character classes or lists.
End of line -- dollar sign -- ($)
The dollar sign means "match the preceding regex only if it is at the end of a line".

Expression    Matches                                                              Does not match
^void         void (only if void is the first text in the line)                    // void, avoid, void preceded by any characters ...
)$            foo(a) (only if the ')' is the last character in the text line)      anything where ')' is followed by any characters, e.g. ';'
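
A sketch of the line anchors follows, using Python's re module as a stand-in. Two points are assumptions about Python rather than about SNiFF+: the re.MULTILINE flag is needed so that ^ and $ apply to every line of a multi-line text, and the closing parenthesis has to be written as \) because Python treats unescaped parentheses as grouping. The sample text is illustrative.

```python
import re

source = "void foo(a)\n// void bar(b);\nint avoid(c)"

# ^void: only lines whose very first text is "void".
starts = re.findall(r"^void.*", source, flags=re.MULTILINE)
print(starts)  # ['void foo(a)']

# \)$: only lines whose last character is ")".
ends = re.findall(r".*\)$", source, flags=re.MULTILINE)
print(ends)    # ['void foo(a)', 'int avoid(c)']
```

The second line of the sample is rejected by both patterns: it starts with "//" rather than "void", and its ")" is followed by ";".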

First in file -- \accent grave -- (\`)
The \` means "match the following regex only if it is at the beginning of a file".
Last in file -- \accent acute -- (\')
The \' means "match the preceding regex only if it is at the end of a file".

Expression    Matches
\`.*          the first line in every file (e.g. in the Retriever)
.*\'          the last line in every file (e.g. in the Retriever)
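
Python's re module does not accept the backslash-accent anchors shown here. As a rough analogue only, the sketch below uses Python's \A (start of the text) and \Z (end of the text), treating one string as the contents of one file. The sample text is illustrative and has no trailing newline.

```python
import re

file_text = "first line\nmiddle line\nlast line"

# \A anchors at the very start of the text, so .* yields the first line.
first = re.search(r"\A.*", file_text).group()
print(first)  # first line

# \Z anchors at the very end of the text, so .* yields the last line.
last = re.search(r".*\Z", file_text).group()
print(last)   # last line
```

Unlike ^ and $, these anchors match once per text rather than once per line, which is why only one line is returned in each case.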

Nonprinting or whitespace characters
Nonprinting characters are represented as follows in SNiFF+ regular expressions: