- <TITLE>Module Contents -- Python library reference</TITLE>
- <H2>4.2.2. Module Contents</H2>
- The module defines these functions, an exception, and a data item:
- <P>
- <DL><DT><B>match</B> (<VAR>pattern</VAR>, <VAR>string</VAR>) -- function of module regex<DD>
- Return how many characters at the beginning of <VAR>string</VAR> match
- the regular expression <VAR>pattern</VAR>. Return <CODE>-1</CODE> if the
- string does not match the pattern (this is different from a
- zero-length match!).
- </DL>
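The difference between a failed match and a zero-length match can be shown with a short sketch. The legacy <CODE>regex</CODE> module cannot be imported in current Python releases, so this example rests on assumptions: it uses the standard <CODE>re</CODE> module (whose pattern syntax differs) and a hypothetical helper, <CODE>match_length</CODE>, that mimics the return convention described above.

```python
import re

def match_length(pattern, string):
    """Hypothetical helper: length of the match at the start of string, or -1."""
    m = re.match(pattern, string)
    return -1 if m is None else m.end()

print(match_length("a*", "aaab"))  # 3: three characters match
print(match_length("a*", "bbb"))   # 0: a zero-length match still succeeds
print(match_length("a+", "bbb"))   # -1: no match at all
```

A caller therefore has to test for <CODE>-1</CODE> explicitly rather than treating the result as a truth value, because <CODE>0</CODE> is a successful result.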
- <DL><DT><B>search</B> (<VAR>pattern</VAR>, <VAR>string</VAR>) -- function of module regex<DD>
- Return the first position in <VAR>string</VAR> that matches the regular
- expression <VAR>pattern</VAR>. Return <CODE>-1</CODE> if no position in the string
- matches the pattern (this is different from a zero-length match
- anywhere!).
- </DL>
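The same return convention applies to searching. The following is only a sketch under stated assumptions: it uses the modern <CODE>re</CODE> module rather than <CODE>regex</CODE>, and <CODE>search_position</CODE> is a hypothetical helper that reproduces the position-or-<CODE>-1</CODE> behaviour described above.

```python
import re

def search_position(pattern, string):
    """Hypothetical helper: index of the first match in string, or -1."""
    m = re.search(pattern, string)
    return -1 if m is None else m.start()

print(search_position("b+", "aabbb"))  # 2: first 'b' is at index 2
print(search_position("x*", "abc"))    # 0: zero-length match at the start
print(search_position("x+", "abc"))    # -1: pattern occurs nowhere
```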
- <DL><DT><B>compile</B> (<VAR>pattern</VAR>[, <VAR>translate</VAR>]) -- function of module regex<DD>
- Compile a regular expression pattern into a regular expression
- object, which can be used for matching using its <CODE>match</CODE> and
- <CODE>search</CODE> methods, described below. The optional argument
- <VAR>translate</VAR>, if present, must be a 256-character string
- indicating how characters (both of the pattern and of the strings to
- be matched) are translated before comparing them; the <CODE>i</CODE>-th
- element of the string gives the translation for the character with
- ASCII code <CODE>i</CODE>. This can be used to implement
- case-insensitive matching; see the <CODE>casefold</CODE> data item below.
- <P>
- The sequence
- <P>
- <UL COMPACT><CODE>prog = regex.compile(pat)<P>
- result = prog.match(str)<P>
- </CODE></UL>
- is equivalent to
- <P>
- <UL COMPACT><CODE>result = regex.match(pat, str)<P>
- </CODE></UL>
- but the version using <CODE>compile()</CODE> is more efficient when multiple
- regular expressions are used concurrently in a single program. (The
- compiled version of the last pattern passed to <CODE>regex.match()</CODE> or
- <CODE>regex.search()</CODE> is cached, so programs that use only a single
- regular expression at a time needn't worry about compiling regular
- expressions.)
- </DL>
- <DL><DT><B>set_syntax</B> (<VAR>flags</VAR>) -- function of module regex<DD>
- Set the syntax to be used by future calls to <CODE>compile</CODE>,
- <CODE>match</CODE> and <CODE>search</CODE>. (Already compiled expression objects
- are not affected.) The argument is an integer that is the bitwise OR of
- several flag bits. The return value is the previous value of
- the syntax flags. Names for the flags are defined in the standard
- module <CODE>regex_syntax</CODE>; read the file <FILE>regex_syntax.py</FILE> for
- more information.
- </DL>
- <DL><DT><B>symcomp</B> (<VAR>pattern</VAR>[, <VAR>translate</VAR>]) -- function of module regex<DD>
- This is like <CODE>compile</CODE>, but supports symbolic group names: if a
- parenthesis-enclosed group begins with a group name in angle
- brackets, e.g. <CODE>'\(&lt;id&gt;[a-z][a-z0-9]*\)'</CODE>, the group can
- be referenced by its name in arguments to the <CODE>group</CODE> method of
- the resulting compiled regular expression object, like this:
- <CODE>p.group('id')</CODE>. Group names may contain alphanumeric characters
- and <CODE>'_'</CODE> only.
- </DL>
- <DL><DT><B>error</B> -- exception of module regex<DD>
- Exception raised when a string passed to one of the functions here
- is not a valid regular expression (e.g., unmatched parentheses) or
- when some other error occurs during compilation or matching. (It is
- never an error if a string contains no match for a pattern.)
- </DL>
- <DL><DT><B>casefold</B> -- data of module regex<DD>
- A string suitable to pass as <VAR>translate</VAR> argument to
- <CODE>compile</CODE> to map all upper case characters to their lowercase
- equivalents.
- </DL>
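To make the shape of a <VAR>translate</VAR> table concrete, the sketch below builds a 256-character string whose <CODE>i</CODE>-th element is the translation of the character with code <CODE>i</CODE>. The names <CODE>identity</CODE>, <CODE>casefold_like</CODE> and <CODE>translate</CODE> are hypothetical, and the table is only assumed to resemble <CODE>regex.casefold</CODE>; it folds ASCII letters and leaves every other character unchanged.

```python
import string

# Identity table: element i is the character with code i.
identity = "".join(chr(i) for i in range(256))

# Assumed stand-in for regex.casefold: ASCII upper case maps to lower case.
casefold_like = "".join(
    c.lower() if c in string.ascii_uppercase else c for c in identity
)

def translate(text, table):
    """Apply a 256-character table to text (characters must have codes below 256)."""
    return "".join(table[ord(c)] for c in text)

print(len(casefold_like))                        # 256
print(casefold_like[ord("A")])                   # a
print(translate("Hello, World", casefold_like))  # hello, world
```

Translating both the pattern and the subject string through the same table before comparison is what makes the match case-insensitive.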
- Compiled regular expression objects support these methods:
- <P>
- <DL><DT><B>match</B> (<VAR>string</VAR>[, <VAR>pos</VAR>]) -- Method on regex<DD>
- Return how many characters at the beginning of <VAR>string</VAR> match
- the compiled regular expression. Return <CODE>-1</CODE> if the string
- does not match the pattern (this is different from a zero-length
- match!).
- <P>
- The optional second parameter <VAR>pos</VAR> gives an index in the string
- where the search is to start; it defaults to <CODE>0</CODE>. This is not
- completely equivalent to slicing the string; the <CODE>'^'</CODE> pattern
- character matches at the real beginning of the string and at positions
- just after a newline, not necessarily at the index where the search
- is to start.
- </DL>
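The difference between passing <VAR>pos</VAR> and slicing the string can be illustrated as follows. This is a sketch using the modern <CODE>re</CODE> module, not <CODE>regex</CODE>; the <CODE>re.MULTILINE</CODE> flag is an assumption added here so that <CODE>'^'</CODE> also matches just after a newline, as the text above describes.

```python
import re

# re.MULTILINE lets '^' match after a newline as well as at the string start.
prog = re.compile("^b", re.MULTILINE)

text = "ab\nb"
print(prog.match(text, 1) is None)   # True: index 1 is not the start of a line
print(prog.match(text[1:]) is None)  # False: the slice is a new string start
print(prog.match(text, 3) is None)   # False: index 3 follows a newline
```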
- <DL><DT><B>search</B> (<VAR>string</VAR>[, <VAR>pos</VAR>]) -- Method on regex<DD>
- Return the first position in <VAR>string</VAR> that matches the compiled
- regular expression. Return <CODE>-1</CODE> if no position in the
- string matches the pattern (this is different from a zero-length
- match anywhere!).
- <P>
- The optional second parameter has the same meaning as for the
- <CODE>match</CODE> method.
- </DL>
- <DL><DT><B>group</B> (<VAR>index</VAR>, <VAR>index</VAR>, ...) -- Method on regex<DD>
- This method is only valid when the last call to the <CODE>match</CODE>
- or <CODE>search</CODE> method found a match. It returns one or more
- groups of the match. If there is a single <VAR>index</VAR> argument,
- the result is a single string; if there are multiple arguments, the
- result is a tuple with one item per argument. If the <VAR>index</VAR> is
- zero, the corresponding return value is the entire matching string; if
- it is in the inclusive range [1..99], it is the string matching the
- corresponding parenthesized group (using the default syntax,
- groups are parenthesized using <CODE>\(</CODE> and <CODE>\)</CODE>). If no
- such group exists, the corresponding result is <CODE>None</CODE>.
- <P>
- If the regular expression was compiled by <CODE>symcomp</CODE> instead of
- <CODE>compile</CODE>, the <VAR>index</VAR> arguments may also be strings
- identifying groups by their group name.
- </DL>
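The single-index, multiple-index and named forms of <CODE>group</CODE> can be sketched with the modern <CODE>re</CODE> module. Two differences from the description above are assumptions of this example: in <CODE>re</CODE> the <CODE>group</CODE> method lives on the match object rather than on the compiled expression, and named groups are written <CODE>(?P&lt;name&gt;...)</CODE> instead of the <CODE>symcomp</CODE> notation.

```python
import re

prog = re.compile(r"(?P<id>[a-z][a-z0-9]*)=([0-9]+)")
m = prog.match("x1=42")

print(m.group(0))      # x1=42: index 0 is the entire match
print(m.group(1, 2))   # ('x1', '42'): several indices give a tuple
print(m.group("id"))   # x1: a group referenced by its symbolic name
```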
- Compiled regular expressions support these data attributes:
- <P>
- <DL><DT><B>regs</B> -- attribute of regex<DD>
- When the last call to the <CODE>match</CODE> or <CODE>search</CODE> method found a
- match, this is a tuple of pairs of indices corresponding to the
- beginning and end of all parenthesized groups in the pattern. Indices
- are relative to the string argument passed to <CODE>match</CODE> or
- <CODE>search</CODE>. The 0-th pair gives the beginning and end of the
- whole match. When the last match or search failed, this is
- <CODE>None</CODE>.
- </DL>
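A tuple with the layout described for <CODE>regs</CODE> can be reconstructed as a sketch with the modern <CODE>re</CODE> module. The name <CODE>regs_like</CODE> is hypothetical, and the pairs are assembled from the match object's <CODE>span</CODE> method rather than read from the compiled expression.

```python
import re

prog = re.compile(r"([a-z]+) ([0-9]+)")
m = prog.search("id: abc 123")

# One (begin, end) pair per group; pair 0 covers the whole match.
regs_like = tuple(m.span(i) for i in range(prog.groups + 1))
print(regs_like)  # ((4, 11), (4, 7), (8, 11))
```

Each pair can be used directly as slice bounds on the original string to recover the text of the corresponding group.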
- <DL><DT><B>last</B> -- attribute of regex<DD>
- When the last call to the <CODE>match</CODE> or <CODE>search</CODE> method found a
- match, this is the string argument passed to that method. When the
- last match or search failed, this is <CODE>None</CODE>.
- </DL>
- <DL><DT><B>translate</B> -- attribute of regex<DD>
- This is the value of the <VAR>translate</VAR> argument to
- <CODE>regex.compile</CODE> that created this regular expression object. If
- the <VAR>translate</VAR> argument was omitted in the <CODE>regex.compile</CODE>
- call, this is <CODE>None</CODE>.
- </DL>
- <DL><DT><B>givenpat</B> -- attribute of regex<DD>
- The regular expression pattern as passed to <CODE>compile</CODE> or
- <CODE>symcomp</CODE>.
- </DL>
- <DL><DT><B>realpat</B> -- attribute of regex<DD>
- The regular expression after stripping the group names for regular
- expressions compiled with <CODE>symcomp</CODE>. Same as <CODE>givenpat</CODE>
- otherwise.
- </DL>
- <DL><DT><B>groupindex</B> -- attribute of regex<DD>
- A dictionary giving the mapping from symbolic group names to numerical
- group indices for regular expressions compiled with <CODE>symcomp</CODE>.
- <CODE>None</CODE> otherwise.
- </DL>
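The name-to-index mapping can be illustrated with the modern <CODE>re</CODE> module, whose compiled patterns happen to expose an attribute that is also called <CODE>groupindex</CODE>. This is a sketch only: the <CODE>(?P&lt;name&gt;...)</CODE> group syntax belongs to <CODE>re</CODE>, not to <CODE>symcomp</CODE>, and the group names <CODE>key</CODE> and <CODE>value</CODE> are chosen purely for illustration.

```python
import re

prog = re.compile(r"(?P<key>[a-z]+)=(?P<value>[0-9]+)")

# Copy the mapping into a plain dictionary: symbolic name -> numerical index.
index = dict(prog.groupindex)
print(index)  # {'key': 1, 'value': 2}

m = prog.match("n=7")
print(m.group(index["value"]) == m.group("value"))  # True
```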
-