GNU Info File | 1996-11-14 | 48.8 KB | 1,252 lines
This is Info file pylibi, produced by Makeinfo-1.55 from the input file
lib.texi.
This file describes the built-in types, exceptions and functions and the
standard modules that come with the Python system. It assumes basic
knowledge about the Python language. For an informal introduction to
the language, see the Python Tutorial. The Python Reference Manual
gives a more formal definition of the language. (These manuals are not
yet available in INFO or Texinfo format.)
Copyright 1991-1995 by Stichting Mathematisch Centrum, Amsterdam, The
Netherlands.
All Rights Reserved
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the names of Stichting Mathematisch
Centrum or CWI or Corporation for National Research Initiatives or CNRI
not be used in advertising or publicity pertaining to distribution of
the software without specific, written prior permission.
While CWI is the initial source for this software, a modified version
is made available by the Corporation for National Research Initiatives
(CNRI) at the Internet address ftp://ftp.python.org.
STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH
CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
File: pylibi, Node: marshal, Next: imp, Prev: copy, Up: Python Services
Built-in Module `marshal'
=========================
This module contains functions that can read and write Python values in
a binary format. The format is specific to Python, but independent of
machine architecture issues (e.g., you can write a Python value to a
file on a PC, transport the file to a Sun, and read it back there).
Details of the format are undocumented on purpose; it may change
between Python versions (although it rarely does).(1)
This is not a general "persistency" module. For general persistency
and transfer of Python objects through RPC calls, see the modules
`pickle' and `shelve'. The `marshal' module exists mainly to support
reading and writing the "pseudo-compiled" code for Python modules stored
in `.pyc' files.
Not all Python object types are supported; in general, only objects
whose value is independent from a particular invocation of Python can
be written and read by this module. The following types are supported:
`None', integers, long integers, floating point numbers, strings,
tuples, lists, dictionaries, and code objects, where it should be
understood that tuples, lists and dictionaries are only supported as
long as the values contained therein are themselves supported; and
recursive lists and dictionaries should not be written (they will cause
infinite loops).
Caveat: On machines where C's `long int' type has more than 32 bits
(such as the DEC Alpha), it is possible to create plain Python integers
that are longer than 32 bits. Since the current `marshal' module uses
32 bits to transfer plain Python integers, such values are silently
truncated. This particularly affects the use of very long integer
literals in Python modules -- these will be accepted by the parser on
such machines, but will be silently truncated when the module is read
from the `.pyc' instead.(2)
There are functions that read/write files as well as functions
operating on strings.
The module defines these functions:
- function of module marshal: dump (VALUE, FILE)
Write the value on the open file. The value must be a supported
type. The file must be an open file object such as `sys.stdout'
or returned by `open()' or `posix.popen()'.
If the value has (or contains an object that has) an unsupported
type, a `ValueError' exception is raised - but garbage data will
also be written to the file. The object will not be properly read
back by `load()'.
- function of module marshal: load (FILE)
Read one value from the open file and return it. If no valid value
is read, raise `EOFError', `ValueError' or `TypeError'. The file
must be an open file object.
Warning: If an object containing an unsupported type was marshalled
with `dump()', `load()' will substitute `None' for the
unmarshallable type.
- function of module marshal: dumps (VALUE)
Return the string that would be written to a file by `dump(value,
file)'. The value must be a supported type. Raise a `ValueError'
exception if value has (or contains an object that has) an
unsupported type.
- function of module marshal: loads (STRING)
Convert the string to a value. If no valid value is found, raise
`EOFError', `ValueError' or `TypeError'. Extra characters in the
string are ignored.
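As a rough illustration of the string-based functions, the following
sketch round-trips a value through `dumps()' and `loads()'. It assumes a
current Python 3 interpreter, where the marshalled data is a bytes object
rather than a string; the sample value and variable names are made up for
the example.

```python
import marshal

# A sample value built only from supported types (contents are made up).
value = {"name": "spam", "sizes": [1, 2, 3], "ratio": 0.5, "nothing": None}

data = marshal.dumps(value)      # the data dump(value, file) would write
restored = marshal.loads(data)   # convert the data back to a value

# Extra characters after a complete value are ignored by loads().
restored_with_tail = marshal.loads(data + b"trailing junk")

# An unsupported type (here a plain object instance) raises ValueError.
try:
    marshal.dumps(object())
    unsupported_rejected = False
except ValueError:
    unsupported_rejected = True
```

Since the format may change between Python versions, data like this is
best read back by the same version that wrote it.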
---------- Footnotes ----------
(1) The name of this module stems from a bit of terminology used by
the designers of Modula-3 (amongst others), who use the term
"marshalling" for shipping of data around in a self-contained form.
Strictly speaking, "to marshal" means to convert some data from
internal to external form (in an RPC buffer for instance) and
"unmarshalling" for the reverse process.
(2) A solution would be to refuse such literals in the parser, since
they are inherently non-portable. Another solution would be to let the
`marshal' module raise an exception when an integer value would be
truncated. At least one of these solutions will be implemented in a
future version.
File: pylibi, Node: imp, Next: __builtin__, Prev: marshal, Up: Python Services
Built-in Module `imp'
=====================
This module provides an interface to the mechanisms used to implement
the `import' statement. It defines the following constants and
functions:
- function of module imp: get_magic ()
Return the magic string value used to recognize byte-compiled code
files ("`.pyc' files").
- function of module imp: get_suffixes ()
Return a list of triples, each describing a particular type of
file. Each triple has the form `(SUFFIX, MODE, TYPE)', where
SUFFIX is a string to be appended to the module name to form the
filename to search for, MODE is the mode string to pass to the
built-in `open' function to open the file (this can be `'r'' for
text files or `'rb'' for binary files), and TYPE is the file type,
which has one of the values `PY_SOURCE', `PY_COMPILED' or
`C_EXTENSION', defined below. (System-dependent values may also
be returned.)
- function of module imp: find_module (NAME, [PATH])
Try to find the module NAME on the search path PATH. The default
PATH is `sys.path'. The return value is a triple `(FILE,
PATHNAME, DESCRIPTION)' where FILE is an open file object
positioned at the beginning, PATHNAME is the pathname of the file
found, and DESCRIPTION is a triple as contained in the list
returned by `get_suffixes' describing the kind of file found.
- function of module imp: init_builtin (NAME)
Initialize the built-in module called NAME and return its module
object. If the module was already initialized, it will be
initialized *again*. A few modules cannot be initialized twice --
attempting to initialize these again will raise an `ImportError'
exception. If there is no built-in module called NAME, `None' is
returned.
- function of module imp: init_frozen (NAME)
Initialize the frozen module called NAME and return its module
object. If the module was already initialized, it will be
initialized *again*. If there is no frozen module called NAME,
`None' is returned. (Frozen modules are modules written in Python
whose compiled byte-code object is incorporated into a
custom-built Python interpreter by Python's `freeze' utility. See
`Tools/freeze' for now.)
- function of module imp: is_builtin (NAME)
Return `1' if there is a built-in module called NAME which can be
initialized again. Return `-1' if there is a built-in module
called NAME which cannot be initialized again (see
`init_builtin'). Return `0' if there is no built-in module called
NAME.
- function of module imp: is_frozen (NAME)
Return `1' if there is a frozen module (see `init_frozen') called
NAME, `0' if there is no such module.
- function of module imp: load_compiled (NAME, PATHNAME, FILE)
Load and initialize a module implemented as a byte-compiled code
file and return its module object. If the module was already
initialized, it will be initialized *again*. The NAME argument is
used to create or access a module object. The PATHNAME argument
points to the byte-compiled code file. The FILE argument is the
byte-compiled code file, open for reading in binary mode, from the
beginning. It must currently be a real file object, not a
user-defined class emulating a file.
- function of module imp: load_dynamic (NAME, PATHNAME, [FILE])
Load and initialize a module implemented as a dynamically loadable
shared library and return its module object. If the module was
already initialized, it will be initialized *again*. Some modules
don't like that and may raise an exception. The PATHNAME argument
must point to the shared library. The NAME argument is used to
construct the name of the initialization function: an external C
function called `initNAME()' in the shared library is called. The
optional FILE argument is ignored. (Note: using shared libraries
is highly system dependent, and not all systems support it.)
- function of module imp: load_source (NAME, PATHNAME, FILE)
Load and initialize a module implemented as a Python source file
and return its module object. If the module was already
initialized, it will be initialized *again*. The NAME argument is
used to create or access a module object. The PATHNAME argument
points to the source file. The FILE argument is the source file,
open for reading as text, from the beginning. It must currently
be a real file object, not a user-defined class emulating a file.
Note that if a properly matching byte-compiled file (with suffix
`.pyc') exists, it will be used instead of parsing the given
source file.
- function of module imp: new_module (NAME)
Return a new empty module object called NAME. This object is
*not* inserted in `sys.modules'.
The following constants with integer values, defined in the module, are
used to indicate the search result of `imp.find_module'.
- data of module imp: SEARCH_ERROR
The module was not found.
- data of module imp: PY_SOURCE
The module was found as a source file.
- data of module imp: PY_COMPILED
The module was found as a compiled code object file.
- data of module imp: C_EXTENSION
The module was found as a dynamically loadable shared library.
* Menu:
* Examples::
File: pylibi, Node: Examples, Prev: imp, Up: imp
Examples
--------
The following function emulates the default import statement:
import imp
import sys

def __import__(name, globals=None, locals=None, fromlist=None):
    # Fast path: see if the module has already been imported.
    if sys.modules.has_key(name):
        return sys.modules[name]

    # If any of the following calls raises an exception,
    # there's a problem we can't handle -- let the caller handle it.

    # See if it's a built-in module.
    m = imp.init_builtin(name)
    if m:
        return m

    # See if it's a frozen module.
    m = imp.init_frozen(name)
    if m:
        return m

    # Search the default path (i.e. sys.path).
    fp, pathname, (suffix, mode, type) = imp.find_module(name)

    # See what we got.
    try:
        if type == imp.C_EXTENSION:
            return imp.load_dynamic(name, pathname)
        if type == imp.PY_SOURCE:
            return imp.load_source(name, pathname, fp)
        if type == imp.PY_COMPILED:
            return imp.load_compiled(name, pathname, fp)

        # Shouldn't get here at all.
        raise ImportError, '%s: unknown module type (%d)' % (name, type)
    finally:
        # Since we may exit via an exception, close fp explicitly.
        fp.close()
File: pylibi, Node: __builtin__, Next: __main__, Prev: imp, Up: Python Services
Built-in Module `__builtin__'
=============================
This module provides direct access to all `built-in' identifiers of
Python; e.g. `__builtin__.open' is the full name for the built-in
function `open'. See the section on Built-in Functions in the previous
chapter.
File: pylibi, Node: __main__, Prev: __builtin__, Up: Python Services
Built-in Module `__main__'
==========================
This module represents the (otherwise anonymous) scope in which the
interpreter's main program executes -- commands read either from
standard input or from a script file.
File: pylibi, Node: String Services, Next: Miscellaneous Services, Prev: Python Services, Up: Top
String Services
***************
The modules described in this chapter provide a wide range of string
manipulation operations. Here's an overview:
string
-- Common string operations.
regex
-- Regular expression search and match operations.
regsub
-- Substitution and splitting operations that use regular
expressions.
struct
-- Interpret strings as packed binary data.
* Menu:
* string::
* regex::
* regsub::
* struct::
File: pylibi, Node: string, Next: regex, Prev: String Services, Up: String Services
Standard Module `string'
========================
This module defines some constants useful for checking character
classes and some useful string functions. See the modules `regex' and
`regsub' for string functions based on regular expressions.
The constants defined in this module are:
- data of module string: digits
The string `'0123456789''.
- data of module string: hexdigits
The string `'0123456789abcdefABCDEF''.
- data of module string: letters
The concatenation of the strings `lowercase' and `uppercase'
described below.
- data of module string: lowercase
A string containing all the characters that are considered
lowercase letters. On most systems this is the string
`'abcdefghijklmnopqrstuvwxyz''. Do not change its definition --
the effect on the routines `upper' and `swapcase' is undefined.
- data of module string: octdigits
The string `'01234567''.
- data of module string: uppercase
A string containing all the characters that are considered
uppercase letters. On most systems this is the string
`'ABCDEFGHIJKLMNOPQRSTUVWXYZ''. Do not change its definition --
the effect on the routines `lower' and `swapcase' is undefined.
- data of module string: whitespace
A string containing all characters that are considered whitespace.
On most systems this includes the characters space, tab, linefeed,
return, formfeed, and vertical tab. Do not change its definition
-- the effect on the routines `strip' and `split' is undefined.
The functions defined in this module are:
- function of module string: atof (S)
Convert a string to a floating point number. The string must have
the standard syntax for a floating point literal in Python,
optionally preceded by a sign (`+' or `-').
- function of module string: atoi (S[, BASE])
Convert string S to an integer in the given BASE. The string must
consist of one or more digits, optionally preceded by a sign (`+'
or `-'). The BASE defaults to 10. If it is 0, a default base is
chosen depending on the leading characters of the string (after
stripping the sign): `0x' or `0X' means 16, `0' means 8, anything
else means 10. If BASE is 16, a leading `0x' or `0X' is always
accepted. (Note: for a more flexible interpretation of numeric
literals, use the built-in function `eval()'.)
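The base-selection rules for `atoi()' can be sketched in a few lines.
The function below is a hypothetical re-implementation written for a
current Python 3 interpreter (its name, `atoi_sketch', is not part of
the module); it leans on the built-in `int()' for the digit conversion,
so it is somewhat more lenient than the description above (for example
about surrounding whitespace).

```python
def atoi_sketch(s, base=10):
    """Hypothetical sketch of the conversion rules described for atoi()."""
    sign = 1
    digits = s
    # Strip an optional leading sign first.
    if digits[:1] in ("+", "-"):
        if digits[0] == "-":
            sign = -1
        digits = digits[1:]
    # Base 0: choose the base from the leading characters.
    if base == 0:
        if digits[:2] in ("0x", "0X"):
            base = 16
        elif digits[:1] == "0":
            base = 8
        else:
            base = 10
    # For base 16 a leading 0x or 0X is always accepted.
    if base == 16 and digits[:2] in ("0x", "0X"):
        digits = digits[2:]
    if not digits:
        raise ValueError("no digits in %r" % (s,))
    return sign * int(digits, base)


decimal_value = atoi_sketch("42")
hex_guess = atoi_sketch("-0x1f", 0)
octal_guess = atoi_sketch("017", 0)
explicit_hex = atoi_sketch("0xff", 16)
```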
- function of module string: atol (S[, BASE])
Convert string S to a long integer in the given BASE. The string
must consist of one or more digits, optionally preceded by a sign
(`+' or `-'). The BASE argument has the same meaning as for
`atoi()'. A trailing `l' or `L' is not allowed, except if the
base is 0.
- function of module string: capitalize (WORD)
Capitalize the first character of the argument.
- function of module string: capwords (S)
Split the argument into words using `split', capitalize each word
using `capitalize', and join the capitalized words using `join'.
Note that this replaces runs of whitespace characters by a single
space. (See also `regsub.capwords()' for a version that doesn't
change the delimiters, and lets you specify a word separator.)
- function of module string: expandtabs (S, TABSIZE)
Expand tabs in a string, i.e. replace them by one or more spaces,
depending on the current column and the given tab size. The column
number is reset to zero after each newline occurring in the string.
This doesn't understand other non-printing characters or escape
sequences.
- function of module string: find (S, SUB[, START])
Return the lowest index in S not smaller than START where the
substring SUB is found. Return `-1' when SUB does not occur as a
substring of S with index at least START. If START is omitted, it
defaults to `0'. If START is negative, `len(S)' is added.
- function of module string: rfind (S, SUB[, START])
Like `find' but find the highest index.
- function of module string: index (S, SUB[, START])
Like `find' but raise `ValueError' when the substring is not found.
- function of module string: rindex (S, SUB[, START])
Like `rfind' but raise `ValueError' when the substring is not
found.
- function of module string: count (S, SUB[, START])
Return the number of (non-overlapping) occurrences of substring
SUB in string S with index at least START. If START is omitted,
it defaults to `0'. If START is negative, `len(S)' is added.
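The searching functions can be tried out as follows. This sketch assumes
a current Python 3 interpreter, where the same operations are spelled as
string methods (`s.find(sub, start)' in place of `string.find(s, sub,
start)'); the sample string is made up.

```python
s = "hello world"

first = s.find("o")            # lowest index where "o" occurs
from_five = s.find("o", 5)     # lowest index not smaller than 5
negative = s.find("o", -4)     # negative start: len(s) is added first
last = s.rfind("o")            # highest index
missing = s.find("z")          # -1 when the substring does not occur

occurrences = s.count("o")
non_overlapping = "aaaa".count("aa")   # occurrences may not overlap

# index() behaves like find() but raises ValueError instead of returning -1.
try:
    s.index("z")
    index_raised = False
except ValueError:
    index_raised = True
```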
- function of module string: lower (S)
Convert letters to lower case.
- function of module string: maketrans (FROM, TO)
Return a translation table suitable for passing to
`string.translate' or `regex.compile', that will map each
character in FROM into the character at the same position in TO;
FROM and TO must have the same length.
- function of module string: split (S[, SEP[, MAXSPLIT]])
Return a list of the words of the string S. If the optional
second argument SEP is absent or `None', the words are separated
by arbitrary strings of whitespace characters (space, tab,
newline, return, formfeed). If the second argument SEP is present
and not `None', it specifies a string to be used as the word
separator. The returned list will then have one more item than
the number of non-overlapping occurrences of the separator in the
string. The optional third argument MAXSPLIT defaults to 0. If
it is nonzero, at most MAXSPLIT number of splits occur, and the
remainder of the string is returned as the final element of the
list (thus, the list will have at most `MAXSPLIT+1' elements).
(See also `regsub.split()' for a version that allows specifying a
regular expression as the separator.)
- function of module string: splitfields (S[, SEP[, MAXSPLIT]])
This function behaves identically to `split'. (In the past, `split'
was only used with one argument, while `splitfields' was only used
with two arguments.)
- function of module string: join (WORDS[, SEP])
Concatenate a list or tuple of words with intervening occurrences
of SEP. The default value for SEP is a single space character.
It is always true that `string.join(string.split(S, SEP), SEP)'
equals S.
- function of module string: joinfields (WORDS[, SEP])
This function behaves identically to `join'. (In the past, `join'
was only used with one argument, while `joinfields' was only used
with two arguments.)
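The relationship between `split' and `join' can be illustrated with a
short sketch. It assumes a current Python 3 interpreter, where both
operations are string methods and "no limit" for MAXSPLIT is spelled by
omitting the argument rather than passing 0; the sample strings are made
up.

```python
s = "alpha,beta,,gamma"

# With an explicit separator, the list has one more item than the number
# of separators in the string (3 commas give 4 items, one of them empty).
fields = s.split(",")

# Joining with the same separator restores the original string.
rejoined = ",".join(fields)

# At most one split: the remainder is returned as the final element.
limited = s.split(",", 1)

# Without a separator, words are delimited by runs of whitespace.
words = "  two   words\n".split()
```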
- function of module string: lstrip (S)
Remove leading whitespace from the string S.
- function of module string: rstrip (S)
Remove trailing whitespace from the string S.
- function of module string: strip (S)
Remove leading and trailing whitespace from the string S.
- function of module string: swapcase (S)
Convert lower case letters to upper case and vice versa.
- function of module string: translate (S, TABLE[, DELETECHARS])
Delete all characters from S that are in DELETECHARS (if present),
and then translate the characters using TABLE, which must be a
256-character string giving the translation for each character
value, indexed by its ordinal.
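A minimal sketch of `maketrans' and `translate' working together. It
assumes a current Python 3 interpreter, where the closest counterpart of
the 256-character table is a 256-byte table built by `bytes.maketrans'
and applied with the `translate' method of bytes objects; the sample
data is made up.

```python
# Map "e" to "3" and "o" to "0"; both arguments must have the same length.
table = bytes.maketrans(b"eo", b"30")

# Characters listed in the second argument are deleted first, then the
# remaining characters are translated through the table.
result = b"hello world".translate(table, b"l")

# Without characters to delete, only the translation is applied.
translated_only = b"hello world".translate(table)
```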
- function of module string: upper (S)
Convert letters to upper case.
- function of module string: ljust (S, WIDTH)
- function of module string: rjust (S, WIDTH)
- function of module string: center (S, WIDTH)
These functions respectively left-justify, right-justify and
center a string in a field of given width. They return a string
that is at least WIDTH characters wide, created by padding the
string S with spaces until the given width on the right, left or
both sides. The string is never truncated.
- function of module string: zfill (S, WIDTH)
Pad a numeric string on the left with zero digits until the given
width is reached. Strings starting with a sign are handled
correctly.
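The padding functions can be sketched as follows, assuming a current
Python 3 interpreter where they are available as string methods (for
example `s.ljust(width)' in place of `string.ljust(s, width)').

```python
left = "spam".ljust(8)       # padded on the right
right = "spam".rjust(8)      # padded on the left
middle = "spam".center(8)    # padded on both sides

# The string is never truncated when it is already wider than WIDTH.
too_narrow = "spam".ljust(2)

# zfill() pads with zero digits and keeps a leading sign in front.
padded_number = "-42".zfill(6)
```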
This module is implemented in Python. Much of its functionality has
been reimplemented in the built-in module `strop'. However, you should
*never* import the latter module directly. When `string' discovers
that `strop' exists, it transparently replaces parts of itself with the
implementation from `strop'. After initialization, there is *no*
overhead in using `string' instead of `strop'.
File: pylibi, Node: regex, Next: regsub, Prev: string, Up: String Services
Built-in Module `regex'
=======================
This module provides regular expression matching operations similar to
those found in Emacs. It is always available.
By default the patterns are Emacs-style regular expressions (with one
exception). There is a way to change the syntax to match that of
several well-known UNIX utilities. The exception is that Emacs' `\s'
pattern is not supported, since the original implementation references
the Emacs syntax tables.
This module is 8-bit clean: both patterns and strings may contain null
bytes and characters whose high bit is set.
*Please note:* There is a little-known fact about Python string
literals which means that you don't usually have to worry about
doubling backslashes, even though they are used to escape special
characters in string literals as well as in regular expressions. This
is because Python doesn't remove backslashes from string literals if
they are followed by an unrecognized escape character. *However*, if
you want to include a literal "backslash" in a regular expression
represented as a string literal, you have to *quadruple* it. E.g. to
extract LaTeX `\section{...}' headers from a document, you can use this
pattern: `'\\\\section{\(.*\)}''. *Another exception:* the escape
sequence `\b' is significant in string literals (where it means the
ASCII backspace character) as well as in Emacs regular expressions (where it
stands for a word boundary), so in order to search for a word boundary,
you should use the pattern `'\\b''. Similarly, a backslash followed by
a digit 0-7 should be doubled to avoid interpretation as an octal
escape.
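The effect of doubling backslashes is easiest to see by inspecting the
strings that the literals actually produce. The sketch below assumes a
current Python 3 interpreter and deliberately uses only escape sequences
that Python recognizes; the variable names are made up.

```python
word_boundary = '\\b'          # two characters: a backslash and 'b'
control_char = '\b'            # one character: the escape is interpreted
latex_prefix = '\\\\section'   # two backslashes followed by 'section'
group_one = '\\1'              # a backslash and the digit '1'
octal_escape = '\1'            # one character with ordinal 1

lengths = (len(word_boundary), len(control_char), len(latex_prefix))
```

The quadrupled backslash thus reaches the regular expression compiler as
two backslashes, which it in turn reads as one literal backslash.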
* Menu:
* Regular Expressions::
* Module Contents::
File: pylibi, Node: Regular Expressions, Next: Module Contents, Prev: regex, Up: regex
Regular Expressions
-------------------
A regular expression (or RE) specifies a set of strings that matches
it; the functions in this module let you check if a particular string
matches a given regular expression (or if a given regular expression
matches a particular string, which comes down to the same thing).
Regular expressions can be concatenated to form new regular
expressions; if *A* and *B* are both regular expressions, then *AB* is
also a regular expression. If a string *p* matches A and another
string *q* matches B, the string *pq* will match AB. Thus, complex
expressions can easily be constructed from simpler ones like the
primitives described here. For details of the theory and
implementation of regular expressions, consult almost any textbook
about compiler construction.
A brief explanation of the format of regular expressions follows.
Regular expressions can contain both special and ordinary characters.
Ordinary characters, like '`A'', '`a'', or '`0'', are the simplest
regular expressions; they simply match themselves. You can concatenate
ordinary characters, so '`last'' matches the characters 'last'. (In
the rest of this section, we'll write RE's in `this special font',
usually without quotes, and strings to be matched 'in single quotes'.)
Special characters either stand for classes of ordinary characters, or
affect how the regular expressions around them are interpreted.
The special characters are:
* `.' (Dot.) Matches any character except a newline.
* `^' (Caret.) Matches the start of the string.
* `$' Matches the end of the string. `foo' matches both 'foo' and
'foobar', while the regular expression '`foo$'' matches only 'foo'.
* `*' Causes the resulting RE to match 0 or more repetitions of the
preceding RE. `ab*' will match 'a', 'ab', or 'a' followed by any
number of 'b's.
* `+' Causes the resulting RE to match 1 or more repetitions of the
preceding RE. `ab+' will match 'a' followed by any non-zero
number of 'b's; it will not match just 'a'.
* `?' Causes the resulting RE to match 0 or 1 repetitions of the
preceding RE. `ab?' will match either 'a' or 'ab'.
* `\' Either escapes special characters (permitting you to match
characters like '*?+&$'), or signals a special sequence; special
sequences are discussed below. Remember that Python also uses the
backslash as an escape sequence in string literals; if the escape
sequence isn't recognized by Python's parser, the backslash and
subsequent character are included in the resulting string.
However, if Python would recognize the resulting sequence, the
backslash should be doubled.
* `[]' Used to indicate a set of characters. Characters can be
listed individually, or a range is indicated by giving two
characters and separating them by a '-'. Special characters are
not active inside sets. For example, `[akm$]' will match any of
the characters 'a', 'k', 'm', or '$'; `[a-z]' will match any
lowercase letter.
If you want to include a `]' inside a set, it must be the first
character of the set; to include a `-', place it as the first or
last character.
Characters *not* within a range can be matched by including a `^'
as the first character of the set; `^' elsewhere will simply match
the '`^'' character.
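The special characters listed above can be tried out with a short
sketch. It assumes a current Python 3 interpreter, in which the `regex'
module is not available, and therefore uses the `re' module as a
stand-in; for the characters in this list `re' is assumed to behave the
same way. The helper names are made up.

```python
import re

def matches_start(pattern, text):
    """True if PATTERN matches at the beginning of TEXT."""
    return re.match(pattern, text) is not None

def matches_whole(pattern, text):
    """True if PATTERN matches all of TEXT."""
    return re.fullmatch(pattern, text) is not None

# `foo' matches both 'foo' and 'foobar'; `foo$' matches only 'foo'.
prefix_ok = matches_start("foo", "foobar")
anchored_fail = matches_start("foo$", "foobar")
anchored_ok = matches_start("foo$", "foo")

# Repetition: `*' is 0 or more, `+' is 1 or more, `?' is 0 or 1.
star = [matches_whole("ab*", t) for t in ("a", "ab", "abbb")]
plus = [matches_whole("ab+", t) for t in ("a", "ab", "abbb")]
optional = [matches_whole("ab?", t) for t in ("a", "ab", "abb")]

# Sets: `$' is not special inside a set, and `^' first negates the set.
dollar_in_set = matches_whole("[akm$]", "$")
lowercase = matches_whole("[a-z]", "q")
negated = matches_whole("[^a-z]", "Q")

# `.' matches any character except a newline.
dot_newline = matches_whole(".", "\n")
```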
The special sequences consist of '`\'' and a character from the list
below. If the ordinary character is not on the list, then the
resulting RE will match the second character. For example, `\$'
matches the character '$'. Ones where the backslash should be doubled
are indicated.
* `\|' `A\|B', where A and B can be arbitrary REs, creates a regular
expression that will match either A or B. This can be used inside
groups (see below) as well.
* `\( \)' Indicates the start and end of a group; the contents of a
group can be matched later in the string with the numbered-group special
sequence, described next.
* `\\1, ... \\7, \8, \9' Matches the contents of the group of the same
number. For example, `\(.+\) \\1' matches 'the the' or '55 55',
but not 'the end' (note the space after the group). This special
sequence can only be used to match one of the first 9 groups;
groups with higher numbers can be matched using the `\v' sequence.
(`\8' and `\9' don't need a double backslash because they are not
octal digits.)
* `\\b' Matches the empty string, but only at the beginning or end of
a word. A word is defined as a sequence of alphanumeric
characters, so the end of a word is indicated by whitespace or a
non-alphanumeric character.
* `\B' Matches the empty string, but when it is *not* at the
beginning or end of a word.
* `\v' Must be followed by a two digit decimal number, and matches
the contents of the group of the same number. The group number
must be between 1 and 99, inclusive.
* `\w' Matches any alphanumeric character; this is equivalent to the
set `[a-zA-Z0-9]'.
* `\W' Matches any non-alphanumeric character; this is equivalent to
the set `[^a-zA-Z0-9]'.
* `\<' Matches the empty string, but only at the beginning of a
word. A word is defined as a sequence of alphanumeric characters,
so the end of a word is indicated by whitespace or a
non-alphanumeric character.
* `\>' Matches the empty string, but only at the end of a word.
* `\\' Matches a literal backslash.
* `\`' Like `^', this only matches at the start of the string.
* `\'' Like `$', this only matches at the end of the string.
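Group references, alternation and word boundaries can be illustrated
with the sketch below. It assumes a current Python 3 interpreter and
uses the `re' module as a stand-in for `regex'. Note the difference in
syntax: in `re', groups are written with plain parentheses and
alternation with a plain `|', whereas the syntax described above uses
`\(', `\)' and `\|'. Raw string literals are used so that no backslash
needs doubling.

```python
import re

# A group followed by a space and a reference back to the same group.
repeated = re.compile(r"(.+) \1")
doubled_word = repeated.fullmatch("the the") is not None
doubled_number = repeated.fullmatch("55 55") is not None
different_words = repeated.fullmatch("the end") is not None

# Alternation matches either of the two expressions.
either = re.compile(r"cat|dog")
alternatives = [either.fullmatch(t) is not None for t in ("cat", "dog", "cow")]

# A word boundary matches the empty string at the edge of a word only.
whole_word = re.search(r"\bfoo\b", "a foo b") is not None
inside_word = re.search(r"\bfoo\b", "afoob") is not None
```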
File: pylibi, Node: Module Contents, Prev: Regular Expressions, Up: regex
Module Contents
---------------
The module defines these functions, and an exception:
- function of module regex: match (PATTERN, STRING)
Return how many characters at the beginning of STRING match the
regular expression PATTERN. Return `-1' if the string does not
match the pattern (this is different from a zero-length match!).
- function of module regex: search (PATTERN, STRING)
Return the first position in STRING that matches the regular
expression PATTERN. Return `-1' if no position in the string
matches the pattern (this is different from a zero-length match
anywhere!).
- function of module regex: compile (PATTERN[, TRANSLATE])
Compile a regular expression pattern into a regular expression
object, which can be used for matching using its `match' and
`search' methods, described below. The optional argument
TRANSLATE, if present, must be a 256-character string indicating
how characters (both of the pattern and of the strings to be
matched) are translated before comparing them; the `i'-th element
of the string gives the translation for the character with ASCII
code `i'. This can be used to implement case-insensitive
matching; see the `casefold' data item below.
The sequence
prog = regex.compile(pat)
result = prog.match(str)
is equivalent to
result = regex.match(pat, str)
but the version using `compile()' is more efficient when multiple
regular expressions are used concurrently in a single program.
(The compiled version of the last pattern passed to
`regex.match()' or `regex.search()' is cached, so programs that
use only a single regular expression at a time needn't worry about
compiling regular expressions.)
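The return conventions of `match' and `search' (a length or position on
success, `-1' on failure) can be mimicked as shown below. This is a
sketch for a current Python 3 interpreter, built on the `re' module as a
stand-in for `regex'; the helper names `match_length' and
`search_position' are hypothetical.

```python
import re

def match_length(pattern, string):
    """Number of characters matched at the start of STRING, or -1."""
    m = re.match(pattern, string)
    if m is None:
        return -1
    return m.end()

def search_position(pattern, string):
    """First position in STRING where PATTERN matches, or -1."""
    m = re.search(pattern, string)
    if m is None:
        return -1
    return m.start()

# A zero-length match (0) is different from no match at all (-1).
zero_length = match_length("b*", "aaa")
no_match = match_length("b+", "aaa")
four_chars = match_length("ab*", "abbbc")

found_at = search_position("b+", "aabb")
not_found = search_position("z", "aabb")

# Compiling once and matching later gives the same result.
prog = re.compile("ab*")
compiled_result = prog.match("abbbc").end()
```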
- function of module regex: set_syntax (FLAGS)
Set the syntax to be used by future calls to `compile', `match'
and `search'. (Already compiled expression objects are not
affected.) The argument is an integer which is the OR of several
flag bits. The return value is the previous value of the syntax
flags. Names for the flags are defined in the standard module
`regex_syntax'; read the file `regex_syntax.py' for more
information.
- function of module regex: symcomp (PATTERN[, TRANSLATE])
This is like `compile', but supports symbolic group names: if a
parenthesis-enclosed group begins with a group name in angle
brackets, e.g. `'\(<id>[a-z][a-z0-9]*\)'', the group can be
referenced by its name in arguments to the `group' method of the
resulting compiled regular expression object, like this:
`p.group('id')'. Group names may contain alphanumeric characters
and `'_'' only.
- exception of module regex: error
Exception raised when a string passed to one of the functions here
is not a valid regular expression (e.g., unmatched parentheses) or
when some other error occurs during compilation or matching. (It
is never an error if a string contains no match for a pattern.)
- data of module regex: casefold
A string suitable to pass as TRANSLATE argument to `compile' to
map all upper case characters to their lowercase equivalents.
Compiled regular expression objects support these methods:
- Method on regex: match (STRING[, POS])
Return how many characters at the beginning of STRING match the
compiled regular expression. Return `-1' if the string does not
match the pattern (this is different from a zero-length match!).
The optional second parameter POS gives an index in the string
where the search is to start; it defaults to `0'. This is not
completely equivalent to slicing the string; the `'^'' pattern
character matches at the real beginning of the string and at positions
just after a newline, not necessarily at the index where the search
is to start.
- Method on regex: search (STRING[, POS])
Return the first position in STRING that matches the regular
expression `pattern'. Return `-1' if no position in the string
matches the pattern (this is different from a zero-length match
anywhere!).
The optional second parameter has the same meaning as for the
`match' method.
- Method on regex: group (INDEX, INDEX, ...)
This method is only valid when the last call to the `match' or
`search' method found a match. It returns one or more groups of
the match. If there is a single INDEX argument, the result is a
single string; if there are multiple arguments, the result is a
tuple with one item per argument. If the INDEX is zero, the
corresponding return value is the entire matching string; if it is
in the inclusive range [1..99], it is the string matching the
corresponding parenthesized group (using the default syntax,
groups are parenthesized using `\(' and `\)'). If no such group
exists, the corresponding result is `None'.
If the regular expression was compiled by `symcomp' instead of
`compile', the INDEX arguments may also be strings identifying
groups by their group name.
Compiled regular expressions support these data attributes:
- attribute of regex: regs
When the last call to the `match' or `search' method found a
match, this is a tuple of pairs of indices corresponding to the
beginning and end of all parenthesized groups in the pattern.
Indices are relative to the string argument passed to `match' or
`search'. The 0-th tuple gives the beginning and end of the whole
pattern. When the last match or search failed, this is `None'.
- attribute of regex: last
When the last call to the `match' or `search' method found a
match, this is the string argument passed to that method. When the
last match or search failed, this is `None'.
- attribute of regex: translate
This is the value of the TRANSLATE argument to `regex.compile'
that created this regular expression object. If the TRANSLATE
argument was omitted in the `regex.compile' call, this is `None'.
- attribute of regex: givenpat
The regular expression pattern as passed to `compile' or `symcomp'.
- attribute of regex: realpat
The regular expression after stripping the group names for regular
expressions compiled with `symcomp'. Same as `givenpat' otherwise.
- attribute of regex: groupindex
A dictionary giving the mapping from symbolic group names to
numerical group indices for regular expressions compiled with
`symcomp'. `None' otherwise.
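The return conventions of the `match' and `search' methods (a length or
a position, with `-1' for failure) can be illustrated with a small
sketch. It assumes an interpreter where the `regex' module is not
available and uses the later `re' module as a stand-in; the helper names
`match_length' and `search_position' are hypothetical and not part of
any module.

```python
import re


def match_length(pattern, string, pos=0):
    """Return how many characters match at POS, or -1 if there is no match."""
    m = re.compile(pattern).match(string, pos)
    return -1 if m is None else m.end() - m.start()


def search_position(pattern, string, pos=0):
    """Return the first matching position at or after POS, or -1 if none."""
    m = re.compile(pattern).search(string, pos)
    return -1 if m is None else m.start()


# A zero-length match (0) is distinct from no match at all (-1).
empty_match = match_length('a*', 'bbb')
no_match = match_length('a+', 'bbb')

# POS is not the same as slicing: '^' still refers to the real beginning.
anchored = match_length('^ab', 'xab', 1)
unanchored = match_length('ab', 'xab', 1)
```

The stand-in uses a different pattern syntax from the default `regex'
syntax (for example, groups are not written with backslashes), so only
the return conventions carry over.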
File: pylibi, Node: regsub, Next: struct, Prev: regex, Up: String Services
Standard Module `regsub'
========================
This module defines a number of functions useful for working with
regular expressions (see built-in module `regex').
Warning: these functions are not thread-safe.
- function of module regsub: sub (PAT, REPL, STR)
Replace the first occurrence of pattern PAT in string STR by
replacement REPL. If the pattern isn't found, the string is
returned unchanged. The pattern may be a string or an already
compiled pattern. The replacement may contain references `\DIGIT'
to subpatterns and escaped backslashes.
- function of module regsub: gsub (PAT, REPL, STR)
Replace all (non-overlapping) occurrences of pattern PAT in string
STR by replacement REPL. The same rules as for `sub()' apply.
Empty matches for the pattern are replaced only when not adjacent
to a previous match, so e.g. `gsub('', '-', 'abc')' returns
`'-a-b-c-''.
- function of module regsub: split (STR, PAT[, MAXSPLIT])
Split the string STR into fields separated by delimiters matching
the pattern PAT, and return a list containing the fields. Only
non-empty matches for the pattern are considered, so e.g.
`split('a:b', ':*')' returns `['a', 'b']' and `split('abc', '')'
returns `['abc']'. MAXSPLIT defaults to 0. If it is nonzero,
at most MAXSPLIT splits occur, and the remainder of the
string is returned as the final element of the list.
- function of module regsub: splitx (STR, PAT[, MAXSPLIT])
Split the string STR into fields separated by delimiters matching
the pattern PAT, and return a list containing the fields as well
as the separators. For example, `splitx('a:::b', ':*')' returns
`['a', ':::', 'b']'. Otherwise, this function behaves the same as
`split'.
- function of module regsub: capwords (S[, PAT])
Capitalize words separated by optional pattern PAT. The default
pattern uses any characters except letters, digits and underscores
as word delimiters. Capitalization is done by changing the first
character of each word to upper case.
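The rule that `split' and `splitx' consider only non-empty matches can
be sketched as follows. This is an illustration built on the later `re'
module, assuming `regsub' itself is unavailable; the function name
`split_fields' and its `keep_separators' parameter are hypothetical.

```python
import re


def split_fields(s, pat, maxsplit=0, keep_separators=False):
    """Split S on non-empty matches of PAT, optionally keeping separators."""
    fields = []
    start = 0
    splits = 0
    for m in re.finditer(pat, s):
        if m.end() == m.start():
            continue  # empty matches never act as delimiters
        if maxsplit and splits >= maxsplit:
            break  # the rest of the string becomes the final field
        fields.append(s[start:m.start()])
        if keep_separators:
            fields.append(m.group(0))
        start = m.end()
        splits += 1
    fields.append(s[start:])
    return fields


plain = split_fields('a:b', ':*')
untouched = split_fields('abc', '')
with_separators = split_fields('a:::b', ':*', keep_separators=True)
limited = split_fields('a:b:c', ':', 1)
```

With these inputs the sketch reproduces the results quoted in the
descriptions of `split' and `splitx'.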
File: pylibi, Node: struct, Prev: regsub, Up: String Services
Built-in Module `struct'
========================
This module performs conversions between Python values and C structs
represented as Python strings. It uses "format strings" (explained
below) as compact descriptions of the layout of the C structs and the
intended conversion to/from Python values.
See also built-in module `array'.
The module defines the following exception and functions:
- exception of module struct: error
Exception raised on various occasions; argument is a string
describing what is wrong.
- function of module struct: pack (FMT, V1, V2, ...)
Return a string containing the values `V1, V2, ...' packed
according to the given format. The arguments must match the
values required by the format exactly.
- function of module struct: unpack (FMT, STRING)
Unpack the string (presumably packed by `pack(FMT, ...)')
according to the given format. The result is a tuple even if it
contains exactly one item. The string must contain exactly the
amount of data required by the format (i.e. `len(STRING)' must
equal `calcsize(FMT)').
- function of module struct: calcsize (FMT)
Return the size of the struct (and hence of the string)
corresponding to the given format.
Format characters have the following meaning; the conversion between C
and Python values should be obvious given their types:
*Format*
*C* -- *Python*
`x'
pad byte -- no value
`c'
char -- string of length 1
`b'
signed char -- integer
`h'
short -- integer
`i'
int -- integer
`l'
long -- integer
`f'
float -- float
`d'
double -- float
A format character may be preceded by an integral repeat count; e.g.
the format string `'4h'' means exactly the same as `'hhhh''.
C numbers are represented in the machine's native format and byte
order, and properly aligned by skipping pad bytes if necessary
(according to the rules used by the C compiler).
Examples (all on a big-endian machine):
pack('hhl', 1, 2, 3) == '\000\001\000\002\000\000\000\003'
unpack('hhl', '\000\001\000\002\000\000\000\003') == (1, 2, 3)
calcsize('hhl') == 8
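The byte strings in the examples above depend on the machine's byte
order and type sizes. The following sketch checks the same
relationships without hard-coding any bytes. It assumes a current
Python interpreter, where `pack' returns a `bytes' object rather than a
string.

```python
import struct

fmt = 'hhl'

# Packing and unpacking with the same format round-trips the values.
packed = struct.pack(fmt, 1, 2, 3)
values = struct.unpack(fmt, packed)

# The packed data is exactly calcsize(FMT) bytes long.
size = struct.calcsize(fmt)

# unpack always returns a tuple, even for a single item.
single = struct.unpack('h', struct.pack('h', 7))

# A repeat count is shorthand for repeating the format character.
same_layout = struct.pack('4h', 1, 2, 3, 4) == struct.pack('hhhh', 1, 2, 3, 4)

# Data of the wrong length is rejected with struct.error.
try:
    struct.unpack(fmt, packed[:-1])
    rejected = False
except struct.error:
    rejected = True
```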
Hint: to align the end of a structure to the alignment requirement of a
particular type, end the format with the code for that type with a
repeat count of zero, e.g. the format `'llh0l'' specifies two pad bytes
at the end, assuming longs are aligned on 4-byte boundaries.
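The effect of a zero repeat count can be observed with `calcsize'. The
sketch below assumes native alignment in which a long is aligned to a
multiple of its own size, which holds on common platforms but is not
guaranteed by this manual; the actual numbers vary by machine.

```python
import struct

long_size = struct.calcsize('l')

# Two longs followed by a short: the struct ends right after the short.
unpadded = struct.calcsize('llh')

# Appending '0l' adds no value, only the pad bytes needed to align to a long.
padded = struct.calcsize('llh0l')

pad_bytes = padded - unpadded
print(long_size, unpadded, padded, pad_bytes)
```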
(More format characters are planned, e.g. `'s'' for character arrays,
upper case for unsigned variants, and a way to specify the byte order,
which is useful for [de]constructing network packets and
reading/writing portable binary file formats like TIFF and AIFF.)
File: pylibi, Node: Miscellaneous Services, Next: Generic Operating System Services, Prev: String Services, Up: Top
Miscellaneous Services
**********************
The modules described in this chapter provide miscellaneous services
that are available in all Python versions. Here's an overview:
math
-- Mathematical functions (`sin()' etc.).
rand
-- Integer random number generator.
whrandom
-- Floating point random number generator.
array
-- Efficient arrays of uniformly typed numeric values.
* Menu:
* math::
* rand::
* whrandom::
* array::
File: pylibi, Node: math, Next: rand, Prev: Miscellaneous Services, Up: Miscellaneous Services
Built-in Module `math'
======================
This module is always available. It provides access to the
mathematical functions defined by the C standard. They are:
- function of module math: acos (X)
- function of module math: asin (X)
- function of module math: atan (X)
- function of module math: atan2 (Y, X)
- function of module math: ceil (X)
- function of module math: cos (X)
- function of module math: cosh (X)
- function of module math: exp (X)
- function of module math: fabs (X)
- function of module math: floor (X)
- function of module math: fmod (X, Y)
- function of module math: frexp (X)
- function of module math: hypot (X, Y)
- function of module math: ldexp (X, Y)
- function of module math: log (X)
- function of module math: log10 (X)
- function of module math: modf (X)
- function of module math: pow (X, Y)
- function of module math: sin (X)
- function of module math: sinh (X)
- function of module math: sqrt (X)
- function of module math: tan (X)
- function of module math: tanh (X)
Note that `frexp' and `modf' have a different call/return pattern than
their C equivalents: they take a single argument and return a pair of
values, rather than returning their second return value through an
`output parameter' (there is no such thing in Python).
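A short illustration of the pair-returning convention, assuming a
current Python interpreter:

```python
import math

# frexp splits a float into a mantissa and a power-of-two exponent,
# returned together as a pair: 8.0 == 0.5 * 2**4.
mantissa, exponent = math.frexp(8.0)

# ldexp is the inverse operation.
rebuilt = math.ldexp(mantissa, exponent)

# modf returns the fractional and integral parts, both as floats.
fractional, integral = math.modf(3.5)
```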
The module also defines two mathematical constants:
- data of module math: pi
- data of module math: e
File: pylibi, Node: rand, Next: whrandom, Prev: math, Up: Miscellaneous Services
Standard Module `rand'
======================
This module implements a pseudo-random number generator with an
interface similar to `rand()' in C. It defines the following functions:
- function of module rand: rand ()
Returns an integer random number in the range [0 ... 32768).
- function of module rand: choice (S)
Returns a random element from the sequence (string, tuple or list)
S.
- function of module rand: srand (SEED)
Initializes the random number generator with the given integral
seed. When the module is first imported, the random number
generator is initialized with the current time.
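The interface can be sketched on top of the later `random' module for
interpreters where `rand' is not available. This is a stand-in under
that assumption, not the module's implementation; the underlying
generator, and therefore the sequence produced for a given seed, will
differ.

```python
import random

# Private generator so that seeding here does not disturb other code.
_generator = random.Random()


def srand(seed):
    """Initialize the generator with an integral seed."""
    _generator.seed(seed)


def rand():
    """Return a random integer in the range [0 ... 32768)."""
    return _generator.randrange(32768)


def choice(s):
    """Return a random element from the sequence S."""
    return _generator.choice(s)


# Re-seeding with the same value reproduces the same sequence.
srand(42)
first_run = [rand() for _ in range(5)]
srand(42)
second_run = [rand() for _ in range(5)]
```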
File: pylibi, Node: whrandom, Next: array, Prev: rand, Up: Miscellaneous Services
Standard Module `whrandom'
==========================
This module implements a Wichmann-Hill pseudo-random number generator.
It defines the following functions:
- function of module whrandom: random ()
Returns the next random floating point number in the range [0.0
... 1.0).
- function of module whrandom: seed (X, Y, Z)
Initializes the random number generator from the integers X, Y and
Z. When the module is first imported, the random number
generator is initialized using values derived from the current time.
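For readers unfamiliar with the Wichmann-Hill method, the sketch below
shows the published algorithm (AS 183): three small linear congruential
generators whose scaled outputs are summed modulo 1. The class name
`WichmannHillSketch' is hypothetical, and whether the module uses
exactly this arithmetic is an assumption.

```python
class WichmannHillSketch:
    """Illustrative Wichmann-Hill generator; not the whrandom module itself."""

    def __init__(self, x, y, z):
        self.seed(x, y, z)

    def seed(self, x, y, z):
        # Each component should be a nonzero integer; a component of zero
        # would remain zero on every subsequent step.
        self._state = (x, y, z)

    def random(self):
        """Return the next float in the range [0.0 ... 1.0)."""
        x, y, z = self._state
        x = (171 * x) % 30269
        y = (172 * y) % 30307
        z = (170 * z) % 30323
        self._state = (x, y, z)
        return (x / 30269.0 + y / 30307.0 + z / 30323.0) % 1.0


generator = WichmannHillSketch(1, 2, 3)
first_run = [generator.random() for _ in range(100)]

# Seeding with the same three integers restarts the same sequence.
generator.seed(1, 2, 3)
second_run = [generator.random() for _ in range(100)]
```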
File: pylibi, Node: array, Prev: whrandom, Up: Miscellaneous Services
Built-in Module `array'
=======================
This module defines a new object type which can efficiently represent
an array of basic values: characters, integers, floating point numbers.
Arrays are sequence types and behave very much like lists, except that
the type of objects stored in them is constrained. The type is
specified at object creation time by using a "type code", which is a
single character. The following type codes are defined:
*Typecode*
*Type* -- *Minimal size in bytes*
`'c''
character -- 1
`'b''
signed integer -- 1
`'h''
signed integer -- 2
`'i''
signed integer -- 2
`'l''
signed integer -- 4
`'f''
floating point -- 4
`'d''
floating point -- 8
The actual representation of values is determined by the machine
architecture (strictly speaking, by the C implementation). The actual
size can be accessed through the ITEMSIZE attribute.
See also built-in module `struct'.
The module defines the following function:
- function of module array: array (TYPECODE[, INITIALIZER])
Return a new array whose items are restricted by TYPECODE, and
initialized from the optional INITIALIZER value, which must be a
list or a string. The list or string is passed to the new array's
`fromlist()' or `fromstring()' method (see below) to add initial
items to the array.
Array objects support the following data items and methods:
- data of module array: typecode
The typecode character used to create the array.
- data of module array: itemsize
The length in bytes of one array item in the internal
representation.
- function of module array: append (X)
Append a new item with value X to the end of the array.
- function of module array: byteswap (X)
"Byteswap" all items of the array. This is only supported for
integer values. It is useful when reading data from a file written
on a machine with a different byte order.
- function of module array: fromfile (F, N)
Read N items (as machine values) from the file object F and append
them to the end of the array. If fewer than N items are available,
`EOFError' is raised, but the items that were available are still
inserted into the array. F must be a real built-in file object;
something else with a `read()' method won't do.
- function of module array: fromlist (LIST)
Append items from the list. This is equivalent to `for x in LIST:
a.append(x)' except that if there is a type error, the array is
unchanged.
- function of module array: fromstring (S)
Appends items from the string, interpreting the string as an array
of machine values (i.e. as if it had been read from a file using
the `fromfile()' method).
- function of module array: insert (I, X)
Insert a new item with value X in the array before position I.
- function of module array: tofile (F)
Write all items (as machine values) to the file object F.
- function of module array: tolist ()
Convert the array to an ordinary list with the same items.
- function of module array: tostring ()
Convert the array to an array of machine values and return the
string representation (the same sequence of bytes that would be
written to a file by the `tofile()' method).
When an array object is printed or converted to a string, it is
represented as `array(TYPECODE, INITIALIZER)'. The INITIALIZER is
omitted if the array is empty, otherwise it is a string if the TYPECODE
is `'c'', otherwise it is a list of numbers. The string is guaranteed
to be able to be converted back to an array with the same type and
value using reverse quotes (```'). Examples:
array('l')
array('c', 'hello world')
array('l', [1, 2, 3, 4, 5])
array('d', [1.0, 2.0, 3.14])
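A brief sketch of several of the methods described above, assuming a
current Python interpreter. In such interpreters the `'c'' type code is
no longer available and `tostring()' is spelled `tobytes()'; both
points are assumptions about the reader's environment, not statements
from this manual.

```python
from array import array

a = array('l', [1, 2, 3])
a.append(4)        # add to the end
a.insert(0, 0)     # insert before position 0
as_list = a.tolist()

# fromlist is all-or-nothing: a type error leaves the array unchanged.
try:
    a.fromlist([5, 'six'])
except TypeError:
    pass
unchanged = a.tolist() == as_list

# The machine representation holds itemsize bytes per item.
raw = a.tobytes()
raw_length_ok = len(raw) == len(a) * a.itemsize

# Byteswapping twice restores the original values.
b = array('h', [1, 2])
b.byteswap()
swapped = b.tolist()
b.byteswap()
restored = b.tolist()
```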
File: pylibi, Node: Generic Operating System Services, Next: Optional Operating System Services, Prev: Miscellaneous Services, Up: Top
Generic Operating System Services
*********************************
The modules described in this chapter provide interfaces to operating
system features that are available on (almost) all operating systems,
such as files and a clock. The interfaces are generally modelled after
the UNIX or C interfaces but they are available on most other systems
as well. Here's an overview:
os
-- Miscellaneous OS interfaces.
time
-- Time access and conversions.
getopt
-- Parser for command line options.
tempfile
-- Generate temporary file names.
* Menu:
* os::
* time::
* getopt::
* tempfile::
* errno::