GNU Info File | 1996-11-14 | 50.3 KB | 1,180 lines
This is Info file pylibi, produced by Makeinfo-1.55 from the input file
lib.texi.
This file describes the built-in types, exceptions and functions and the
standard modules that come with the Python system. It assumes basic
knowledge about the Python language. For an informal introduction to
the language, see the Python Tutorial. The Python Reference Manual
gives a more formal definition of the language. (These manuals are not
yet available in INFO or Texinfo format.)
Copyright 1991-1995 by Stichting Mathematisch Centrum, Amsterdam, The
Netherlands.
All Rights Reserved
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the names of Stichting Mathematisch
Centrum or CWI or Corporation for National Research Initiatives or CNRI
not be used in advertising or publicity pertaining to distribution of
the software without specific, written prior permission.
While CWI is the initial source for this software, a modified version
is made available by the Corporation for National Research Initiatives
(CNRI) at the Internet address ftp://ftp.python.org.
STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH
CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
THIS SOFTWARE.
File: pylibi, Node: formatter, Next: rfc822, Prev: htmllib, Up: Internet and WWW
Standard Module `formatter'
===========================
This module supports two interface definitions, each with multiple
implementations. The *formatter* interface is used by the `HTMLParser'
class of the `htmllib' module, and the *writer* interface is required
by the formatter interface.
Formatter objects transform an abstract flow of formatting events into
specific output events on writer objects. Formatters manage several
stack structures to allow various properties of a writer object to be
changed and restored; writers need not be able to handle relative
changes nor any sort of "change back" operation. Specific writer
properties which may be controlled via formatter objects are horizontal
alignment, font, and left margin indentations. A mechanism is provided
which supports providing arbitrary, non-exclusive style settings to a
writer as well. Additional interfaces facilitate formatting events
which are not reversible, such as paragraph separation.
Writer objects encapsulate device interfaces. Abstract devices, such
as file formats, are supported as well as physical devices. The
provided implementations all work with abstract devices. The interface
makes available mechanisms for setting the properties which formatter
objects manage and inserting data into the output.
* Menu:
* The Formatter Interface::
* Formatter Implementations::
* The Writer Interface::
* Writer Implementations::
File: pylibi, Node: The Formatter Interface, Next: Formatter Implementations, Prev: formatter, Up: formatter
The Formatter Interface
-----------------------
Interfaces to create formatters are dependent on the specific formatter
class being instantiated. The interfaces described below are the
required interfaces which all formatters must support once initialized.
One data element is defined at the module level:
- data of module formatter: AS_IS
Value which can be used in the font specification passed to the
`push_font()' method described below, or as the new value to any
other `push_PROPERTY()' method. Pushing the `AS_IS' value allows
the corresponding `pop_PROPERTY()' method to be called without
having to track whether the property was changed.
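The push/pop convention around `AS_IS' can be sketched as follows.
This is a minimal illustration in present-day Python syntax, not the
module's implementation: the `AS_IS' object here is a local stand-in,
and `RecordingWriter' and `AlignmentStack' are hypothetical names
introduced only for this example.

```python
AS_IS = object()  # stand-in sentinel for this sketch only


class RecordingWriter:
    """Hypothetical writer that records each new_alignment() call."""

    def __init__(self):
        self.calls = []

    def new_alignment(self, align):
        self.calls.append(align)


class AlignmentStack:
    """Hypothetical helper mirroring push_alignment()/pop_alignment()."""

    def __init__(self, writer):
        self.writer = writer
        self.stack = []
        self.current = None

    def push_alignment(self, align):
        # Only tell the writer when the setting really changes.
        if align is not AS_IS and align != self.current:
            self.current = align
            self.writer.new_alignment(align)
        # Record the value in effect, even when nothing changed, so
        # that every push can be matched by exactly one pop.
        self.stack.append(self.current)

    def pop_alignment(self):
        if self.stack:
            self.stack.pop()
        previous = self.stack[-1] if self.stack else None
        if previous != self.current:
            self.current = previous
            self.writer.new_alignment(previous)
```

Because an `AS_IS' push still occupies a stack slot, the caller can
always pair each push with a pop without remembering whether the
property actually changed.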
The following attributes are defined for formatter instance objects:
- data of formatter object data: writer
The writer instance with which the formatter interacts.
- Method on formatter object: end_paragraph (BLANKLINES)
Close any open paragraphs and insert at least `blanklines' blank
lines before the next paragraph.
- Method on formatter object: add_line_break ()
Add a hard line break if one does not already exist. This does not
break the logical paragraph.
- Method on formatter object: add_hor_rule (*ARGS, **KW)
Insert a horizontal rule in the output. A hard break is inserted
if there is data in the current paragraph, but the logical
paragraph is not broken. The arguments and keywords are passed on
to the writer's `send_hor_rule()' method.
- Method on formatter object: add_flowing_data (DATA)
Provide data which should be formatted with collapsed whitespace.
Whitespace from preceding and successive calls to
`add_flowing_data()' is considered as well when the whitespace
collapse is performed. The data which is passed to this method is
expected to be word-wrapped by the output device. Note that any
word-wrapping still must be performed by the writer object due to
the need to rely on device and font information.
- Method on formatter object: add_literal_data (DATA)
Provide data which should be passed to the writer unchanged.
Whitespace, including newline and tab characters, is considered
legal in the value of `data'.
- Method on formatter object: add_label_data (FORMAT, COUNTER)
Insert a label which should be placed to the left of the current
left margin. This should be used for constructing bulleted or
numbered lists. If the `format' value is a string, it is
interpreted as a format specification for `counter', which should
be an integer. The result of this formatting becomes the value of
the label; if `format' is not a string it is used as the label
value directly. The label value is passed as the only argument to
the writer's `send_label_data()' method. Interpretation of
non-string label values is dependent on the associated writer.
Format specifications are strings which, in combination with a
counter value, are used to compute label values. Each character
in the format string is copied to the label value, with some
characters recognized to indicate a transform on the counter
value. Specifically, the character "`1'" represents the counter
value formatted as an Arabic number, the characters "`A'" and
"`a'" represent alphabetic representations of the counter value in
upper and lower case, respectively, and "`I'" and "`i'" represent
the counter value in Roman numerals, in upper and lower case.
Note that the alphabetic and Roman transforms require that the
counter value be greater than zero.
- Method on formatter object: flush_softspace ()
Send any pending whitespace buffered from a previous call to
`add_flowing_data()' to the associated writer object. This should
be called before any direct manipulation of the writer object.
- Method on formatter object: push_alignment (ALIGN)
Push a new alignment setting onto the alignment stack. This may be
`AS_IS' if no change is desired. If the alignment value is
changed from the previous setting, the writer's `new_alignment()'
method is called with the `align' value.
- Method on formatter object: pop_alignment ()
Restore the previous alignment.
- Method on formatter object: push_font ((SIZE, ITALIC, BOLD,
TELETYPE))
Change some or all font properties of the writer object.
Properties which are not set to `AS_IS' are set to the values
passed in while others are maintained at their current settings.
The writer's `new_font()' method is called with the fully resolved
font specification.
- Method on formatter object: pop_font ()
Restore the previous font.
- Method on formatter object: push_margin (MARGIN)
Increase the number of left margin indentations by one, associating
the logical tag `margin' with the new indentation. The initial
margin level is `0'. Changed values of the logical tag must be
true values; false values other than `AS_IS' are not sufficient to
change the margin.
- Method on formatter object: pop_margin ()
Restore the previous margin.
- Method on formatter object: push_style (*STYLES)
Push any number of arbitrary style specifications. All styles are
pushed onto the styles stack in order. A tuple representing the
entire stack, including `AS_IS' values, is passed to the writer's
`new_styles()' method.
- Method on formatter object: pop_style ([N` = 1'])
Pop the last `n' style specifications passed to `push_style()'. A
tuple representing the revised stack, including `AS_IS' values, is
passed to the writer's `new_styles()' method.
- Method on formatter object: set_spacing (SPACING)
Set the spacing style for the writer.
- Method on formatter object: assert_line_data ([FLAG` = 1'])
Inform the formatter that data has been added to the current
paragraph out-of-band. This should be used when the writer has
been manipulated directly. The optional `flag' argument can be
set to false if the writer manipulations produced a hard line
break at the end of the output.
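The label format specifications described for `add_label_data()' above
can be sketched as follows. This is an illustration under stated
assumptions, not the module's code: `format_label', `to_letters' and
`to_roman' are hypothetical helper names, the continuation of the
alphabetic series past `z' (`aa', `ab', ...) is an assumption, and for
counters that are not greater than zero this sketch simply emits
nothing for the alphabetic and Roman transforms.

```python
def to_letters(counter, base):
    """Render counter (> 0) as a, b, ..., z, aa, ab, ... from `base`."""
    label = ""
    while counter > 0:
        counter, offset = divmod(counter - 1, 26)
        label = chr(ord(base) + offset) + label
    return label


def to_roman(counter, upper):
    """Render counter (> 0) as a Roman numeral."""
    numerals = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
                (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
                (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    label = ""
    for value, text in numerals:
        while counter >= value:
            label += text
            counter -= value
    return label.upper() if upper else label


def format_label(format, counter):
    """Compute a label value from a format specification and a counter."""
    if not isinstance(format, str):
        return format              # non-string formats are used directly
    label = ""
    for ch in format:
        if ch == "1":
            label += str(counter)  # Arabic number
        elif ch in "Aa":
            if counter > 0:        # assumption: skip when counter <= 0
                label += to_letters(counter, ch)
        elif ch in "Ii":
            if counter > 0:        # assumption: skip when counter <= 0
                label += to_roman(counter, ch == "I")
        else:
            label += ch            # every other character is copied
    return label
```

For instance, `format_label("1.", 3)' gives `3.' and
`format_label("I.", 14)' gives `XIV.'.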
File: pylibi, Node: Formatter Implementations, Next: The Writer Interface, Prev: The Formatter Interface, Up: formatter
Formatter Implementations
-------------------------
Two implementations of formatter objects are provided by this module.
Most applications may use one of these classes without modification or
subclassing.
- function of module formatter: NullFormatter ([WRITER` = None'])
A formatter which does nothing. If `writer' is omitted, a
`NullWriter' instance is created. No methods of the writer are
called by `NullFormatter' instances. Implementations should inherit
from this class if they implement the formatter interface but do not
need to inherit any implementation.
- function of module formatter: AbstractFormatter (WRITER)
The standard formatter. This implementation has demonstrated wide
applicability to many writers, and may be used directly in most
circumstances. It has been used to implement a full-featured
world-wide web browser.
File: pylibi, Node: The Writer Interface, Next: Writer Implementations, Prev: Formatter Implementations, Up: formatter
The Writer Interface
--------------------
Interfaces to create writers are dependent on the specific writer class
being instantiated. The interfaces described below are the required
interfaces which all writers must support once initialized. Note that
while most applications can use the `AbstractFormatter' class as a
formatter, the writer must typically be provided by the application.
- Method on writer object: new_alignment (ALIGN)
Set the alignment style. The `align' value can be any object, but
by convention is a string or `None', where `None' indicates that
the writer's "preferred" alignment should be used. Conventional
`align' values are `'left'', `'center'', `'right'', and
`'justify''.
- Method on writer object: new_font (FONT)
Set the font style. The value of `font' will be `None',
indicating that the device's default font should be used, or a
tuple of the form (SIZE, ITALIC, BOLD, TELETYPE). Size will be a
string indicating the size of font that should be used; specific
strings and their interpretation must be defined by the
application. The ITALIC, BOLD, and TELETYPE values are boolean
indicators specifying which of those font attributes should be
used.
- Method on writer object: new_margin (MARGIN, LEVEL)
Set the margin level to the integer `level' and the logical tag to
`margin'. Interpretation of the logical tag is at the writer's
discretion; the only restriction on the value of the logical tag
is that it not be a false value for non-zero values of `level'.
- Method on writer object: new_spacing (SPACING)
Set the spacing style to `spacing'.
- Method on writer object: new_styles (STYLES)
Set additional styles. The `styles' value is a tuple of arbitrary
values; the value `AS_IS' should be ignored. The `styles' tuple
may be interpreted either as a set or as a stack depending on the
requirements of the application and writer implementation.
- Method on writer object: send_line_break ()
Break the current line.
- Method on writer object: send_paragraph (BLANKLINE)
Produce a paragraph separation of at least `blankline' blank
lines, or the equivalent. The `blankline' value will be an
integer.
- Method on writer object: send_hor_rule (*ARGS, **KW)
Display a horizontal rule on the output device. The arguments to
this method are entirely application- and writer-specific, and
should be interpreted with care. The method implementation may
assume that a line break has already been issued via
`send_line_break()'.
- Method on writer object: send_flowing_data (DATA)
Output character data which may be word-wrapped and re-flowed as
needed. Within any sequence of calls to this method, the writer
may assume that spans of multiple whitespace characters have been
collapsed to single space characters.
- Method on writer object: send_literal_data (DATA)
Output character data which has already been formatted for
display. Generally, this should be interpreted to mean that line
breaks indicated by newline characters should be preserved and no
new line breaks should be introduced. The data may contain
embedded newline and tab characters, unlike data provided to the
`send_flowing_data()' interface.
- Method on writer object: send_label_data (DATA)
Set `data' to the left of the current left margin, if possible.
The value of `data' is not restricted; treatment of non-string
values is entirely application- and writer-dependent. This method
will only be called at the beginning of a line.
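The division of labour between a formatter's `add_flowing_data()' and a
writer's `send_flowing_data()' can be sketched as follows. This is a
simplified illustration, not the `AbstractFormatter' implementation:
`RecordingWriter' and `CollapsingFormatter' are hypothetical names, and
only whitespace collapsing across successive calls is modelled.

```python
class RecordingWriter:
    """Hypothetical writer that stores what it is sent."""

    def __init__(self):
        self.data = []

    def send_flowing_data(self, data):
        self.data.append(data)


class CollapsingFormatter:
    """Hypothetical formatter that collapses whitespace across calls."""

    def __init__(self, writer):
        self.writer = writer
        self.softspace = False  # whitespace pending from an earlier call
        self.at_start = True    # nothing sent on the current line yet

    def add_flowing_data(self, data):
        words = data.split()
        if not words:
            # Whitespace only: remember it unless at the start of a line.
            if data and not self.at_start:
                self.softspace = True
            return
        text = " ".join(words)
        if (data[0].isspace() or self.softspace) and not self.at_start:
            text = " " + text
        self.softspace = data[-1].isspace()
        self.at_start = False
        self.writer.send_flowing_data(text)

    def flush_softspace(self):
        # Hand any pending whitespace to the writer as a single space.
        if self.softspace:
            self.softspace = False
            self.writer.send_flowing_data(" ")
```

The writer only ever sees single spaces between words, which is the
assumption `send_flowing_data()' is allowed to make.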
File: pylibi, Node: Writer Implementations, Prev: The Writer Interface, Up: formatter
Writer Implementations
----------------------
Three implementations of the writer object interface are provided as
examples by this module. Most applications will need to derive new
writer classes from the `NullWriter' class.
- function of module formatter: NullWriter ()
A writer which only provides the interface definition; no actions
are taken on any methods. This should be the base class for all
writers which do not need to inherit any implementation methods.
- function of module formatter: AbstractWriter ()
A writer which can be used in debugging formatters, but not much
else. Each method simply announces itself by printing its name and
arguments on standard output.
- function of module formatter: DumbWriter ([FILE` = None'[, MAXCOL` =
72']])
Simple writer class which writes output on the file object passed
in as `file' or, if `file' is omitted, on standard output. The
output is simply word-wrapped to the number of columns specified by
`maxcol'. This class is suitable for reflowing a sequence of
paragraphs.
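The word-wrapping behaviour described for `DumbWriter' can be sketched
as follows. This is not the module's class: `WrappingWriter' is a
hypothetical name, only two writer methods are shown, and each call to
`send_flowing_data()' is assumed to contain whole words.

```python
import io
import sys


class WrappingWriter:
    """Hypothetical DumbWriter-like class wrapping text at maxcol."""

    def __init__(self, file=None, maxcol=72):
        self.file = file if file is not None else sys.stdout
        self.maxcol = maxcol
        self.col = 0  # current output column

    def send_line_break(self):
        self.file.write("\n")
        self.col = 0

    def send_flowing_data(self, data):
        for word in data.split():
            # Break the line if the word plus a separating space
            # would run past the last permitted column.
            if self.col > 0 and self.col + 1 + len(word) > self.maxcol:
                self.send_line_break()
            if self.col > 0:
                self.file.write(" ")
                self.col += 1
            self.file.write(word)
            self.col += len(word)


out = io.StringIO()
writer = WrappingWriter(out, maxcol=10)
writer.send_flowing_data("one two three four")
writer.send_line_break()
```

With `maxcol' set to 10, the text above is written as the two lines
`one two' and `three four'.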
File: pylibi, Node: rfc822, Next: mimetools, Prev: formatter, Up: Internet and WWW
Standard Module `rfc822'
========================
This module defines a class, `Message', which represents a collection
of "email headers" as defined by the Internet standard RFC 822. It is
used in various contexts, usually to read such headers from a file.
A `Message' instance is instantiated with an open file object as
parameter. Instantiation reads headers from the file up to a blank
line and stores them in the instance; after instantiation, the file is
positioned directly after the blank line that terminates the headers.
Input lines as read from the file may either be terminated by CR-LF or
by a single linefeed; a terminating CR-LF is replaced by a single
linefeed before the line is stored.
All header matching is done independent of upper or lower case; e.g.
`m['From']', `m['from']' and `m['FROM']' all yield the same result.
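The reading and matching rules above can be sketched as follows. This
is an illustration in present-day Python, not the `Message' class:
`read_headers' and `get_header' are hypothetical helper names.

```python
import io


def read_headers(fp):
    """Read header lines from fp up to, and consuming, the blank line."""
    lines = []
    while True:
        line = fp.readline()
        if line.endswith("\r\n"):
            line = line[:-2] + "\n"   # replace CR-LF by a single linefeed
        if line in ("", "\n"):        # end of file or terminating blank line
            break
        lines.append(line)
    return lines


def get_header(lines, name):
    """Return the first header called `name`, ignoring case, or None."""
    prefix = name.lower() + ":"
    for i, line in enumerate(lines):
        if line.lower().startswith(prefix):
            value = line[len(prefix):]
            # Continuation lines begin with whitespace.
            for cont in lines[i + 1:]:
                if not cont[:1].isspace():
                    break
                value += cont
            return value.strip()
    return None


fp = io.StringIO("From: jack@cwi.nl (Jack Jansen)\r\n"
                 "Subject: a long\n  subject\n"
                 "\n"
                 "body\n")
headers = read_headers(fp)
```

After the call, `fp' is positioned at the start of the body, and
`get_header(headers, 'FROM')' and `get_header(headers, 'from')' return
the same string.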
* Menu:
* Message Objects::
File: pylibi, Node: Message Objects, Prev: rfc822, Up: rfc822
Message Objects
---------------
A `Message' instance has the following methods:
- function of module rfc822: rewindbody ()
Seek to the start of the message body. This only works if the file
object is seekable.
- function of module rfc822: getallmatchingheaders (NAME)
Return a list of lines consisting of all headers matching NAME, if
any. Each physical line, whether it is a continuation line or
not, is a separate list item. Return the empty list if no header
matches NAME.
- function of module rfc822: getfirstmatchingheader (NAME)
Return a list of lines comprising the first header matching NAME,
and its continuation line(s), if any. Return `None' if there is
no header matching NAME.
- function of module rfc822: getrawheader (NAME)
Return a single string consisting of the text after the colon in
the first header matching NAME. This includes leading whitespace,
the trailing linefeed, and internal linefeeds and whitespace if
any continuation line(s) were present. Return `None' if
there is no header matching NAME.
- function of module rfc822: getheader (NAME)
Like `getrawheader(NAME)', but strip leading and trailing
whitespace (but not internal whitespace).
- function of module rfc822: getaddr (NAME)
Return a pair (full name, email address) parsed from the string
returned by `getheader(NAME)'. If no header matching NAME exists,
return `(None, None)'; otherwise both the full name and the address
are (possibly empty) strings.
Example: If `m''s first `From' header contains the string
`'jack@cwi.nl (Jack Jansen)'', then `m.getaddr('From')' will yield
the pair `('Jack Jansen', 'jack@cwi.nl')'. If the header contained
`'Jack Jansen <jack@cwi.nl>'' instead, it would yield the exact
same result.
- function of module rfc822: getaddrlist (NAME)
This is similar to `getaddr(NAME)', but parses a header containing
a list of email addresses (e.g. a `To' header) and returns a list
of (full name, email address) pairs (even if there was only one
address in the header). If there is no header matching NAME,
return an empty list.
XXX The current version of this function is not really correct. It
yields bogus results if a full name contains a comma.
- function of module rfc822: getdate (NAME)
Retrieve a header using `getheader' and parse it into a 9-tuple
compatible with `time.mktime()'. If there is no header matching
NAME, or it is unparsable, return `None'.
Date parsing appears to be a black art, and not all mailers adhere
to the standard. While it has been tested and found correct on a
large collection of email from many sources, it is still possible
that this function may occasionally yield an incorrect result.
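The two address forms used in the `getaddr()' example above can be
handled by a small sketch like the following. It is deliberately
naive and is not the parser used by the module: `parse_addr' is a
hypothetical name, and only the two forms from the example are
recognized.

```python
import re


def parse_addr(value):
    """Split a header value into a (full name, email address) pair."""
    value = value.strip()
    # Form 1: 'Full Name <user@host>'
    m = re.match(r'^(.*?)\s*<([^<>]*)>$', value)
    if m:
        return m.group(1).strip('" '), m.group(2)
    # Form 2: 'user@host (Full Name)'
    m = re.match(r'^(\S+)\s*\((.*)\)$', value)
    if m:
        return m.group(2), m.group(1)
    # Fallback: treat the whole value as a bare address.
    return '', value
```

Both `'jack@cwi.nl (Jack Jansen)'' and `'Jack Jansen <jack@cwi.nl>''
yield the pair `('Jack Jansen', 'jack@cwi.nl')' with this sketch.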
`Message' instances also support a read-only mapping interface. In
particular: `m[name]' is the same as `m.getheader(name)'; and `len(m)',
`m.has_key(name)', `m.keys()', `m.values()' and `m.items()' act as
expected (and consistently).
Finally, `Message' instances have two public instance variables:
- data of module rfc822: headers
A list containing the entire set of header lines, in the order in
which they were read. Each line contains a trailing newline. The
blank line terminating the headers is not contained in the list.
- data of module rfc822: fp
The file object passed at instantiation time.
File: pylibi, Node: mimetools, Next: binhex, Prev: rfc822, Up: Internet and WWW
Standard Module `mimetools'
===========================
This module defines a subclass of the class `rfc822.Message' and a
number of utility functions that are useful for the manipulation of
MIME-style multipart or encoded messages.
It defines the following items:
- function of module mimetools: Message (FP)
Return a new instance of the `mimetools.Message' class. This is a
subclass of the `rfc822.Message' class, with some additional
methods (see below).
- function of module mimetools: choose_boundary ()
Return a unique string that has a high likelihood of being usable
as a part boundary. The string has the form
`"HOSTIPADDR.UID.PID.TIMESTAMP.RANDOM"'.
- function of module mimetools: decode (INPUT, OUTPUT, ENCODING)
Read data encoded using the allowed MIME ENCODING from open file
object INPUT and write the decoded data to open file object
OUTPUT. Valid values for ENCODING include `"base64"',
`"quoted-printable"' and `"uuencode"'.
- function of module mimetools: encode (INPUT, OUTPUT, ENCODING)
Read data from open file object INPUT and write it encoded using
the allowed MIME ENCODING to open file object OUTPUT. Valid
values for ENCODING are the same as for `decode()'.
- function of module mimetools: copyliteral (INPUT, OUTPUT)
Read lines until EOF from open file INPUT and write them to open
file OUTPUT.
- function of module mimetools: copybinary (INPUT, OUTPUT)
Read blocks until EOF from open file INPUT and write them to open
file OUTPUT. The block size is currently fixed at 8192.
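The calling convention of `decode()' and `encode()' can be sketched
as a small dispatcher over open file objects. This is an illustration
only, written for present-day Python: it relies on the standard
`base64' and `quopri' modules, which this section does not mention, it
expects binary file objects, and it leaves out `"uuencode"'.

```python
import base64
import io
import quopri


def decode(input, output, encoding):
    """Decode from file object `input` into file object `output`."""
    if encoding == "base64":
        base64.decode(input, output)
    elif encoding == "quoted-printable":
        quopri.decode(input, output)
    else:
        raise ValueError("encoding not handled by this sketch: %r"
                         % (encoding,))


def encode(input, output, encoding):
    """Encode from file object `input` into file object `output`."""
    if encoding == "base64":
        base64.encode(input, output)
    elif encoding == "quoted-printable":
        quopri.encode(input, output, quotetabs=False)
    else:
        raise ValueError("encoding not handled by this sketch: %r"
                         % (encoding,))


def round_trip(raw, encoding):
    """Encode and then decode `raw`, returning the decoded bytes."""
    encoded = io.BytesIO()
    encode(io.BytesIO(raw), encoded, encoding)
    decoded = io.BytesIO()
    decode(io.BytesIO(encoded.getvalue()), decoded, encoding)
    return decoded.getvalue()
```

Encoding and then decoding with the same ENCODING value should give
back the original data.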
* Menu:
* mimetools.Message Methods::
File: pylibi, Node: mimetools.Message Methods, Prev: mimetools, Up: mimetools
Additional Methods of Message objects
-------------------------------------
The `mimetools.Message' class defines the following methods in addition
to the `rfc822.Message' class:
- Method on mimetools.Message: getplist ()
Return the parameter list of the `Content-type' header. This is a
list of strings. For parameters of the form `KEY=VALUE', KEY is
converted to lower case but VALUE is not. For example, if the
message contains the header `Content-type: text/html; spam=1;
Spam=2; Spam' then `getplist()' will return the Python list
`['spam=1', 'spam=2', 'Spam']'.
- Method on mimetools.Message: getparam (NAME)
Return the VALUE of the first parameter (as returned by
`getplist()') of the form `NAME=VALUE' for the given NAME. If
VALUE is surrounded by quotes of the form <...> or "...", these
are removed.
- Method on mimetools.Message: getencoding ()
Return the encoding specified in the `Content-transfer-encoding'
message header. If no such header exists, return `"7bit"'. The
encoding is converted to lower case.
- Method on mimetools.Message: gettype ()
Return the message type (of the form `TYPE/SUBTYPE') as
specified in the `Content-type' header. If no such header exists,
return `"text/plain"'. The type is converted to lower case.
- Method on mimetools.Message: getmaintype ()
Return the main type as specified in the `Content-type' header.
If no such header exists, return `"text"'. The main type is
converted to lower case.
- Method on mimetools.Message: getsubtype ()
Return the subtype as specified in the `Content-type' header. If
no such header exists, return `"plain"'. The subtype is converted
to lower case.
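The `Content-type' handling described above can be sketched as
follows. This is an illustration, not the `mimetools.Message' code:
`parse_content_type' and `get_param' are hypothetical names, and the
naive split on `;' would mishandle a semicolon inside a quoted value.

```python
def parse_content_type(value):
    """Return (type, parameter list) for a Content-type header value."""
    fields = [field.strip() for field in value.split(";")]
    fulltype = fields[0].lower() or "text/plain"
    plist = []
    for field in fields[1:]:
        if not field:
            continue
        if "=" in field:
            # KEY is converted to lower case, VALUE is left alone.
            key, val = field.split("=", 1)
            field = key.strip().lower() + "=" + val.strip()
        plist.append(field)
    return fulltype, plist


def get_param(plist, name):
    """Return the value of the first NAME=VALUE parameter, or None."""
    prefix = name.lower() + "="
    for param in plist:
        if param.startswith(prefix):
            val = param[len(prefix):]
            # Strip surrounding "..." or <...> quotes.
            if len(val) >= 2 and (val[0], val[-1]) in (('"', '"'),
                                                       ('<', '>')):
                val = val[1:-1]
            return val
    return None


fulltype, plist = parse_content_type("text/html; spam=1; Spam=2; Spam")
maintype, subtype = fulltype.split("/")
```

For the header used in the `getplist()' example, `plist' is
`['spam=1', 'spam=2', 'Spam']' and `fulltype' is `text/html'.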
File: pylibi, Node: binhex, Next: uu, Prev: mimetools, Up: Internet and WWW
Standard module `binhex'
========================
This module encodes and decodes files in binhex4 format, a format
allowing representation of Macintosh files in ASCII. On the Macintosh,
both forks of a file and the finder information are encoded (or
decoded); on other platforms only the data fork is handled.
The `binhex' module defines the following functions:
- function of module binhex: binhex (INPUT, OUTPUT)
Convert a binary file with filename INPUT to binhex file OUTPUT.
The OUTPUT parameter can either be a filename or a file-like
object (any object supporting a WRITE and CLOSE method).
- function of module binhex: hexbin (INPUT[, OUTPUT])
Decode a binhex file INPUT. INPUT may be a filename or a file-like
object supporting READ and CLOSE methods. The resulting file is
written to a file named OUTPUT, unless the argument is empty in
which case the output filename is read from the binhex file.
* Menu:
* notes::
File: pylibi, Node: notes, Prev: binhex, Up: binhex
notes
-----
There is an alternative, more powerful interface to the coder and
decoder, see the source for details.
If you code or decode text files on non-Macintosh platforms they will
still use the Macintosh newline convention (carriage-return as end of
line).
As of this writing, `hexbin()' appears not to work in all cases.
File: pylibi, Node: uu, Next: binascii, Prev: binhex, Up: Internet and WWW
Standard module `uu'
====================
This module encodes and decodes files in uuencode format, allowing
arbitrary binary data to be transferred over ASCII-only connections.
Wherever a file argument is expected, the methods accept either a
pathname (`'-'' for stdin/stdout) or a file-like object.
Normally you would pass filenames, but there is one case where you have
to open the file yourself: if you are on a non-Unix platform and your
binary file is actually a text file that you want encoded in a
Unix-compatible way, you must open the file yourself as a text file
so that newline conversion is performed.
This code was contributed by Lance Ellinghouse, and modified by Jack
Jansen.
The `uu' module defines the following functions:
- function of module uu: encode (IN_FILE, OUT_FILE[, NAME, MODE])
Uuencode file IN_FILE into file OUT_FILE. The uuencoded file will
have the header specifying NAME and MODE as the defaults for the
results of decoding the file. The default defaults are taken from
IN_FILE, or `'-'' and `0666' respectively.
- function of module uu: decode (IN_FILE[, OUT_FILE, MODE])
This call decodes uuencoded file IN_FILE placing the result on
file OUT_FILE. If OUT_FILE is a pathname the MODE is also set.
Defaults for OUT_FILE and MODE are taken from the uuencode header.
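The shape of a uuencoded file (a `begin' header carrying MODE and NAME,
encoded lines, and an `end' trailer) can be sketched with the
`binascii' line converters. This is an illustration, not the `uu'
module itself: `uu_encode' and `uu_decode' are hypothetical names, they
accept only binary file objects, and `uu_decode' returns the header
values instead of creating a file.

```python
import binascii
import io


def uu_encode(in_file, out_file, name="-", mode=0o666):
    """Write in_file to out_file in uuencode format."""
    header = "begin %o %s\n" % (mode & 0o777, name)
    out_file.write(header.encode("ascii"))
    while True:
        chunk = in_file.read(45)   # 45 binary bytes per encoded line
        if not chunk:
            break
        out_file.write(binascii.b2a_uu(chunk))
    out_file.write(b" \nend\n")    # zero-length line, then the trailer


def uu_decode(in_file, out_file):
    """Decode in_file into out_file; return (name, mode) from the header."""
    header = in_file.readline().split(b" ", 2)
    if len(header) != 3 or header[0] != b"begin":
        raise ValueError("no valid begin line")
    mode = int(header[1], 8)
    name = header[2].strip().decode("ascii")
    for line in iter(in_file.readline, b""):
        if line.strip() == b"end":
            break
        out_file.write(binascii.a2b_uu(line))
    return name, mode


data = bytes(range(100))
encoded = io.BytesIO()
uu_encode(io.BytesIO(data), encoded, "demo.bin", 0o644)
decoded = io.BytesIO()
name, mode = uu_decode(io.BytesIO(encoded.getvalue()), decoded)
```

The header values recovered by the decoder are the ones that serve as
defaults for OUT_FILE and MODE in `decode()'.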
File: pylibi, Node: binascii, Next: xdrlib, Prev: uu, Up: Internet and WWW
Built-in Module `binascii'
==========================
The `binascii' module contains a number of functions to convert between
binary and various ASCII-encoded binary representations. Normally, you
will not use these functions directly but use wrapper modules like `uu'
or `binhex' instead; this module exists solely because bit manipulation
of large amounts of data is slow in Python.
The `binascii' module defines the following functions:
- function of module binascii: a2b_uu (STRING)
Convert a single line of uuencoded data back to binary and return
the binary data. Lines normally contain 45 (binary) bytes, except
for the last line. Line data may be followed by whitespace.
- function of module binascii: b2a_uu (DATA)
Convert binary data to a line of ASCII characters. The return
value is the converted line, including a newline char. The length
of DATA should be at most 45.
- function of module binascii: a2b_base64 (STRING)
Convert a block of base64 data back to binary and return the
binary data. More than one line may be passed at a time.
- function of module binascii: b2a_base64 (DATA)
Convert binary data to a line of ascii characters in base64 coding.
The return value is the converted line, including a newline char.
The length of DATA should be at most 57 to adhere to the base64
standard.
- function of module binascii: a2b_hqx (STRING)
Convert binhex4 formatted ascii data to binary, without doing
rle-decompression. The string should contain a complete number of
binary bytes, or (in case of the last portion of the binhex4 data)
have the remaining bits zero.
- function of module binascii: rledecode_hqx (DATA)
Perform RLE-decompression on the data, as per the binhex4
standard. The algorithm uses `0x90' after a byte as a repeat
indicator, followed by a count. A count of `0' specifies a byte
value of `0x90'. The routine returns the decompressed data, unless
the input data ends in an orphaned repeat indicator, in which
case the `Incomplete' exception is raised.
- function of module binascii: rlecode_hqx (DATA)
Perform binhex4 style RLE-compression on DATA and return the
result.
- function of module binascii: b2a_hqx (DATA)
Perform binhex4 binary-to-ascii translation and return the
resulting string. The argument should already be rle-coded, and
have a length divisible by 3 (except possibly the last fragment).
- function of module binascii: crc_hqx (DATA, CRC)
Compute the binhex4 crc value of DATA, starting with an initial
CRC and returning the result.
- exception of module binascii: Error
Exception raised on errors. These are usually programming errors.
- exception of module binascii: Incomplete
Exception raised on incomplete data. These are usually not
programming errors, but handled by reading a little more data and
trying again.
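The run-length scheme described for `rledecode_hqx()' above can be
sketched in pure Python as follows. This is an illustration of the
rule, not the built-in implementation: `rle_decode' is a hypothetical
name, the `Incomplete' class is a local stand-in for the module's
exception, and the count is assumed to be the total run length, so the
byte already emitted is repeated a further count minus one times.

```python
RUNCHAR = 0x90  # repeat indicator


class Incomplete(Exception):
    """Local stand-in for the module's Incomplete exception."""


def rle_decode(data):
    """Expand binhex4-style run-length encoded bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        if byte != RUNCHAR:
            out.append(byte)
            i += 1
            continue
        if i + 1 >= len(data):
            # The repeat indicator has no count byte yet.
            raise Incomplete("orphaned repeat indicator at end of data")
        count = data[i + 1]
        if count == 0:
            out.append(RUNCHAR)            # a literal 0x90 byte
        elif not out:
            raise ValueError("repeat indicator with no preceding byte")
        else:
            # Assumption: count is the total run length.
            out.extend(out[-1:] * (count - 1))
        i += 2
    return bytes(out)
```

A caller that catches `Incomplete' can read a little more input and
call the function again, as the description of the exception suggests.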
File: pylibi, Node: xdrlib, Prev: binascii, Up: Internet and WWW
Standard module `xdrlib'
========================
The `xdrlib' module supports the External Data Representation Standard
as described in RFC 1014, written by Sun Microsystems, Inc. June 1987.
It supports most of the data types described in the RFC, although some,
most notably `float' and `double', are only supported on those operating
systems that provide an XDR library.
The `xdrlib' module defines two classes, one for packing variables into
XDR representation, and another for unpacking from XDR representation.
There are also two exception classes.
* Menu:
* Packer Objects::
* Unpacker Objects::
* Exceptions::
* Supporting Floating Point Data::
File: pylibi, Node: Packer Objects, Next: Unpacker Objects, Prev: xdrlib, Up: xdrlib
Packer Objects
--------------
`Packer' is the class for packing data into XDR representation. The
`Packer' class is instantiated with no arguments.
- function of module xdrlib: get_buffer ()
Returns the current pack buffer as a string.
- function of module xdrlib: reset ()
Resets the pack buffer to the empty string.
In general, you can pack any of the most common XDR data types by
calling the appropriate `pack_TYPE' method. Each method takes a single
argument, the value to pack. The following simple data type packing
methods are supported: `pack_uint', `pack_int', `pack_enum',
`pack_bool', `pack_uhyper', and `pack_hyper'.
The following methods pack floating point numbers; however, they require
C library support. Without the optional C built-in module, both of
these methods will raise an `xdrlib.ConversionError' exception. See
the note at the end of this chapter for details.
- function of module xdrlib: pack_float (VALUE)
Packs the single-precision floating point number VALUE.
- function of module xdrlib: pack_double (VALUE)
Packs the double-precision floating point number VALUE.
The following methods support packing strings, bytes, and opaque data:
- function of module xdrlib: pack_fstring (N, S)
Packs a fixed length string, S. N is the length of the string but
it is *not* packed into the data buffer. The string is padded
with null bytes if necessary to guarantee 4-byte alignment.
- function of module xdrlib: pack_fopaque (N, DATA)
Packs a fixed length opaque data stream, similarly to
`pack_fstring'.
- function of module xdrlib: pack_string (S)
Packs a variable length string, S. The length of the string is
first packed as an unsigned integer, then the string data is packed
with `pack_fstring'.
- function of module xdrlib: pack_opaque (DATA)
Packs a variable length opaque data string, similarly to
`pack_string'.
- function of module xdrlib: pack_bytes (BYTES)
Packs a variable length byte stream, similarly to `pack_string'.
The following methods support packing arrays and lists:
- function of module xdrlib: pack_list (LIST, PACK_ITEM)
Packs a LIST of homogeneous items. This method is useful for
lists with an indeterminate size; i.e. the size is not available
until the entire list has been walked. For each item in the list,
an unsigned integer `1' is packed first, followed by the data value
from the list. PACK_ITEM is the function that is called to pack
the individual item. At the end of the list, an unsigned integer
`0' is packed.
- function of module xdrlib: pack_farray (N, ARRAY, PACK_ITEM)
Packs a fixed length list (ARRAY) of homogeneous items. N is the
length of the list; it is *not* packed into the buffer, but a
`ValueError' exception is raised if `len(array)' is not equal to
N. As above, PACK_ITEM is the function used to pack each element.
- function of module xdrlib: pack_array (LIST, PACK_ITEM)
Packs a variable length LIST of homogeneous items. First, the
length of the list is packed as an unsigned integer, then each
element is packed as in `pack_farray' above.
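The difference between the three layouts can be sketched as follows. The helper names are hypothetical and the code only mirrors the description above, using the `struct' module instead of a `Packer' object. Each PACK_ITEM function is assumed to return the packed bytes for one item.

```python
import struct

def xdr_uint(value):
    """Pack an unsigned integer as 4 big-endian bytes."""
    return struct.pack(">L", value)

def xdr_list(items, pack_item):
    """Each item is preceded by the flag 1; the flag 0 ends the list."""
    out = b""
    for item in items:
        out += xdr_uint(1) + pack_item(item)
    return out + xdr_uint(0)

def xdr_farray(n, items, pack_item):
    """Fixed length: n is checked but not written to the output."""
    if len(items) != n:
        raise ValueError("wrong array size")
    return b"".join([pack_item(item) for item in items])

def xdr_array(items, pack_item):
    """Variable length: the item count is written first."""
    return xdr_uint(len(items)) + xdr_farray(len(items), items, pack_item)
```

For two unsigned integers, `xdr_list' produces 20 bytes (two flags, two items and a terminator), while `xdr_array' produces 12 bytes (a count and two items).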
File: pylibi, Node: Unpacker Objects, Next: Exceptions, Prev: Packer Objects, Up: xdrlib
Unpacker Objects
----------------
`Unpacker' is the complementary class which unpacks XDR data values
from a string buffer, and has the following methods:
- function of module xdrlib: __init__ (DATA)
Instantiates an `Unpacker' object with the string buffer DATA.
- function of module xdrlib: reset (DATA)
Resets the string buffer with the given DATA.
- function of module xdrlib: get_position ()
Returns the current unpack position in the data buffer.
- function of module xdrlib: set_position (POSITION)
Sets the data buffer unpack position to POSITION. You should be
careful about using `get_position()' and `set_position()'.
- function of module xdrlib: done ()
Indicates unpack completion. Raises an `xdrlib.Error' exception
if not all of the data has been unpacked.
In addition, every data type that can be packed with a `Packer' can be
unpacked with an `Unpacker'. Unpacking methods are of the form
`unpack_TYPE', and take no arguments. They return the unpacked object.
The same caveats apply for `unpack_float' and `unpack_double' as above.
- function of module xdrlib: unpack_float ()
Unpacks a single-precision floating point number.
- function of module xdrlib: unpack_double ()
Unpacks a double-precision floating point number, similarly to
`unpack_float'.
In addition, the following methods unpack strings, bytes, and opaque
data:
- function of module xdrlib: unpack_fstring (N)
Unpacks and returns a fixed length string. N is the number of
characters expected. Padding with null bytes to guarantee 4 byte
alignment is assumed.
- function of module xdrlib: unpack_fopaque (N)
Unpacks and returns a fixed length opaque data stream, similarly to
`unpack_fstring'.
- function of module xdrlib: unpack_string ()
Unpacks and returns a variable length string. The length of the
string is first unpacked as an unsigned integer, then the string
data is unpacked with `unpack_fstring'.
- function of module xdrlib: unpack_opaque ()
Unpacks and returns a variable length opaque data string,
similarly to `unpack_string'.
- function of module xdrlib: unpack_bytes ()
Unpacks and returns a variable length byte stream, similarly to
`unpack_string'.
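The reverse of the string layout can be sketched as below. `Cursor' and the `read_' functions are hypothetical names used only for illustration; they are not part of `xdrlib'. The sketch assumes 4 byte big-endian unsigned lengths and raises `EOFError' when the buffer is too short, which is an arbitrary choice.

```python
import struct

class Cursor:
    """Hypothetical read position over a byte string."""
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def take(self, count):
        """Return the next count bytes and advance the position."""
        chunk = self.data[self.pos:self.pos + count]
        if len(chunk) < count:
            raise EOFError("buffer exhausted")
        self.pos += count
        return chunk

def read_uint(cur):
    return struct.unpack(">L", cur.take(4))[0]

def read_fstring(cur, n):
    """Consume n bytes plus padding up to a multiple of 4; return the n bytes."""
    padded = (n + 3) // 4 * 4
    return cur.take(padded)[:n]

def read_string(cur):
    """Read the unsigned length first, then the padded data."""
    n = read_uint(cur)
    return read_fstring(cur, n)
```

Note that the position always advances in multiples of 4, even when the returned string is shorter.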
The following methods support unpacking arrays and lists:
- function of module xdrlib: unpack_list (UNPACK_ITEM)
Unpacks and returns a list of homogeneous items. The list is
unpacked one element at a time by first unpacking an unsigned
integer flag. If the flag is `1', then the item is unpacked and
appended to the list. A flag of `0' indicates the end of the
list. UNPACK_ITEM is the function that is called to unpack the
items.
- function of module xdrlib: unpack_farray (N, UNPACK_ITEM)
Unpacks and returns (as a list) a fixed length array of homogeneous
items. N is the number of list elements to expect in the buffer. As
above, UNPACK_ITEM is the function used to unpack each element.
- function of module xdrlib: unpack_array (UNPACK_ITEM)
Unpacks and returns a variable length list of homogeneous items.
First, the length of the list is unpacked as an unsigned integer,
then each element is unpacked as in `unpack_farray' above.
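The flag-driven and count-driven loops described above can be sketched as follows. The function names are hypothetical, and each reader is assumed to take the buffer and a byte offset and to return the value together with the new offset. Treating a flag other than `0' or `1' as an error is an assumption of this sketch.

```python
import struct

def read_uint(data, pos):
    """Read a 4 byte big-endian unsigned integer at pos."""
    (value,) = struct.unpack(">L", data[pos:pos + 4])
    return value, pos + 4

def read_list(data, pos, read_item):
    """Read flag/item pairs until a flag of 0 is seen."""
    items = []
    while True:
        flag, pos = read_uint(data, pos)
        if flag == 0:
            break
        if flag != 1:
            # Assumption: any other flag value is treated as corrupt data.
            raise ValueError("unexpected list flag %d" % flag)
        item, pos = read_item(data, pos)
        items.append(item)
    return items, pos

def read_farray(data, pos, n, read_item):
    """Read exactly n items; n is supplied by the caller."""
    items = []
    for _ in range(n):
        item, pos = read_item(data, pos)
        items.append(item)
    return items, pos

def read_array(data, pos, read_item):
    """Read the item count first, then that many items."""
    n, pos = read_uint(data, pos)
    return read_farray(data, pos, n, read_item)
```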
File: pylibi, Node: Exceptions, Next: Supporting Floating Point Data, Prev: Unpacker Objects, Up: xdrlib
Exceptions
----------
Exceptions in this module are coded as class instances:
- exception of module xdrlib: Error
The base exception class. `Error' has a single public data member
`msg' containing the description of the error.
- exception of module xdrlib: ConversionError
Class derived from `Error'. Contains no additional instance
variables.
Here is an example of how you would catch one of these exceptions:
import xdrlib
p = xdrlib.Packer()
try:
    p.pack_double(8.01)
except xdrlib.ConversionError, instance:
    print 'packing the double failed:', instance.msg
File: pylibi, Node: Supporting Floating Point Data, Prev: Exceptions, Up: xdrlib
Supporting Floating Point Data
------------------------------
Packing and unpacking floating point data, i.e. `Packer.pack_float',
`Packer.pack_double', `Unpacker.unpack_float', and
`Unpacker.unpack_double', is only supported with the helper built-in
`_xdr' module, which relies on your operating system having the
appropriate XDR library routines.
If you have built the Python interpreter with the `_xdr' module, or have
built the `_xdr' module as a shared library, `xdrlib' will use these to
pack and unpack floating point numbers. Otherwise, using these
routines will raise a `ConversionError' exception.
See the Python installation instructions for details on building the
`_xdr' module.
File: pylibi, Node: Restricted Execution, Next: Cryptographic Services, Prev: Internet and WWW, Up: Top
Restricted Execution
********************
In general, Python programs have complete access to the underlying
operating system through the various functions and classes. For example,
a Python program can open any file for reading and writing by using the
`open()' built-in function (provided the underlying OS gives you
permission!). This is exactly what you want for most applications.
There exists a class of applications for which this "openness" is
inappropriate. Take Grail: a web browser that accepts "applets",
snippets of Python code, from anywhere on the Internet for execution on
the local system. This can be used to improve the user interface of
forms, for instance. Since the originator of the code is unknown, it
is obvious that it cannot be trusted with the full resources of the
local machine.
*Restricted execution* is the basic framework in Python that allows for
the segregation of trusted and untrusted code. It is based on the
notion that trusted Python code (a *supervisor*) can create a "padded
cell" (or environment) with limited permissions, and run the untrusted
code within this cell. The untrusted code cannot break out of its
cell, and can only interact with sensitive system resources through
interfaces defined and managed by the trusted code. The term
"restricted execution" is favored over "safe-Python" since true safety
is hard to define, and is determined by the way the restricted
environment is created. Note that the restricted environments can be
nested, with inner cells creating subcells of lesser, but never
greater, privilege.
An interesting aspect of Python's restricted execution model is that
the interfaces presented to untrusted code usually have the same names
as those presented to trusted code. Therefore no special interfaces
need to be learned to write code designed to run in a restricted
environment. And because the exact nature of the padded cell is
determined by the supervisor, different restrictions can be imposed,
depending on the application. For example, it might be deemed "safe"
for untrusted code to read any file within a specified directory, but
never to write a file. In this case, the supervisor may redefine the
built-in `open()' function so that it raises an exception whenever the
MODE parameter is `'w''. It might also perform a `chroot()'-like
operation on the FILENAME parameter, such that root is always relative
to some safe "sandbox" area of the filesystem. In this case, the
untrusted code would still see a built-in `open()' function in its
environment, with the same calling interface. The semantics would be
identical too, with `IOError's being raised when the supervisor
determines that an unallowable parameter is being used.
The Python run-time determines whether a particular code block is
executing in restricted execution mode based on the identity of the
`__builtins__' object in its global variables: if this is (the
dictionary of) the standard `__builtin__' module, the code is deemed to
be unrestricted, else it is deemed to be restricted.
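The idea of a supervisor handing untrusted code its own `__builtins__' can be sketched as follows. This is not how `rexec' is implemented, and the names `make_restricted_open' and `run_untrusted' are hypothetical. The sketch is written for a current Python interpreter, which has no restricted mode. There, a substitute `__builtins__' only changes which built-in names the code can see, so it illustrates the interface but should not be relied upon as a security boundary.

```python
import builtins

def make_restricted_open(real_open=builtins.open):
    """Return an open() replacement that refuses every mode except reading."""
    def restricted_open(filename, mode='r', bufsize=-1):
        if mode not in ('r', 'rb'):
            raise IOError("restricted open(): mode %r not allowed" % (mode,))
        return real_open(filename, mode, bufsize)
    return restricted_open

def run_untrusted(source):
    """Execute source with a small, supervisor-chosen set of built-ins."""
    cell_builtins = {
        'open': make_restricted_open(),  # same name, narrower semantics
        'IOError': IOError,
        'len': len,
        'str': str,
    }
    cell_globals = {'__builtins__': cell_builtins, '__name__': '__main__'}
    exec(source, cell_globals)
    return cell_globals
```

Code run this way still calls `open()' under its usual name, but an attempt to open a file for writing raises `IOError', and built-in names that the supervisor did not supply are simply undefined.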
Python code executing in restricted mode faces a number of limitations
that are designed to prevent it from escaping from the padded cell.
For instance, the function object attribute `func_globals' and the
class and instance object attribute `__dict__' are unavailable.
Two modules provide the framework for setting up restricted execution
environments:
rexec
-- Basic restricted execution framework.
Bastion
-- Providing restricted access to objects.
* Menu:
* rexec::
* Bastion::
File: pylibi, Node: rexec, Next: Bastion, Prev: Restricted Execution, Up: Restricted Execution
Standard Module `rexec'
=======================
This module contains the `RExec' class, which supports `r_exec()',
`r_eval()', `r_execfile()', and `r_import()' methods, which are
restricted versions of the standard Python functions `exec()',
`eval()', `execfile()', and the `import' statement. Code executed in
this restricted environment will only have access to modules and
functions that are deemed safe; you can subclass `RExec' to add or
remove capabilities as desired.
*Note:* The `RExec' class can prevent code from performing unsafe
operations like reading or writing disk files, or using TCP/IP sockets.
However, it does not protect against code using extremely large
amounts of memory or CPU time.
- function of module rexec: RExec ([HOOKS[, VERBOSE]])
Returns an instance of the `RExec' class.
HOOKS is an instance of the `RHooks' class or a subclass of it.
If it is omitted or `None', the default `RHooks' class is
instantiated. Whenever the `rexec' module searches for a module
(even a built-in one) or reads a module's code, it doesn't
actually go out to the file system itself. Rather, it calls
methods of an RHooks instance that was passed to or created by its
constructor. (Actually, the RExec object doesn't make these
calls--they are made by a module loader object that's part of the
RExec object. This allows another level of flexibility, e.g.
using packages.)
By providing an alternate RHooks object, we can control the file
system accesses made to import a module, without changing the
actual algorithm that controls the order in which those accesses
are made. For instance, we could substitute an RHooks object that
passes all filesystem requests to a file server elsewhere, via
some RPC mechanism such as ILU. Grail's applet loader uses this
to support importing applets from a URL for a directory.
If VERBOSE is true, additional debugging output may be sent to
standard output.
The RExec class has the following class attributes, which are used by
the `__init__' method. Changing them on an existing instance won't
have any effect; instead, create a subclass of `RExec' and assign them
new values in the class definition. Instances of the new class will
then use those new values. All these attributes are tuples of strings.
- attribute of RExec object: nok_builtin_names
Contains the names of built-in functions which will *not* be
available to programs running in the restricted environment. The
value for `RExec' is `('open', 'reload', '__import__')'.
(This gives the exceptions, because by far the majority of
built-in functions are harmless. A subclass that wants to
override this variable should probably start with the value from
the base class and concatenate additional forbidden functions --
when new dangerous built-in functions are added to Python, they
will also be added to this module.)
- attribute of RExec object: ok_builtin_modules
Contains the names of built-in modules which can be safely
imported. The value for `RExec' is `('audioop', 'array',
'binascii', 'cmath', 'errno', 'imageop', 'marshal',
'math', 'md5', 'operator', 'parser', 'regex', 'rotor',
'select', 'strop', 'struct', 'time')'. A similar remark
about overriding this variable applies -- use the value from the
base class as a starting point.
- attribute of RExec object: ok_path
Contains the directories which will be searched when an `import'
is performed in the restricted environment. The value for `RExec'
is the same as `sys.path' (at the time the module is loaded) for
unrestricted code.
- attribute of RExec object: ok_posix_names
Contains the names of the functions in the `os' module which will
be available to programs running in the restricted environment.
The value for `RExec' is `('error', 'fstat', 'listdir',
'lstat', 'readlink', 'stat', 'times', 'uname',
'getpid', 'getppid', 'getcwd', 'getuid', 'getgid',
'geteuid', 'getegid')'.
- attribute of RExec object: ok_sys_names
Contains the names of the functions and variables in the `sys'
module which will be available to programs running in the
restricted environment. The value for `RExec' is `('ps1',
'ps2', 'copyright', 'version', 'platform', 'exit',
'maxint')'.
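The reason for subclassing rather than assigning to an instance can be shown with a small stand-in class. `Cell' and `NoEvalCell' are hypothetical and do not reproduce `RExec'. They only assume, as stated above, that the tuples are read once by `__init__'.

```python
class Cell:
    """Hypothetical stand-in for a class that reads policy tuples in __init__."""
    nok_builtin_names = ('open', 'reload', '__import__')

    def __init__(self):
        # The policy is copied once, at construction time.
        self._forbidden = frozenset(self.nok_builtin_names)

    def allows(self, name):
        return name not in self._forbidden

class NoEvalCell(Cell):
    # Start from the base class value and concatenate further names.
    nok_builtin_names = Cell.nok_builtin_names + ('eval',)
```

Assigning a new tuple to `nok_builtin_names' on an existing `Cell' instance leaves `allows()' unchanged, whereas every `NoEvalCell' instance picks up the extended tuple when it is created.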
RExec instances support the following methods:
- Method on RExec object: r_eval (CODE)
CODE must either be a string containing a Python expression, or a
compiled code object, which will be evaluated in the restricted
environment's `__main__' module. The value of the expression or
code object will be returned.
- Method on RExec object: r_exec (CODE)
CODE must either be a string containing one or more lines of
Python code, or a compiled code object, which will be executed in
the restricted environment's `__main__' module.
- Method on RExec object: r_execfile (FILENAME)
Execute the Python code contained in the file FILENAME in the
restricted environment's `__main__' module.
Methods whose names begin with `s_' are similar to the functions
beginning with `r_', but the code will be granted access to restricted
versions of the standard I/O streams `sys.stdin', `sys.stderr', and
`sys.stdout'.
- Method on RExec object: s_eval (CODE)
CODE must be a string containing a Python expression, which will
be evaluated in the restricted environment.
- Method on RExec object: s_exec (CODE)
CODE must be a string containing one or more lines of Python code,
which will be executed in the restricted environment.
- Method on RExec object: s_execfile (FILENAME)
Execute the Python code contained in the file FILENAME in the
restricted environment.
`RExec' objects must also support various methods which will be
implicitly called by code executing in the restricted environment.
Overriding these methods in a subclass is used to change the policies
enforced by a restricted environment.
- Method on RExec object: r_import (MODULENAME[, GLOBALS, LOCALS,
FROMLIST])
Import the module MODULENAME, raising an `ImportError' exception
if the module is considered unsafe.
- Method on RExec object: r_open (FILENAME[, MODE[, BUFSIZE]])
Method called when `open()' is called in the restricted
environment. The arguments are identical to those of `open()',
and a file object (or a class instance compatible with file
objects) should be returned. `RExec''s default behaviour is to allow
opening any file for reading, but to forbid any attempt to write
a file. See the example below for an implementation of a less
restrictive `r_open()'.
- Method on RExec object: r_reload (MODULE)
Reload the module object MODULE, re-parsing and re-initializing it.
- Method on RExec object: r_unload (MODULE)
Unload the module object MODULE (i.e., remove it from the
restricted environment's `sys.modules' dictionary).
And their equivalents with access to restricted standard I/O streams:
- Method on RExec object: s_import (MODULENAME[, GLOBALS, LOCALS,
FROMLIST])
Import the module MODULENAME, raising an `ImportError' exception
if the module is considered unsafe.
- Method on RExec object: s_reload (MODULE)
Reload the module object MODULE, re-parsing and re-initializing it.
- Method on RExec object: s_unload (MODULE)
Unload the module object MODULE.
* Menu:
* An example::
File: pylibi, Node: An example, Prev: rexec, Up: rexec
An example
----------
Let us say that we want a slightly more relaxed policy than the
standard RExec class. For example, if we're willing to allow files in
`/tmp' to be written, we can subclass the `RExec' class:
import rexec, string

class TmpWriterRExec(rexec.RExec):
    def r_open(self, file, mode='r', buf=-1):
        if mode in ('r', 'rb'):
            pass
        elif mode in ('w', 'wb', 'a', 'ab'):
            # check filename : must begin with /tmp/
            if file[:5]!='/tmp/':
                raise IOError, "can't write outside /tmp"
            elif (string.find(file, '/../') >= 0 or
                  file[:3] == '../' or file[-3:] == '/..'):
                raise IOError, "'..' in filename forbidden"
        else: raise IOError, "Illegal open() mode"
        return open(file, mode, buf)
Notice that the above code will occasionally forbid a perfectly valid
filename; for example, code in the restricted environment won't be able
to open a file called `/tmp/foo/../bar'. To fix this, the `r_open'
method would have to simplify the filename to `/tmp/bar', which would
require splitting apart the filename and performing various operations
on it. In cases where security is at stake, it may be preferable to
write simple code which is sometimes overly restrictive, instead of
more general code that is also more complex and may harbor a subtle
security hole.
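For comparison, the filename simplification mentioned above might be sketched with `posixpath.normpath'. The function name `writable_tmp_path' is hypothetical. The sketch works purely on the text of the filename, so it does not detect symbolic links inside `/tmp' that point elsewhere. This is the kind of subtle hole the preceding paragraph warns about.

```python
import posixpath

def writable_tmp_path(filename):
    """Collapse '.' and '..' components, then require a /tmp/ prefix."""
    normalized = posixpath.normpath(filename)
    if normalized[:5] != '/tmp/':
        raise IOError("can't write outside /tmp")
    return normalized
```

With this check, `/tmp/foo/../bar' is accepted as `/tmp/bar', while `/tmp/../etc/passwd' normalizes to `/etc/passwd' and is refused.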