The Datafile PD-CD 5

GNU Info File | 1996-11-14 | 50.3 KB | 1,180 lines

This is Info file pylibi, produced by Makeinfo-1.55 from the input file lib.texi.

This file describes the built-in types, exceptions and functions and the standard modules that come with the Python system. It assumes basic knowledge about the Python language. For an informal introduction to the language, see the Python Tutorial. The Python Reference Manual gives a more formal definition of the language. (These manuals are not yet available in INFO or Texinfo format.)

Copyright 1991-1995 by Stichting Mathematisch Centrum, Amsterdam, The Netherlands.

All Rights Reserved

Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the names of Stichting Mathematisch Centrum or CWI or Corporation for National Research Initiatives or CNRI not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.

While CWI is the initial source for this software, a modified version is made available by the Corporation for National Research Initiatives (CNRI) at the Internet address ftp://ftp.python.org.

STICHTING MATHEMATISCH CENTRUM AND CNRI DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM OR CNRI BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

File: pylibi,  Node: formatter,  Next: rfc822,  Prev: htmllib,  Up: Internet and WWW

Standard Module `formatter'
===========================

This module supports two interface definitions, each with multiple implementations. The *formatter* interface is used by the `HTMLParser' class of the `htmllib' module, and the *writer* interface is required by the formatter interface.

Formatter objects transform an abstract flow of formatting events into specific output events on writer objects. Formatters manage several stack structures to allow various properties of a writer object to be changed and restored; writers need not be able to handle relative changes nor any sort of "change back" operation. Specific writer properties which may be controlled via formatter objects are horizontal alignment, font, and left margin indentations. A mechanism is provided which supports providing arbitrary, non-exclusive style settings to a writer as well. Additional interfaces facilitate formatting events which are not reversible, such as paragraph separation.

Writer objects encapsulate device interfaces. Abstract devices, such as file formats, are supported as well as physical devices. The provided implementations all work with abstract devices. The interface makes available mechanisms for setting the properties which formatter objects manage and inserting data into the output.

* Menu:

* The Formatter Interface::
* Formatter Implementations::
* The Writer Interface::
* Writer Implementations::

File: pylibi,  Node: The Formatter Interface,  Next: Formatter Implementations,  Prev: formatter,  Up: formatter

The Formatter Interface
-----------------------

Interfaces to create formatters are dependent on the specific formatter class being instantiated. The interfaces described below are the required interfaces which all formatters must support once initialized.

One data element is defined at the module level:

 - data of module formatter: AS_IS
     Value which can be used in the font specification passed to the `push_font()' method described below, or as the new value to any other `push_PROPERTY()' method. Pushing the `AS_IS' value allows the corresponding `pop_PROPERTY()' method to be called without having to track whether the property was changed.

The following attributes are defined for formatter instance objects:

 - data of formatter object: writer
     The writer instance with which the formatter interacts.

 - Method on formatter object: end_paragraph (BLANKLINES)
     Close any open paragraphs and insert at least `blanklines' blank lines before the next paragraph.

 - Method on formatter object: add_line_break ()
     Add a hard line break if one does not already exist. This does not break the logical paragraph.

 - Method on formatter object: add_hor_rule (*ARGS, **KW)
     Insert a horizontal rule in the output. A hard break is inserted if there is data in the current paragraph, but the logical paragraph is not broken. The arguments and keywords are passed on to the writer's `send_hor_rule()' method.

 - Method on formatter object: add_flowing_data (DATA)
     Provide data which should be formatted with collapsed whitespace. Whitespace from preceding and successive calls to `add_flowing_data()' is considered as well when the whitespace collapse is performed. The data which is passed to this method is expected to be word-wrapped by the output device. Note that any word-wrapping still must be performed by the writer object due to the need to rely on device and font information.

 - Method on formatter object: add_literal_data (DATA)
     Provide data which should be passed to the writer unchanged. Whitespace, including newline and tab characters, is considered legal in the value of `data'.

 - Method on formatter object: add_label_data (FORMAT, COUNTER)
     Insert a label which should be placed to the left of the current left margin.
     This should be used for constructing bulleted or numbered lists. If the `format' value is a string, it is interpreted as a format specification for `counter', which should be an integer. The result of this formatting becomes the value of the label; if `format' is not a string it is used as the label value directly. The label value is passed as the only argument to the writer's `send_label_data()' method. Interpretation of non-string label values is dependent on the associated writer.

     Format specifications are strings which, in combination with a counter value, are used to compute label values. Each character in the format string is copied to the label value, with some characters recognized to indicate a transform on the counter value. Specifically, the character "`1'" represents the counter value formatted as an arabic number, the characters "`A'" and "`a'" represent alphabetic representations of the counter value in upper and lower case, respectively, and "`I'" and "`i'" represent the counter value in Roman numerals, in upper and lower case. Note that the alphabetic and Roman transforms require that the counter value be greater than zero.

 - Method on formatter object: flush_softspace ()
     Send any pending whitespace buffered from a previous call to `add_flowing_data()' to the associated writer object. This should be called before any direct manipulation of the writer object.

 - Method on formatter object: push_alignment (ALIGN)
     Push a new alignment setting onto the alignment stack. This may be `AS_IS' if no change is desired. If the alignment value is changed from the previous setting, the writer's `new_alignment()' method is called with the `align' value.

 - Method on formatter object: pop_alignment ()
     Restore the previous alignment.

 - Method on formatter object: push_font ((SIZE, ITALIC, BOLD, TELETYPE))
     Change some or all font properties of the writer object.
     Properties which are not set to `AS_IS' are set to the values passed in while others are maintained at their current settings. The writer's `new_font()' method is called with the fully resolved font specification.

 - Method on formatter object: pop_font ()
     Restore the previous font.

 - Method on formatter object: push_margin (MARGIN)
     Increase the number of left margin indentations by one, associating the logical tag `margin' with the new indentation. The initial margin level is `0'. Changed values of the logical tag must be true values; false values other than `AS_IS' are not sufficient to change the margin.

 - Method on formatter object: pop_margin ()
     Restore the previous margin.

 - Method on formatter object: push_style (*STYLES)
     Push any number of arbitrary style specifications. All styles are pushed onto the styles stack in order. A tuple representing the entire stack, including `AS_IS' values, is passed to the writer's `new_styles()' method.

 - Method on formatter object: pop_style ([N` = 1'])
     Pop the last `n' style specifications passed to `push_style()'. A tuple representing the revised stack, including `AS_IS' values, is passed to the writer's `new_styles()' method.

 - Method on formatter object: set_spacing (SPACING)
     Set the spacing style for the writer.

 - Method on formatter object: assert_line_data ([FLAG` = 1'])
     Inform the formatter that data has been added to the current paragraph out-of-band. This should be used when the writer has been manipulated directly. The optional `flag' argument can be set to false if the writer manipulations produced a hard line break at the end of the output.

File: pylibi,  Node: Formatter Implementations,  Next: The Writer Interface,  Prev: The Formatter Interface,  Up: formatter

Formatter Implementations
-------------------------

Two implementations of formatter objects are provided by this module. Most applications may use one of these classes without modification or subclassing.

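The `AS_IS' handling described for `push_font()' and `pop_font()' in the formatter interface can be sketched in a few lines. This is only an illustration written for a current Python 3 interpreter, not the module's source: the names `FontStack', `RecordingWriter' and `DEFAULT_FONT' are hypothetical, and `AS_IS' is modelled here as a private sentinel object.

```python
# Hypothetical sketch of a font stack with AS_IS resolution.
# None of these names come from the formatter module itself.
AS_IS = object()                   # assumption: any unique sentinel will do
DEFAULT_FONT = (None, 0, 0, 0)     # assumption: size, italic, bold, teletype


class RecordingWriter:
    """Stand-in writer that records every font it is asked to use."""

    def __init__(self):
        self.fonts = []

    def new_font(self, font):
        self.fonts.append(font)


class FontStack:
    """Keeps fully resolved fonts so the writer never sees AS_IS."""

    def __init__(self, writer):
        self.writer = writer
        self.stack = []

    def push_font(self, font):
        current = self.stack[-1] if self.stack else DEFAULT_FONT
        # Each AS_IS slot inherits the value currently in effect.
        resolved = tuple(cur if new is AS_IS else new
                         for new, cur in zip(font, current))
        self.stack.append(resolved)
        self.writer.new_font(resolved)

    def pop_font(self):
        if self.stack:
            self.stack.pop()
        # None tells the writer to fall back to the device default font.
        self.writer.new_font(self.stack[-1] if self.stack else None)


writer = RecordingWriter()
fonts = FontStack(writer)
fonts.push_font(("h1", AS_IS, 1, AS_IS))   # change size and bold only
fonts.push_font((AS_IS, 1, AS_IS, AS_IS))  # add italic, keep the rest
fonts.pop_font()
fonts.pop_font()
```

Because every pushed entry is stored fully resolved, restoring a font is a plain pop; the writer never has to undo a relative change.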
 - function of module formatter: NullFormatter ([WRITER` = None'])
     A formatter which does nothing. If `writer' is omitted, a `NullWriter' instance is created. No methods of the writer are called by `NullFormatter' instances. Implementations should inherit from this class if they implement the formatter interface but do not need to inherit any implementation.

 - function of module formatter: AbstractFormatter (WRITER)
     The standard formatter. This implementation has demonstrated wide applicability to many writers, and may be used directly in most circumstances. It has been used to implement a full-featured world-wide web browser.

File: pylibi,  Node: The Writer Interface,  Next: Writer Implementations,  Prev: Formatter Implementations,  Up: formatter

The Writer Interface
--------------------

Interfaces to create writers are dependent on the specific writer class being instantiated. The interfaces described below are the required interfaces which all writers must support once initialized. Note that while most applications can use the `AbstractFormatter' class as a formatter, the writer must typically be provided by the application.

 - Method on writer object: new_alignment (ALIGN)
     Set the alignment style. The `align' value can be any object, but by convention is a string or `None', where `None' indicates that the writer's "preferred" alignment should be used. Conventional `align' values are `'left'', `'center'', `'right'', and `'justify''.

 - Method on writer object: new_font (FONT)
     Set the font style. The value of `font' will be `None', indicating that the device's default font should be used, or a tuple of the form (SIZE, ITALIC, BOLD, TELETYPE). SIZE will be a string indicating the size of font that should be used; specific strings and their interpretation must be defined by the application. The ITALIC, BOLD, and TELETYPE values are boolean indicators specifying which of those font attributes should be used.

 - Method on writer object: new_margin (MARGIN, LEVEL)
     Set the margin level to the integer `level' and the logical tag to `margin'. Interpretation of the logical tag is at the writer's discretion; the only restriction on the value of the logical tag is that it not be a false value for non-zero values of `level'.

 - Method on writer object: new_spacing (SPACING)
     Set the spacing style to `spacing'.

 - Method on writer object: new_styles (STYLES)
     Set additional styles. The `styles' value is a tuple of arbitrary values; the value `AS_IS' should be ignored. The `styles' tuple may be interpreted either as a set or as a stack depending on the requirements of the application and writer implementation.

 - Method on writer object: send_line_break ()
     Break the current line.

 - Method on writer object: send_paragraph (BLANKLINE)
     Produce a paragraph separation of at least `blankline' blank lines, or the equivalent. The `blankline' value will be an integer.

 - Method on writer object: send_hor_rule (*ARGS, **KW)
     Display a horizontal rule on the output device. The arguments to this method are entirely application- and writer-specific, and should be interpreted with care. The method implementation may assume that a line break has already been issued via `send_line_break()'.

 - Method on writer object: send_flowing_data (DATA)
     Output character data which may be word-wrapped and re-flowed as needed. Within any sequence of calls to this method, the writer may assume that spans of multiple whitespace characters have been collapsed to single space characters.

 - Method on writer object: send_literal_data (DATA)
     Output character data which has already been formatted for display. Generally, this should be interpreted to mean that line breaks indicated by newline characters should be preserved and no new line breaks should be introduced. The data may contain embedded newline and tab characters, unlike data provided to the `send_flowing_data()' interface.

 - Method on writer object: send_label_data (DATA)
     Set `data' to the left of the current left margin, if possible. The value of `data' is not restricted; treatment of non-string values is entirely application- and writer-dependent. This method will only be called at the beginning of a line.

File: pylibi,  Node: Writer Implementations,  Prev: The Writer Interface,  Up: formatter

Writer Implementations
----------------------

Three implementations of the writer object interface are provided as examples by this module. Most applications will need to derive new writer classes from the `NullWriter' class.

 - function of module formatter: NullWriter ()
     A writer which only provides the interface definition; no actions are taken on any methods. This should be the base class for all writers which do not need to inherit any implementation methods.

 - function of module formatter: AbstractWriter ()
     A writer which can be used in debugging formatters, but not much else. Each method simply announces itself by printing its name and arguments on standard output.

 - function of module formatter: DumbWriter ([FILE` = None'[, MAXCOL` = 72']])
     Simple writer class which writes output on the file object passed in as `file' or, if `file' is omitted, on standard output. The output is simply word-wrapped to the number of columns specified by `maxcol'. This class is suitable for reflowing a sequence of paragraphs.

File: pylibi,  Node: rfc822,  Next: mimetools,  Prev: formatter,  Up: Internet and WWW

Standard Module `rfc822'
========================

This module defines a class, `Message', which represents a collection of "email headers" as defined by the Internet standard RFC 822. It is used in various contexts, usually to read such headers from a file.

A `Message' instance is instantiated with an open file object as parameter.
Instantiation reads headers from the file up to a blank line and stores them in the instance; after instantiation, the file is positioned directly after the blank line that terminates the headers.

Input lines as read from the file may either be terminated by CR-LF or by a single linefeed; a terminating CR-LF is replaced by a single linefeed before the line is stored.

All header matching is done independent of upper or lower case; e.g. `m['From']', `m['from']' and `m['FROM']' all yield the same result.

* Menu:

* Message Objects::

File: pylibi,  Node: Message Objects,  Prev: rfc822,  Up: rfc822

Message Objects
---------------

A `Message' instance has the following methods:

 - function of module rfc822: rewindbody ()
     Seek to the start of the message body. This only works if the file object is seekable.

 - function of module rfc822: getallmatchingheaders (NAME)
     Return a list of lines consisting of all headers matching NAME, if any. Each physical line, whether it is a continuation line or not, is a separate list item. Return the empty list if no header matches NAME.

 - function of module rfc822: getfirstmatchingheader (NAME)
     Return a list of lines comprising the first header matching NAME, and its continuation line(s), if any. Return `None' if there is no header matching NAME.

 - function of module rfc822: getrawheader (NAME)
     Return a single string consisting of the text after the colon in the first header matching NAME. This includes leading whitespace, the trailing linefeed, and internal linefeeds and whitespace if any continuation line(s) were present. Return `None' if there is no header matching NAME.

 - function of module rfc822: getheader (NAME)
     Like `getrawheader(NAME)', but strip leading and trailing whitespace (but not internal whitespace).

 - function of module rfc822: getaddr (NAME)
     Return a pair (full name, email address) parsed from the string returned by `getheader(NAME)'.
     If no header matching NAME exists, return `None, None'; otherwise both the full name and the address are (possibly empty) strings.

     Example: If `m''s first `From' header contains the string `'jack@cwi.nl (Jack Jansen)'', then `m.getaddr('From')' will yield the pair `('Jack Jansen', 'jack@cwi.nl')'. If the header contained `'Jack Jansen <jack@cwi.nl>'' instead, it would yield the exact same result.

 - function of module rfc822: getaddrlist (NAME)
     This is similar to `getaddr(NAME)', but parses a header containing a list of email addresses (e.g. a `To' header) and returns a list of (full name, email address) pairs (even if there was only one address in the header). If there is no header matching NAME, return an empty list.

     XXX The current version of this function is not really correct. It yields bogus results if a full name contains a comma.

 - function of module rfc822: getdate (NAME)
     Retrieve a header using `getheader' and parse it into a 9-tuple compatible with `time.mktime()'. If there is no header matching NAME, or it is unparsable, return `None'.

     Date parsing appears to be a black art, and not all mailers adhere to the standard. While it has been tested and found correct on a large collection of email from many sources, it is still possible that this function may occasionally yield an incorrect result.

`Message' instances also support a read-only mapping interface. In particular: `m[name]' is the same as `m.getheader(name)'; and `len(m)', `m.has_key(name)', `m.keys()', `m.values()' and `m.items()' act as expected (and consistently).

Finally, `Message' instances have two public instance variables:

 - data of module rfc822: headers
     A list containing the entire set of header lines, in the order in which they were read. Each line contains a trailing newline. The blank line terminating the headers is not contained in the list.

 - data of module rfc822: fp
     The file object passed at instantiation time.

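The header-reading behaviour described above (stop at the blank line, normalise CR-LF, match names without regard to case, keep continuation lines) can be illustrated with a small stand-alone sketch. It is written for a current Python 3 interpreter and does not use the `rfc822' module; the helper names `read_headers', `get_raw_header' and `get_header' are hypothetical. The address pair is obtained with `email.utils.parseaddr', a standard-library function in current Python releases that is assumed here to behave like `getaddr' for the manual's example.

```python
import io
from email.utils import parseaddr


def read_headers(fp):
    """Read header lines up to the terminating blank line, which is consumed."""
    lines = []
    while True:
        line = fp.readline()
        if line.endswith("\r\n"):          # store CR-LF as a single linefeed
            line = line[:-2] + "\n"
        if line in ("", "\n"):             # end of file or blank line
            break
        lines.append(line)
    return lines


def get_raw_header(lines, name):
    """Text after the colon of the first matching header, with continuations."""
    prefix = name.lower() + ":"
    collected = []
    found = False
    for line in lines:
        if found:
            if line[:1] in (" ", "\t"):    # continuation line
                collected.append(line)
                continue
            break
        if line.lower().startswith(prefix):
            collected.append(line[len(prefix):])
            found = True
    return "".join(collected) if found else None


def get_header(lines, name):
    """Like get_raw_header, but with leading/trailing whitespace stripped."""
    raw = get_raw_header(lines, name)
    return None if raw is None else raw.strip()


text = ("From: jack@cwi.nl (Jack Jansen)\r\n"
        "Subject: a long\n"
        "\tsubject line\n"
        "\n"
        "body text\n")
fp = io.StringIO(text)
headers = read_headers(fp)
body = fp.read()                           # file is positioned after the blank line
sender = parseaddr(get_header(headers, "FROM"))
```

After this runs, `headers' holds three physical lines, `body' is the text following the blank line, and `sender' is the (full name, email address) pair.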
File: pylibi,  Node: mimetools,  Next: binhex,  Prev: rfc822,  Up: Internet and WWW

Standard Module `mimetools'
===========================

This module defines a subclass of the class `rfc822.Message' and a number of utility functions that are useful for the manipulation of MIME-style multipart or encoded messages.

It defines the following items:

 - function of module mimetools: Message (FP)
     Return a new instance of the `mimetools.Message' class. This is a subclass of the `rfc822.Message' class, with some additional methods (see below).

 - function of module mimetools: choose_boundary ()
     Return a unique string that has a high likelihood of being usable as a part boundary. The string has the form `"HOSTIPADDR.UID.PID.TIMESTAMP.RANDOM"'.

 - function of module mimetools: decode (INPUT, OUTPUT, ENCODING)
     Read data encoded using the allowed MIME ENCODING from open file object INPUT and write the decoded data to open file object OUTPUT. Valid values for ENCODING include `"base64"', `"quoted-printable"' and `"uuencode"'.

 - function of module mimetools: encode (INPUT, OUTPUT, ENCODING)
     Read data from open file object INPUT and write it encoded using the allowed MIME ENCODING to open file object OUTPUT. Valid values for ENCODING are the same as for `decode()'.

 - function of module mimetools: copyliteral (INPUT, OUTPUT)
     Read lines until EOF from open file INPUT and write them to open file OUTPUT.

 - function of module mimetools: copybinary (INPUT, OUTPUT)
     Read blocks until EOF from open file INPUT and write them to open file OUTPUT. The block size is currently fixed at 8192.

* Menu:

* mimetools.Message Methods::

File: pylibi,  Node: mimetools.Message Methods,  Prev: mimetools,  Up: mimetools

Additional Methods of Message objects
-------------------------------------

The `mimetools.Message' class defines the following methods in addition to the `rfc822.Message' class:

 - Method on mimetools.Message: getplist ()
     Return the parameter list of the `Content-type' header.
     This is a list of strings. For parameters of the form `KEY=VALUE', KEY is converted to lower case but VALUE is not. For example, if the message contains the header `Content-type: text/html; spam=1; Spam=2; Spam' then `getplist()' will return the Python list `['spam=1', 'spam=2', 'Spam']'.

 - Method on mimetools.Message: getparam (NAME)
     Return the VALUE of the first parameter (as returned by `getplist()') of the form `NAME=VALUE' for the given NAME. If VALUE is surrounded by quotes of the form <...> or "...", these are removed.

 - Method on mimetools.Message: getencoding ()
     Return the encoding specified in the `Content-transfer-encoding' message header. If no such header exists, return `"7bit"'. The encoding is converted to lower case.

 - Method on mimetools.Message: gettype ()
     Return the message type (of the form `TYPE/SUBTYPE') as specified in the `Content-type' header. If no such header exists, return `"text/plain"'. The type is converted to lower case.

 - Method on mimetools.Message: getmaintype ()
     Return the main type as specified in the `Content-type' header. If no such header exists, return `"text"'. The main type is converted to lower case.

 - Method on mimetools.Message: getsubtype ()
     Return the subtype as specified in the `Content-type' header. If no such header exists, return `"plain"'. The subtype is converted to lower case.

File: pylibi,  Node: binhex,  Next: uu,  Prev: mimetools,  Up: Internet and WWW

Standard module `binhex'
========================

This module encodes and decodes files in binhex4 format, a format allowing representation of Macintosh files in ASCII. On the Macintosh, both forks of a file and the finder information are encoded (or decoded); on other platforms only the data fork is handled.

The `binhex' module defines the following functions:

 - function of module binhex: binhex (INPUT, OUTPUT)
     Convert a binary file with filename INPUT to binhex file OUTPUT.
     The OUTPUT parameter can either be a filename or a file-like object (any object supporting a WRITE and CLOSE method).

 - function of module binhex: hexbin (INPUT[, OUTPUT])
     Decode a binhex file INPUT. INPUT may be a filename or a file-like object supporting READ and CLOSE methods. The resulting file is written to a file named OUTPUT, unless the argument is empty, in which case the output filename is read from the binhex file.

* Menu:

* notes::

File: pylibi,  Node: notes,  Prev: binhex,  Up: binhex

notes
-----

There is an alternative, more powerful interface to the coder and decoder; see the source for details.

If you code or decode textfiles on non-Macintosh platforms they will still use the Macintosh newline convention (carriage-return as end of line).

As of this writing, `hexbin()' appears to not work in all cases.

File: pylibi,  Node: uu,  Next: binascii,  Prev: binhex,  Up: Internet and WWW

Standard module `uu'
====================

This module encodes and decodes files in uuencode format, allowing arbitrary binary data to be transferred over ASCII-only connections. Wherever a file argument is expected, the methods accept either a pathname (`'-'' for stdin/stdout) or a file-like object. Normally you would pass filenames, but there is one case where you have to open the file yourself: if you are on a non-Unix platform and your binary file is actually a textfile that you want encoded Unix-compatible, you will have to open the file yourself as a textfile, so newline conversion is performed.

This code was contributed by Lance Ellinghouse, and modified by Jack Jansen.

The `uu' module defines the following functions:

 - function of module uu: encode (IN_FILE, OUT_FILE[, NAME, MODE])
     Uuencode file IN_FILE into file OUT_FILE. The uuencoded file will have the header specifying NAME and MODE as the defaults for the results of decoding the file. The default defaults are taken from IN_FILE, or `'-'' and `0666' respectively.

 - function of module uu: decode (IN_FILE[, OUT_FILE, MODE])
     This call decodes uuencoded file IN_FILE placing the result on file OUT_FILE. If OUT_FILE is a pathname the MODE is also set. Defaults for OUT_FILE and MODE are taken from the uuencode header.

File: pylibi,  Node: binascii,  Next: xdrlib,  Prev: uu,  Up: Internet and WWW

Built-in Module `binascii'
==========================

The binascii module contains a number of methods to convert between binary and various ASCII-encoded binary representations. Normally, you will not use these functions directly but use wrapper modules like `uu' or `binhex' instead; this module exists solely because bit-manipulation of large amounts of data is slow in Python.

The `binascii' module defines the following functions:

 - function of module binascii: a2b_uu (STRING)
     Convert a single line of uuencoded data back to binary and return the binary data. Lines normally contain 45 (binary) bytes, except for the last line. Line data may be followed by whitespace.

 - function of module binascii: b2a_uu (DATA)
     Convert binary data to a line of ASCII characters; the return value is the converted line, including a newline char. The length of DATA should be at most 45.

 - function of module binascii: a2b_base64 (STRING)
     Convert a block of base64 data back to binary and return the binary data. More than one line may be passed at a time.

 - function of module binascii: b2a_base64 (DATA)
     Convert binary data to a line of ASCII characters in base64 coding. The return value is the converted line, including a newline char. The length of DATA should be at most 57 to adhere to the base64 standard.

 - function of module binascii: a2b_hqx (STRING)
     Convert binhex4 formatted ASCII data to binary, without doing RLE-decompression. The string should contain a complete number of binary bytes, or (in case of the last portion of the binhex4 data) have the remaining bits zero.

 - function of module binascii: rledecode_hqx (DATA)
     Perform RLE-decompression on the data, as per the binhex4 standard. The algorithm uses `0x90' after a byte as a repeat indicator, followed by a count. A count of `0' specifies a byte value of `0x90'. The routine returns the decompressed data, unless the input data ends in an orphaned repeat indicator, in which case the `Incomplete' exception is raised.

 - function of module binascii: rlecode_hqx (DATA)
     Perform binhex4 style RLE-compression on DATA and return the result.

 - function of module binascii: b2a_hqx (DATA)
     Perform binhex4 binary-to-ASCII translation and return the resulting string. The argument should already be RLE-coded, and have a length divisible by 3 (except possibly the last fragment).

 - function of module binascii: crc_hqx (DATA, CRC)
     Compute the binhex4 CRC value of DATA, starting with an initial CRC and returning the result.

 - exception of module binascii: Error
     Exception raised on errors. These are usually programming errors.

 - exception of module binascii: Incomplete
     Exception raised on incomplete data. These are usually not programming errors, but handled by reading a little more data and trying again.

File: pylibi,  Node: xdrlib,  Prev: binascii,  Up: Internet and WWW

Standard module `xdrlib'
========================

The `xdrlib' module supports the External Data Representation Standard as described in RFC 1014, written by Sun Microsystems, Inc. June 1987. It supports most of the data types described in the RFC, although some, most notably `float' and `double', are only supported on those operating systems that provide an XDR library.

The `xdrlib' module defines two classes, one for packing variables into XDR representation, and another for unpacking from XDR representation. There are also two exception classes.

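As background for the two classes, the XDR representation of RFC 1014 is built from 4-byte units: integers are stored big-endian, and variable-length strings are a length word followed by the data, padded with null bytes to a multiple of four. The following is a minimal sketch of that layout using the `struct' module of a current Python 3 interpreter. It does not use `xdrlib', and the stand-alone functions shown are hypothetical helpers whose names merely echo the terms used in this chapter.

```python
import struct


def pack_uint(value):
    """Encode an unsigned integer as four big-endian bytes."""
    return struct.pack(">I", value)


def pack_fstring(n, data):
    """Encode n bytes of data, null-padded to a multiple of four bytes."""
    data = data[:n]
    padded_length = (n + 3) // 4 * 4
    return data + b"\0" * (padded_length - len(data))


def pack_string(data):
    """Encode the length as an unsigned integer, then the padded data."""
    return pack_uint(len(data)) + pack_fstring(len(data), data)


def unpack_string(buffer, position=0):
    """Decode a variable-length string; return (data, next_position)."""
    (n,) = struct.unpack_from(">I", buffer, position)
    start = position + 4
    data = buffer[start:start + n]
    return data, start + (n + 3) // 4 * 4


encoded = pack_string(b"hello")
decoded, next_position = unpack_string(encoded)
```

Here `encoded' is twelve bytes long: four for the length, five for the data and three of padding. The length itself is never padded, which is why the unpacker has to round up to find the next item.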
* Menu:

* Packer Objects::
* Unpacker Objects::
* Exceptions::
* Supporting Floating Point Data::

File: pylibi,  Node: Packer Objects,  Next: Unpacker Objects,  Prev: xdrlib,  Up: xdrlib

Packer Objects
--------------

`Packer' is the class for packing data into XDR representation. The `Packer' class is instantiated with no arguments.

 - function of module xdrlib: get_buffer ()
     Returns the current pack buffer as a string.

 - function of module xdrlib: reset ()
     Resets the pack buffer to the empty string.

In general, you can pack any of the most common XDR data types by calling the appropriate `pack_TYPE' method. Each method takes a single argument, the value to pack. The following simple data type packing methods are supported: `pack_uint', `pack_int', `pack_enum', `pack_bool', `pack_uhyper', and `pack_hyper'.

The following methods pack floating point numbers; however, they require C library support. Without the optional C built-in module, both of these methods will raise an `xdrlib.ConversionError' exception. See the note at the end of this chapter for details.

 - function of module xdrlib: pack_float (VALUE)
     Packs the single-precision floating point number VALUE.

 - function of module xdrlib: pack_double (VALUE)
     Packs the double-precision floating point number VALUE.

The following methods support packing strings, bytes, and opaque data:

 - function of module xdrlib: pack_fstring (N, S)
     Packs a fixed length string, S. N is the length of the string but it is *not* packed into the data buffer. The string is padded with null bytes if necessary to guarantee 4 byte alignment.

 - function of module xdrlib: pack_fopaque (N, DATA)
     Packs a fixed length opaque data stream, similarly to `pack_fstring'.

 - function of module xdrlib: pack_string (S)
     Packs a variable length string, S. The length of the string is first packed as an unsigned integer, then the string data is packed with `pack_fstring'.

 - function of module xdrlib: pack_opaque (DATA)
     Packs a variable length opaque data string, similarly to `pack_string'.

 - function of module xdrlib: pack_bytes (BYTES)
     Packs a variable length byte stream, similarly to `pack_string'.

The following methods support packing arrays and lists:

 - function of module xdrlib: pack_list (LIST, PACK_ITEM)
     Packs a LIST of homogeneous items. This method is useful for lists with an indeterminate size; i.e. the size is not available until the entire list has been walked. For each item in the list, an unsigned integer `1' is packed first, followed by the data value from the list. PACK_ITEM is the function that is called to pack the individual item. At the end of the list, an unsigned integer `0' is packed.

 - function of module xdrlib: pack_farray (N, ARRAY, PACK_ITEM)
     Packs a fixed length list (ARRAY) of homogeneous items. N is the length of the list; it is *not* packed into the buffer, but a `ValueError' exception is raised if `len(array)' is not equal to N. As above, PACK_ITEM is the function used to pack each element.

 - function of module xdrlib: pack_array (LIST, PACK_ITEM)
     Packs a variable length LIST of homogeneous items. First, the length of the list is packed as an unsigned integer, then each element is packed as in `pack_farray' above.

File: pylibi,  Node: Unpacker Objects,  Next: Exceptions,  Prev: Packer Objects,  Up: xdrlib

Unpacker Objects
----------------

`Unpacker' is the complementary class which unpacks XDR data values from a string buffer, and has the following methods:

 - function of module xdrlib: __init__ (DATA)
     Instantiates an `Unpacker' object with the string buffer DATA.

 - function of module xdrlib: reset (DATA)
     Resets the string buffer with the given DATA.

 - function of module xdrlib: get_position ()
     Returns the current unpack position in the data buffer.

 - function of module xdrlib: set_position (POSITION)
     Sets the data buffer unpack position to POSITION.
     You should be careful about using `get_position()' and
     `set_position()'.

 - function of module xdrlib: done ()
     Indicates unpack completion.  Raises an `xdrlib.Error' exception
     if all of the data has not been unpacked.

In addition, every data type that can be packed with a `Packer' can be
unpacked with an `Unpacker'.  Unpacking methods are of the form
`unpack_TYPE', and take no arguments.  They return the unpacked object.
The same caveats apply for `unpack_float' and `unpack_double' as above.

 - function of module xdrlib: unpack_float ()
     Unpacks a single-precision floating point number.

 - function of module xdrlib: unpack_double ()
     Unpacks a double-precision floating point number, similarly to
     `unpack_float'.

In addition, the following methods unpack strings, bytes, and opaque
data:

 - function of module xdrlib: unpack_fstring (N)
     Unpacks and returns a fixed length string.  N is the number of
     characters expected.  Padding with null bytes to guarantee 4 byte
     alignment is assumed.

 - function of module xdrlib: unpack_fopaque (N)
     Unpacks and returns a fixed length opaque data stream, similarly
     to `unpack_fstring'.

 - function of module xdrlib: unpack_string ()
     Unpacks and returns a variable length string.  The length of the
     string is first unpacked as an unsigned integer, then the string
     data is unpacked with `unpack_fstring'.

 - function of module xdrlib: unpack_opaque ()
     Unpacks and returns a variable length opaque data string,
     similarly to `unpack_string'.

 - function of module xdrlib: unpack_bytes ()
     Unpacks and returns a variable length byte stream, similarly to
     `unpack_string'.

The following methods support unpacking arrays and lists:

 - function of module xdrlib: unpack_list (UNPACK_ITEM)
     Unpacks and returns a list of homogeneous items.  The list is
     unpacked one element at a time by first unpacking an unsigned
     integer flag.  If the flag is `1', then the item is unpacked and
     appended to the list.  A flag of `0' indicates the end of the
     list.
     UNPACK_ITEM is the function that is called to unpack the items.

 - function of module xdrlib: unpack_farray (N, UNPACK_ITEM)
     Unpacks and returns (as a list) a fixed length array of
     homogeneous items.  N is the number of list elements to expect in
     the buffer.  As above, UNPACK_ITEM is the function used to unpack
     each element.

 - function of module xdrlib: unpack_array (UNPACK_ITEM)
     Unpacks and returns a variable length LIST of homogeneous items.
     First, the length of the list is unpacked as an unsigned integer,
     then each element is unpacked as in `unpack_farray' above.


File: pylibi,  Node: Exceptions,  Next: Supporting Floating Point Data,  Prev: Unpacker Objects,  Up: xdrlib

Exceptions
----------

Exceptions in this module are coded as class instances:

 - exception of module xdrlib: Error
     The base exception class.  `Error' has a single public data member
     `msg' containing the description of the error.

 - exception of module xdrlib: ConversionError
     Class derived from `Error'.  Contains no additional instance
     variables.

Here is an example of how you would catch one of these exceptions:

     import xdrlib
     p = xdrlib.Packer()
     try:
         p.pack_double(8.01)
     except xdrlib.ConversionError, instance:
         print 'packing the double failed:', instance.msg


File: pylibi,  Node: Supporting Floating Point Data,  Prev: Exceptions,  Up: xdrlib

Supporting Floating Point Data
------------------------------

Packing and unpacking floating point data, i.e. `Packer.pack_float',
`Packer.pack_double', `Unpacker.unpack_float', and
`Unpacker.unpack_double', are only supported with the helper built-in
`_xdr' module, which relies on your operating system having the
appropriate XDR library routines.

If you have built the Python interpreter with the `_xdr' module, or
have built the `_xdr' module as a shared library, `xdrlib' will use
these to pack and unpack floating point numbers.  Otherwise, using
these routines will raise a `ConversionError' exception.
See the Python installation instructions for details on building the
`_xdr' module.


File: pylibi,  Node: Restricted Execution,  Next: Cryptographic Services,  Prev: Internet and WWW,  Up: Top

Restricted Execution
********************

In general, Python programs have complete access to the underlying
operating system through the various functions and classes.  For
example, a Python program can open any file for reading and writing by
using the `open()' built-in function (provided the underlying OS gives
you permission!).  This is exactly what you want for most applications.

There exists a class of applications for which this "openness" is
inappropriate.  Take Grail: a web browser that accepts "applets",
snippets of Python code, from anywhere on the Internet for execution on
the local system.  This can be used to improve the user interface of
forms, for instance.  Since the originator of the code is unknown, it
is obvious that it cannot be trusted with the full resources of the
local machine.

*Restricted execution* is the basic framework in Python that allows for
the segregation of trusted and untrusted code.  It is based on the
notion that trusted Python code (a *supervisor*) can create a "padded
cell" (or environment) with limited permissions, and run the untrusted
code within this cell.  The untrusted code cannot break out of its
cell, and can only interact with sensitive system resources through
interfaces defined and managed by the trusted code.  The term
"restricted execution" is favored over "safe-Python" since true safety
is hard to define, and is determined by the way the restricted
environment is created.  Note that the restricted environments can be
nested, with inner cells creating subcells of lesser, but never
greater, privilege.

An interesting aspect of Python's restricted execution model is that
the interfaces presented to untrusted code usually have the same names
as those presented to trusted code.
Therefore no special interfaces need to be learned to write code
designed to run in a restricted environment.  And because the exact
nature of the padded cell is determined by the supervisor, different
restrictions can be imposed, depending on the application.  For
example, it might be deemed "safe" for untrusted code to read any file
within a specified directory, but never to write a file.  In this case,
the supervisor may redefine the built-in `open()' function so that it
raises an exception whenever the MODE parameter is `'w''.  It might
also perform a `chroot()'-like operation on the FILENAME parameter,
such that root is always relative to some safe "sandbox" area of the
filesystem.  In this case, the untrusted code would still see a
built-in `open()' function in its environment, with the same calling
interface.  The semantics would be identical too, with `IOError's being
raised when the supervisor determines that an unallowable parameter is
being used.

The Python run-time determines whether a particular code block is
executing in restricted execution mode based on the identity of the
`__builtins__' object in its global variables: if this is (the
dictionary of) the standard `__builtin__' module, the code is deemed to
be unrestricted, else it is deemed to be restricted.

Python code executing in restricted mode faces a number of limitations
that are designed to prevent it from escaping from the padded cell.
For instance, the function object attribute `func_globals' and the
class and instance object attribute `__dict__' are unavailable.

Two modules provide the framework for setting up restricted execution
environments:

rexec
     -- Basic restricted execution framework.

Bastion
     -- Providing restricted access to objects.
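
The redefined `open()' described earlier in this chapter, with its
read-only policy and `chroot()'-like filename handling, might be
sketched as follows.  This is only an illustration of the idea, not how
`rexec' is implemented: the names `make_restricted_open' and
`restricted_open' are hypothetical, the sketch assumes POSIX-style
paths, it refuses every mode other than `'r'' and `'rb'' (stricter than
refusing `'w'' alone), and it does not account for symbolic links
inside the sandbox.

```python
import os
import posixpath
import tempfile


def make_restricted_open(sandbox_root, real_open=open):
    """Build an open() replacement that only reads files under sandbox_root.

    Hypothetical helper for illustration; not part of rexec.
    """
    def restricted_open(filename, mode="r", bufsize=-1):
        if mode not in ("r", "rb"):
            raise IOError("only read modes are allowed")
        # Treat filename as rooted at the sandbox, chroot-style.  Normalizing
        # an absolute path folds away any leading '..' components.
        inner = posixpath.normpath("/" + filename).lstrip("/")
        return real_open(os.path.join(sandbox_root, inner), mode, bufsize)
    return restricted_open


# Usage: the caller sees the same calling interface as the real open().
root = tempfile.mkdtemp()
with open(os.path.join(root, "note.txt"), "w") as f:
    f.write("hello")

safe_open = make_restricted_open(root)
with safe_open("/../note.txt") as f:   # '..' cannot climb out of the sandbox
    print(f.read())
try:
    safe_open("/note.txt", "w")
except IOError as exc:
    print("refused:", exc)
```

Because the replacement keeps the name and signature of `open()', code
written for the unrestricted environment needs no changes to call it.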

* Menu:

* rexec::
* Bastion::


File: pylibi,  Node: rexec,  Next: Bastion,  Prev: Restricted Execution,  Up: Restricted Execution

Standard Module `rexec'
=======================

This module contains the `RExec' class, which supports `r_exec()',
`r_eval()', `r_execfile()', and `r_import()' methods, which are
restricted versions of the standard Python functions `exec()',
`eval()', `execfile()', and the `import' statement.  Code executed in
this restricted environment will only have access to modules and
functions that are deemed safe; you can subclass `RExec' to add or
remove capabilities as desired.

*Note:* The `RExec' class can prevent code from performing unsafe
operations like reading or writing disk files, or using TCP/IP sockets.
However, it does not protect against code using extremely large amounts
of memory or CPU time.

 - function of module rexec: RExec ([HOOKS[, VERBOSE]])
     Returns an instance of the `RExec' class.

     HOOKS is an instance of the `RHooks' class or a subclass of it.
     If it is omitted or `None', the default `RHooks' class is
     instantiated.  Whenever the RExec module searches for a module
     (even a built-in one) or reads a module's code, it doesn't
     actually go out to the file system itself.  Rather, it calls
     methods of an RHooks instance that was passed to or created by its
     constructor.  (Actually, the RExec object doesn't make these
     calls--they are made by a module loader object that's part of the
     RExec object.  This allows another level of flexibility, e.g.
     using packages.)

     By providing an alternate RHooks object, we can control the file
     system accesses made to import a module, without changing the
     actual algorithm that controls the order in which those accesses
     are made.  For instance, we could substitute an RHooks object that
     passes all filesystem requests to a file server elsewhere, via
     some RPC mechanism such as ILU.  Grail's applet loader uses this
     to support importing applets from a URL for a directory.

     If VERBOSE is true, additional debugging output may be sent to
     standard output.

The RExec class has the following class attributes, which are used by
the `__init__' method.  Changing them on an existing instance won't
have any effect; instead, create a subclass of `RExec' and assign them
new values in the class definition.  Instances of the new class will
then use those new values.  All these attributes are tuples of strings.

 - attribute of RExec object: nok_builtin_names
     Contains the names of built-in functions which will *not* be
     available to programs running in the restricted environment.  The
     value for `RExec' is `('open', 'reload', '__import__')'.  (This
     gives the exceptions, because by far the majority of built-in
     functions are harmless.  A subclass that wants to override this
     variable should probably start with the value from the base class
     and concatenate additional forbidden functions -- when new
     dangerous built-in functions are added to Python, they will also
     be added to this module.)

 - attribute of RExec object: ok_builtin_modules
     Contains the names of built-in modules which can be safely
     imported.  The value for `RExec' is `('audioop', 'array',
     'binascii', 'cmath', 'errno', 'imageop', 'marshal', 'math', 'md5',
     'operator', 'parser', 'regex', 'rotor', 'select', 'strop',
     'struct', 'time')'.  A similar remark about overriding this
     variable applies -- use the value from the base class as a
     starting point.

 - attribute of RExec object: ok_path
     Contains the directories which will be searched when an `import'
     is performed in the restricted environment.  The value for `RExec'
     is the same as `sys.path' (at the time the module is loaded) for
     unrestricted code.

 - attribute of RExec object: ok_posix_names
     Contains the names of the functions in the `os' module which will
     be available to programs running in the restricted environment.
     The value for `RExec' is `('error', 'fstat', 'listdir', 'lstat',
     'readlink', 'stat', 'times', 'uname', 'getpid', 'getppid',
     'getcwd', 'getuid', 'getgid', 'geteuid', 'getegid')'.

 - attribute of RExec object: ok_sys_names
     Contains the names of the functions and variables in the `sys'
     module which will be available to programs running in the
     restricted environment.  The value for `RExec' is `('ps1', 'ps2',
     'copyright', 'version', 'platform', 'exit', 'maxint')'.

RExec instances support the following methods:

 - Method on RExec object: r_eval (CODE)
     CODE must either be a string containing a Python expression, or a
     compiled code object, which will be evaluated in the restricted
     environment's `__main__' module.  The value of the expression or
     code object will be returned.

 - Method on RExec object: r_exec (CODE)
     CODE must either be a string containing one or more lines of
     Python code, or a compiled code object, which will be executed in
     the restricted environment's `__main__' module.

 - Method on RExec object: r_execfile (FILENAME)
     Execute the Python code contained in the file FILENAME in the
     restricted environment's `__main__' module.

Methods whose names begin with `s_' are similar to the functions
beginning with `r_', but the code will be granted access to restricted
versions of the standard I/O streams `sys.stdin', `sys.stderr', and
`sys.stdout'.

 - Method on RExec object: s_eval (CODE)
     CODE must be a string containing a Python expression, which will
     be evaluated in the restricted environment.

 - Method on RExec object: s_exec (CODE)
     CODE must be a string containing one or more lines of Python code,
     which will be executed in the restricted environment.

 - Method on RExec object: s_execfile (FILENAME)
     Execute the Python code contained in the file FILENAME in the
     restricted environment.

`RExec' objects must also support various methods which will be
implicitly called by code executing in the restricted environment.
Overriding these methods in a subclass is used to change the policies
enforced by a restricted environment.

 - Method on RExec object: r_import (MODULENAME[, GLOBALS, LOCALS, FROMLIST])
     Import the module MODULENAME, raising an `ImportError' exception
     if the module is considered unsafe.

 - Method on RExec object: r_open (FILENAME[, MODE[, BUFSIZE]])
     Method called when `open()' is called in the restricted
     environment.  The arguments are identical to those of `open()',
     and a file object (or a class instance compatible with file
     objects) should be returned.  `RExec''s default behaviour is to
     allow opening any file for reading, but to forbid any attempt to
     write a file.  See the example below for an implementation of a
     less restrictive `r_open()'.

 - Method on RExec object: r_reload (MODULE)
     Reload the module object MODULE, re-parsing and re-initializing it.

 - Method on RExec object: r_unload (MODULE)
     Unload the module object MODULE (i.e., remove it from the
     restricted environment's `sys.modules' dictionary).

And their equivalents with access to restricted standard I/O streams:

 - Method on RExec object: s_import (MODULENAME[, GLOBALS, LOCALS, FROMLIST])
     Import the module MODULENAME, raising an `ImportError' exception
     if the module is considered unsafe.

 - Method on RExec object: s_reload (MODULE)
     Reload the module object MODULE, re-parsing and re-initializing it.

 - Method on RExec object: s_unload (MODULE)
     Unload the module object MODULE.

* Menu:

* An example::


File: pylibi,  Node: An example,  Prev: rexec,  Up: rexec

An example
----------

Let us say that we want a slightly more relaxed policy than the
standard RExec class.
For example, if we're willing to allow files in `/tmp' to be written,
we can subclass the `RExec' class:

     import rexec, string

     class TmpWriterRExec(rexec.RExec):
         def r_open(self, file, mode='r', buf=-1):
             if mode in ('r', 'rb'):
                 pass
             elif mode in ('w', 'wb', 'a', 'ab'):
                 # check filename: must begin with /tmp/
                 if file[:5] != '/tmp/':
                     raise IOError, "can't write outside /tmp"
                 elif (string.find(file, '/../') >= 0 or
                       file[:3] == '../' or file[-3:] == '/..'):
                     raise IOError, "'..' in filename forbidden"
             else:
                 raise IOError, "Illegal open() mode"
             return open(file, mode, buf)

Notice that the above code will occasionally forbid a perfectly valid
filename; for example, code in the restricted environment won't be able
to open a file called `/tmp/foo/../bar'.  To fix this, the `r_open'
method would have to simplify the filename to `/tmp/bar', which would
require splitting apart the filename and performing various operations
on it.  In cases where security is at stake, it may be preferable to
write simple code which is sometimes overly restrictive, instead of
more general code that is also more complex and may harbor a subtle
security hole.
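
For readers curious what the filename simplification mentioned above
might look like, here is a minimal sketch using `posixpath.normpath'
from the standard library.  The helper name `is_tmp_path' is
hypothetical.  The simplification is purely textual: it does not follow
symbolic links, so a path such as `/tmp/link/../x' may name a different
file than the simplified form suggests.  That is exactly the kind of
subtlety the preceding paragraph warns about, so treat this as an
illustration rather than a recommended policy.

```python
import posixpath


def is_tmp_path(filename):
    """Return True if filename, once textually simplified, lies inside /tmp.

    Hypothetical helper for illustration; symbolic links are not resolved.
    """
    simplified = posixpath.normpath(filename)
    return simplified[:5] == "/tmp/"


print(posixpath.normpath("/tmp/foo/../bar"))  # /tmp/bar
print(is_tmp_path("/tmp/foo/../bar"))         # True
print(is_tmp_path("/tmp/../etc/passwd"))      # False
```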