- Common Lisp the Language, 2nd Edition
- -------------------------------------------------------------------------------
-
- 2. Data Types
-
- Common Lisp provides a variety of types of data objects. It is important to
- note that in Lisp it is data objects that are typed, not variables. Any
- variable can have any Lisp object as its value. (It is possible to make an
- explicit declaration that a variable will in fact take on one of only a limited
- set of values. However, such a declaration may always be omitted, and the
- program will still run correctly. Such a declaration merely constitutes advice
- from the user that may be useful in gaining efficiency. See declare.)
-
- In Common Lisp, a data type is a (possibly infinite) set of Lisp objects. Many
- Lisp objects belong to more than one such set, and so it doesn't always make
- sense to ask what is the type of an object; instead, one usually asks only
- whether an object belongs to a given type. The predicate typep may be used to
- ask whether an object belongs to a given type, and the function type-of returns
- a type to which a given object belongs.
-
- The data types defined in Common Lisp are arranged into a hierarchy (actually a
- partial order) defined by the subset relationship. Certain sets of objects,
- such as the set of numbers or the set of strings, are interesting enough to
- deserve labels. Symbols are used for most such labels (here, and throughout
- this book, the word ``symbol'' refers to atomic symbols, one kind of Lisp
- object, elsewhere known as literal atoms). See chapter 4 for a complete
- description of type specifiers.
-
- The set of all objects is specified by the symbol t. The empty data type, which
- contains no objects, is denoted by nil.
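-
- As a brief illustration of types as sets, the following sketch filters a list
- of type specifiers down to those that contain a given object. The helper name
- types-containing is hypothetical, introduced only for this example.
-

```lisp
;; types-containing is a hypothetical helper, not part of Common Lisp.
(defun types-containing (object type-specifiers)
  "Return those TYPE-SPECIFIERS that name a type OBJECT belongs to."
  (remove-if-not (lambda (type) (typep object type)) type-specifiers))

;; 42 belongs to several types at once; every object belongs to T,
;; and no object belongs to NIL.
(types-containing 42 '(integer number symbol t nil))   ; => (INTEGER NUMBER T)

;; type-of returns some type containing the object. The exact answer is
;; implementation-dependent, so it is only checked with typep here.
(typep 42 (type-of 42))                                 ; => true
```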
-
- [old_change_begin]
- A type called common encompasses all the data objects required by the Common
- Lisp language. A Common Lisp implementation is free to provide other data types
- that are not subtypes of common.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (COMMON-TYPE) to remove the type common (and the
- predicate commonp) from the language, on the grounds that it has not proved to
- be useful in practice and that it could be difficult to redefine in the face of
- other changes to the Common Lisp type system (such as the introduction of CLOS
- classes).
- [change_end]
-
- The following categories of Common Lisp objects are of particular interest:
- numbers, characters, symbols, lists, arrays, structures, and functions. There
- are others as well. Some of these categories have many subdivisions. There are
- also standard types defined to be the union of two or more of these categories.
- The categories listed above, while they are data types, are neither more nor
- less ``real'' than other data types; they simply constitute a particularly
- useful slice across the type hierarchy for expository purposes.
-
- Here are brief descriptions of various Common Lisp data types. The remaining
- sections of this chapter go into more detail and also describe notations for
- objects of each type. Descriptions of Lisp functions that operate on data
- objects of each type appear in later chapters.
-
- * Numbers are provided in various forms and representations. Common Lisp
- provides a true integer data type: any integer, positive or negative, has
- in principle a representation as a Common Lisp data object, subject only
- to total memory limitations (rather than machine word width). A true
- rational data type is provided: the quotient of two integers, if not an
- integer, is a ratio. Floating-point numbers of various ranges and
- precisions are also provided, as well as Cartesian complex numbers.
-
- * Characters represent printed glyphs such as letters or text formatting
- operations. Strings are one-dimensional arrays of characters. Common Lisp
- provides for a rich character set, including ways to represent characters
- of various type styles.
-
- * Symbols (sometimes called atomic symbols for emphasis or clarity) are
- named data objects. Lisp provides machinery for locating a symbol object,
- given its name (in the form of a string). Symbols have property lists,
- which in effect allow symbols to be treated as record structures with an
- extensible set of named components, each of which may be any Lisp object.
- Symbols also serve to name functions and variables within programs.
-
- * Lists are sequences represented in the form of linked cells called
- conses. There is a special object (the symbol nil) that is the empty list.
- All other lists are built recursively by adding a new element to the front
- of an existing list. This is done by creating a new cons, which is an
- object having two components called the car and the cdr. The car may hold
- anything, and the cdr is made to point to the previously existing list.
- (Conses may actually be used completely generally as two-element record
- structures, but their most important use is to represent lists.)
-
- * Arrays are dimensioned collections of objects. An array can have any
- non-negative number of dimensions and is indexed by a sequence of
- integers. A general array can have any Lisp object as a component; other
- types of arrays are specialized for efficiency and can hold only certain
- types of Lisp objects. It is possible for two arrays, possibly with
- differing dimension information, to share the same set of elements (such
- that modifying one array modifies the other also) by causing one to be
- displaced to the other. One-dimensional arrays of any kind are called
- vectors. One-dimensional arrays of characters are called strings.
- One-dimensional arrays of bits (that is, of integers whose values are 0 or
- 1) are called bit-vectors.
-
- * Hash tables provide an efficient way of mapping any Lisp object (a key)
- to an associated object.
-
- * Readtables are used to control the built-in expression parser read.
-
- * Packages are collections of symbols that serve as name spaces. The parser
- recognizes symbols by looking up character sequences in the current
- package.
-
- * Pathnames represent names of files in a fairly implementation-independent
- manner. They are used to interface to the external file system.
-
- * Streams represent sources or sinks of data, typically characters or
- bytes. They are used to perform I/O, as well as for internal purposes such
- as parsing strings.
-
- * Random-states are data structures used to encapsulate the state of the
- built-in random-number generator.
-
- * Structures are user-defined record structures, objects that have named
- components. The defstruct facility is used to define new structure types.
- Some Common Lisp implementations may choose to implement certain
- system-supplied data types, such as bignums, readtables, streams, hash
- tables, and pathnames, as structures, but this fact will be invisible to
- the user.
-
- [old_change_begin]
-
- * Functions are objects that can be invoked as procedures; these may take
- arguments and return values. (All Lisp procedures can be construed to
- return values and therefore every procedure is a function.) Such objects
- include compiled-functions (compiled code objects). Some functions are
- represented as a list whose car is a particular symbol such as lambda.
- Symbols may also be used as functions.
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in June 1988 (FUNCTION-TYPE) to specify that symbols are not of
- type function, but are automatically coerced to functions in certain situations
- (see section 2.13).
-
- X3J13 voted in June 1988 (CONDITION-SYSTEM) to adopt the Common Lisp
- Condition System, thereby introducing a new category of data objects:
-
- * Conditions are objects used to affect control flow in certain
- conventional ways by means of signals and handlers that intercept those
- signals. In particular, errors are signaled by raising particular
- conditions, and errors may be trapped by establishing handlers for those
- conditions.
-
- X3J13 voted in June 1988 (CLOS) to adopt the Common Lisp Object System,
- thereby introducing additional categories of data objects:
-
- * Classes determine the structure and behavior of other objects, their
- instances. Every Common Lisp data object belongs to some class. (In some
- ways the CLOS class system is a generalization of the system of type
- specifiers of the first edition of this book, but the class system
- augments the type system rather than supplanting it.)
-
- * Methods are chunks of code that operate on arguments satisfying a
- particular pattern of classes. Methods are not functions; they are not
- invoked directly on arguments but instead are bundled into generic
- functions.
-
- * Generic functions are functions that contain, among other information, a
- set of methods. When invoked, a generic function executes a subset of its
- methods. The subset chosen for execution depends in a specific way on the
- classes or identities of the arguments to which it is applied.
-
- [change_end]
-
- These categories are not always mutually exclusive. The required relationships
- among the various data types are explained in more detail in section 2.15.
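-
- A small sketch of such overlap, using only typep. The helper member-of-all-p
- is a hypothetical name chosen for this illustration.
-

```lisp
;; member-of-all-p is a hypothetical helper, not part of Common Lisp.
(defun member-of-all-p (object type-specifiers)
  "Return true if OBJECT belongs to every type in TYPE-SPECIFIERS."
  (every (lambda (type) (typep object type)) type-specifiers))

(member-of-all-p nil '(symbol list))                 ; => true: nil is both
(member-of-all-p "abc" '(string vector array))       ; => true
(member-of-all-p #*0110 '(bit-vector vector array))  ; => true
(member-of-all-p (cons 1 2) '(cons list))            ; => true
(member-of-all-p 'foo '(symbol list))                ; => false: foo is no list
```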
-
- -------------------------------------------------------------------------------
-
- * Numbers
- o Integers
- o Ratios
- o Floating-Point Numbers
- o Complex Numbers
- * Characters
- o Standard Characters
- o Line Divisions
- o Non-standard Characters
- o Character Attributes
- o String Characters
- * Symbols
- * Lists and Conses
- * Arrays
- o Vectors
- o Strings
- o Bit-Vectors
- * Hash Tables
- * Readtables
- * Packages
- * Pathnames
- * Streams
- * Random-States
- * Structures
- * Functions
- * Unreadable Data Objects
- * Overlap, Inclusion, and Disjointness of Types
-
-
- 2.1. Numbers
-
- Several kinds of numbers are defined in Common Lisp. They are divided into
- integers; ratios; floating-point numbers, with names provided for up to four
- different floating-point representations; and complex numbers.
-
- [change_begin]
- X3J13 voted in March 1989 (REAL-NUMBER-TYPE) to add the type real.
-
- The number data type encompasses all kinds of numbers. For convenience, there
- are names for some subclasses of numbers as well. Integers and ratios are of
- type rational. Rational numbers and floating-point numbers are of type real.
- Real numbers and complex numbers are of type number.
-
- Although the names of these types were chosen with the terminology of
- mathematics in mind, the correspondences are not always exact. Integers and
- ratios model the corresponding mathematical concepts directly. Numbers of type
- float may be used to approximate real numbers, both rational and irrational.
- The real type includes all Common Lisp numbers that represent mathematical real
- numbers, though there are mathematical real numbers (irrational numbers) that
- do not have an exact Common Lisp representation. Only real numbers may be
- ordered using the <, >, <=, and >= functions.
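-
- The relationships among these numeric types can be checked with typep, as in
- the sketch below. The helper orderable-p is a hypothetical name; it simply
- tests for the type real, since only reals may be given to <, >, <=, and >=.
-

```lisp
;; orderable-p is a hypothetical helper, not part of Common Lisp.
(defun orderable-p (object)
  "Return true if OBJECT may be used as an argument to <, >, <=, or >=."
  (typep object 'real))

(typep 3 'rational)        ; => true  (an integer)
(typep 2/3 'rational)      ; => true  (a ratio)
(typep 1.5 'real)          ; => true  (a floating-point number)
(typep #C(0 1) 'real)      ; => false (a complex number)
(typep #C(0 1) 'number)    ; => true
(orderable-p 2/3)          ; => true
(orderable-p #C(0 1))      ; => false
(< 1 3/2 2.0)              ; => true; reals of different kinds may be mixed
```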
-
- -------------------------------------------------------------------------------
- Compatibility note: The Fortran 77 standard defines the term real datum to mean
- ``a processor approximation to the value of a real number.'' In practice the
- Fortran basic real type is the floating-point data type that Common Lisp calls
- single-float. The Fortran double precision type is Common Lisp's double-float.
- The Pascal real data type is an ``implementation-defined subset of the real
- numbers.'' In practice this is usually a floating-point type, often what Common
- Lisp calls double-float.
-
- A translation of an algorithm written in Fortran or Pascal that uses real data
- usually will use some appropriate precision of Common Lisp's float type. Some
- algorithms may gain accuracy or flexibility by using Common Lisp's rational or
- real type instead.
- -------------------------------------------------------------------------------
-
- [change_end]
-
-
- 2.1.1. Integers
-
- The integer data type is intended to represent mathematical integers. Unlike
- most programming languages, Common Lisp in principle imposes no limit on the
- magnitude of an integer; storage is automatically allocated as necessary to
- represent large integers.
-
- In every Common Lisp implementation there is a range of integers that are
- represented more efficiently than others; each such integer is called a fixnum,
- and an integer that is not a fixnum is called a bignum. Common Lisp is designed
- to hide this distinction as much as possible; the distinction between fixnums
- and bignums is visible to the user in only a few places where the efficiency of
- representation is important. Exactly which integers are fixnums is
- implementation-dependent; typically they will be those integers in the range -2^n
- to 2^n - 1, inclusive, for some n not less than 15. See most-positive-fixnum and
- most-negative-fixnum.
-
- [change_begin]
- X3J13 voted in January 1989 (FIXNUM-NON-PORTABLE) to specify that fixnum must
- be a supertype of the type (signed-byte 16), and additionally that the value of
- array-dimension-limit must be a fixnum (implying that the implementor should
- choose the range of fixnums to be large enough to accommodate the largest size
- of array to be supported).
-
- -------------------------------------------------------------------------------
- Rationale: This specification allows programmers to declare variables in
- portable code to be of type fixnum for efficiency. Fixnums are guaranteed to
- encompass at least the set of 16-bit signed integers (compare this to the data
- type short int in the C programming language). In addition, any valid array
- index must be a fixnum, and therefore variables used to hold array indices
- (such as a dotimes variable) may be declared fixnum in portable code.
- -------------------------------------------------------------------------------
-
- [change_end]
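-
- The fixnum guarantees described above can be probed directly. This is a
- minimal sketch that assumes an implementation conforming to the X3J13 vote;
- the function sum-below is a hypothetical example of declaring a dotimes
- variable to be a fixnum.
-

```lisp
(typep most-positive-fixnum 'fixnum)         ; => true
(typep (1+ most-positive-fixnum) 'bignum)    ; => true
(typep (1- most-negative-fixnum) 'bignum)    ; => true

;; Every 16-bit signed integer is a fixnum.
(typep 32767 'fixnum)                        ; => true
(typep -32768 'fixnum)                       ; => true

;; The array dimension limit is itself a fixnum.
(typep array-dimension-limit 'fixnum)        ; => true

;; sum-below is a hypothetical example function.
(defun sum-below (n)
  "Sum the non-negative integers below N, declaring the index a fixnum."
  (let ((total 0))
    (dotimes (i n total)
      (declare (fixnum i))
      (incf total i))))

(sum-below 10)                               ; => 45
```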
-
- Integers are ordinarily written in decimal notation, as a sequence of decimal
- digits, optionally preceded by a sign and optionally followed by a decimal
- point. For example:
-
- 0 ;Zero
- -0 ;This always means the same as 0
- +6 ;The first perfect number
- 28 ;The second perfect number
- 1024. ;Two to the tenth power
- -1 ;Minus one
- 15511210043330985984000000. ;25 factorial (25!), probably a bignum
-
- -------------------------------------------------------------------------------
- Compatibility note: MacLisp and Lisp Machine Lisp normally assume that integers
- are written in octal (radix-8) notation unless a decimal point is present.
- Interlisp assumes integers are written in decimal notation and uses a trailing
- Q to indicate octal radix; however, a decimal point, even in trailing position,
- always indicates a floating-point number. This is of course consistent with
- Fortran. Ada does not permit trailing decimal points but instead requires them
- to be embedded. In Common Lisp, integers written as described above are always
- construed to be in decimal notation, whether or not the decimal point is
- present; allowing the decimal point to be present permits compatibility with
- MacLisp.
- -------------------------------------------------------------------------------
-
- Integers may be notated in radices other than ten. The notation
-
- #nnrddddd or #nnRddddd
-
- means the integer in radix-nn notation denoted by the digits ddddd. More
- precisely, one may write #, a non-empty sequence of decimal digits representing
- an unsigned decimal integer n, r (or R), an optional sign, and a sequence of
- radix-n digits, to indicate an integer written in radix n (which must be
- between 2 and 36, inclusive). Only legal digits for the specified radix may be
- used; for example, an octal number may contain only the digits 0 through 7. For
- digits above 9, letters of the alphabet of either case may be used in order.
- Binary, octal, and hexadecimal radices are useful enough to warrant the special
- abbreviations #b for #2r, #o for #8r, and #x for #16r. For example:
-
- #2r11010101 ;Another way of writing 213 decimal
- #b11010101 ;Ditto
- #b+11010101 ;Ditto
- #o325 ;Ditto, in octal radix
- #xD5 ;Ditto, in hexadecimal radix
- #16r+D5 ;Ditto
- #o-300 ;Decimal -192, written in base 8
- #3r-21010 ;Same thing in base 3
- #25R-7H ;Same thing in base 25
- #xACCEDED ;181202413, in hexadecimal radix
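-
- The equivalences claimed in these examples can be confirmed at the reader
- level, and the same radices are available at run time through parse-integer
- and write-to-string. The helper round-trip is a hypothetical name used only
- for this sketch.
-

```lisp
(= #2r11010101 #b11010101 #b+11010101 #o325 #xD5 #16r+D5 213)  ; => true
(= #o-300 #3r-21010 #25R-7H -192)                              ; => true
(= #xACCEDED 181202413)                                        ; => true

;; parse-integer reads digits in a given radix from a string.
(parse-integer "D5" :radix 16)       ; => 213 as its first value
(parse-integer "-300" :radix 8)      ; => -192 as its first value

;; round-trip is a hypothetical helper: print N in RADIX, then read it back.
(defun round-trip (n radix)
  (parse-integer (write-to-string n :base radix :radix nil) :radix radix))

(round-trip 181202413 16)            ; => 181202413
```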
-
-
- 2.1.2. Ratios
-
- A ratio is a number representing the mathematical ratio of two integers.
- Integers and ratios collectively constitute the type rational. The canonical
- representation of a rational number is as an integer if its value is integral,
- and otherwise as the ratio of two integers, the numerator and denominator,
- whose greatest common divisor is 1, and of which the denominator is positive
- (and in fact greater than 1, or else the value would be integral). A ratio is
- notated with / as a separator, thus: 3/5. It is possible to notate ratios in
- non-canonical (unreduced) forms, such as 4/6, but the Lisp function prin1
- always prints the canonical form for a ratio.
-
- If any computation produces a result that is a ratio of two integers such that
- the denominator evenly divides the numerator, then the result is immediately
- converted to the equivalent integer. This is called the rule of rational
- canonicalization.
-
- Rational numbers may be written as the possibly signed quotient of decimal
- numerals: an optional sign followed by two non-empty sequences of digits
- separated by a /. This syntax may be described as follows:
-
- ratio ::= [sign] {digit}+ / {digit}+
-
- The second sequence may not consist entirely of zeros. For example:
-
- 2/3 ;This is in canonical form
- 4/6 ;A non-canonical form for the same number
- -17/23 ;A not very interesting ratio
- -30517578125/32768 ;This is (-5/2)^15
- 10/5 ;The canonical form for this is 2
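-
- The rule of rational canonicalization can be observed with ordinary
- arithmetic, as in this short sketch:
-

```lisp
(/ 4 6)                    ; => 2/3, reduced to lowest terms
(/ 10 5)                   ; => 2, an integer rather than a ratio
(integerp (/ 10 5))        ; => true
(typep (/ 4 6) 'ratio)     ; => true
(numerator 4/6)            ; => 2
(denominator 4/6)          ; => 3
(denominator -17/23)       ; => 23; the sign is carried by the numerator
(+ 1/3 2/3)                ; => 1
(expt -5/2 15)             ; => -30517578125/32768
```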
-
- To notate rational numbers in radices other than ten, one uses the same radix
- specifiers (one of #nnR, #O, #B, or #X) as for integers. For example:
-
- #o-101/75 ;Octal notation for -65/61
- #3r120/21 ;Ternary notation for 15/7
- #Xbc/ad ;Hexadecimal notation for 188/173
- #xFADED/FACADE ;Hexadecimal notation for 1027565/16435934
-
-
- 2.1.3. Floating-Point Numbers
-
- Common Lisp allows an implementation to provide one or more kinds of
- floating-point number, which collectively make up the type float. Now a
- floating-point number is a (mathematical) rational number of the form
- s * f * b^(e-p), where s is +1 or -1, the sign; b is an integer greater than 1,
- the base or radix of the representation; p is a positive integer, the precision
- (in base-b digits) of the floating-point number; f is a positive integer
- between b^(p-1) and b^p - 1 (inclusive), the significand; and e is an integer,
- the exponent. The value of p and the range of e depend on the implementation
- and on the type of floating-point number within that implementation. In
- addition, there is a
- floating-point zero; depending on the implementation, there may also be a
- ``minus zero.'' If there is no minus zero, then 0.0 and -0.0 are both
- interpreted as simply a floating-point zero.
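-
- The decomposition into sign, significand, and exponent is available through
- integer-decode-float, with float-radix giving b and float-digits giving p.
- The sketch below rebuilds the exact rational value of a float from those
- parts. The name float-from-parts is hypothetical. Note that the exponent
- returned by integer-decode-float is already scaled for an integer significand,
- so it plays the role of e - p in the description above.
-

```lisp
;; float-from-parts is a hypothetical helper, not part of Common Lisp.
(defun float-from-parts (x)
  "Rebuild the exact rational value of the float X from its decoded parts."
  (multiple-value-bind (significand exponent sign) (integer-decode-float x)
    (* sign significand (expt (float-radix x) exponent))))

(= (float-from-parts 1.5) (rational 1.5))          ; => true
(= (float-from-parts -0.1d0) (rational -0.1d0))    ; => true
(float-from-parts 0.0)                             ; => 0
```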
-
- -------------------------------------------------------------------------------
- Implementation note: The form of the above description should not be construed
- to require the internal representation to be in sign-magnitude form.
- Two's-complement and other representations are also acceptable. Note that the
- radix of the internal representation may be other than 2, as on the IBM 360 and
- 370, which use radix 16; see float-radix.
- -------------------------------------------------------------------------------
-
- Floating-point numbers may be provided in a variety of precisions and sizes,
- depending on the implementation. High-quality floating-point software tends to
- depend critically on the precise nature of the floating-point arithmetic and so
- may not always be completely portable. As an aid in writing programs that are
- moderately portable, however, certain definitions are made here:
-
- * A short floating-point number (type short-float) is of the representation
- of smallest fixed precision provided by an implementation.
-
- * A long floating-point number (type long-float) is of the representation
- of the largest fixed precision provided by an implementation.
-
- * Intermediate between short and long formats are two others, arbitrarily
- called single and double (types single-float and double-float).
-
- The precise definition of these categories is implementation-dependent.
- However, the rough intent is that short floating-point numbers be precise to at
- least four decimal places (but also have a space-efficient representation);
- single floating-point numbers, to at least seven decimal places; and double
- floating-point numbers, to at least fourteen decimal places. It is suggested
- that the precision (measured in bits, computed as p log_2 b) and the exponent size
- (also measured in bits, computed as the base-2 logarithm of 1 plus the maximum
- exponent value) be at least as great as the values in table 2-1.
-
-
-
- Floating-point numbers are written in either decimal fraction or computerized
- scientific notation: an optional sign, then a non-empty sequence of digits with
- an embedded decimal point, then an optional decimal exponent specification. If
- there is no exponent specifier, then the decimal point is required, and there
- must be digits after it. The exponent specifier consists of an exponent marker,
- an optional sign, and a non-empty sequence of digits. For preciseness, here is
- a modified-BNF description of floating-point notation.
-
- floating-point-number ::= [sign] {digit}* decimal-point {digit}* [exponent]
- | [sign] {digit}+ [decimal-point {digit}*] exponent
- sign ::= + | -
- decimal-point ::= .
- digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
- exponent ::= exponent-marker [sign] {digit}+
- exponent-marker ::= e | s | f | d | l | E | S | F | D | L
-
- If no exponent specifier is present, or if the exponent marker e (or E) is
- used, then the precise format to be used is not specified. When such a
- representation is read and converted to an internal floating-point data object,
- the format specified by the variable *read-default-float-format* is used; the
- initial value of this variable is single-float.
-
- The letters s, f, d, and l (or their respective uppercase equivalents)
- explicitly specify the use of short, single, double, and long format,
- respectively.
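-
- The effect of the exponent marker and of *read-default-float-format* can be
- sketched with read-from-string. The helper reads-as-p is a hypothetical name;
- it binds the default format, reads the text, and tests the type of the result.
-

```lisp
;; reads-as-p is a hypothetical helper, not part of Common Lisp.
(defun reads-as-p (text type &optional (default-format 'single-float))
  "Return true if reading TEXT under DEFAULT-FORMAT yields an object of TYPE."
  (let ((*read-default-float-format* default-format))
    (typep (read-from-string text) type)))

(reads-as-p "1.0d0" 'double-float)                 ; => true, marker d
(reads-as-p "1.0s0" 'short-float)                  ; => true, marker s
(reads-as-p "1.0" 'single-float)                   ; => true, default format
(reads-as-p "1.0e0" 'double-float 'double-float)   ; => true, default rebound
(reads-as-p "0." 'integer)                         ; => true, not a float
```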
-
- Examples of floating-point numbers:
-
- 0.0 ;Floating-point zero in default format
- 0E0 ;Also floating-point zero in default format
- -.0 ;This may be a zero or a minus zero,
- ; depending on the implementation
- 0. ;The integer zero, not a floating-point zero!
- 0.0s0 ;A floating-point zero in short format
- 0s0 ;Also a floating-point zero in short format
- 3.1415926535897932384d0 ;A double-format approximation to pi
- 6.02E+23 ;Avogadro's number, in default format
- 602E+21 ;Also Avogadro's number, in default format
- 3.010299957f-1 ;log_10 2, in single format
- -0.000000001s9 ;-1.0 in short format, the hard way
-
- [change_begin]
- Notice of correction. The first edition unfortunately listed an incorrect value
- (3.1010299957f-1) for the base-10 logarithm of 2.
- [change_end]
-
- The internal format used for an external representation depends only on the
- exponent marker and not on the number of decimal digits in the external
- representation.
-
- While Common Lisp provides terminology and notation sufficient to accommodate
- four distinct floating-point formats, not all implementations will have the
- means to support that many distinct formats. An implementation is therefore
- permitted to provide fewer than four distinct internal floating-point formats,
- in which case at least one of them will be ``shared'' by more than one of the
- external format names short, single, double, and long according to the
- following rules:
-
- * If one internal format is provided, then it is considered to be single,
- but serves also as short, double, and long. The data types short-float,
- single-float, double-float, and long-float are considered to be identical.
- An expression such as (eql 1.0s0 1.0d0) will be true in such an
- implementation because the two numbers 1.0s0 and 1.0d0 will be converted
- into the same internal format and therefore be considered to have the same
- data type, despite the differing external syntax. Similarly, (typep 1.0L0
- 'short-float) will be true in such an implementation. For output purposes
- all floating-point numbers are assumed to be of single format and thus
- will print using the exponent letter E or F.
-
- * If two internal formats are provided, then either of two correspondences
- may be used, depending on which is the more appropriate:
-
- o One format is short; the other is single and serves also as double
- and long. The data types single-float, double-float, and long-float
- are considered to be identical, but short-float is distinct. An
- expression such as (eql 1.0s0 1.0d0) will be false, but (eql 1.0f0
- 1.0d0) will be true. Similarly, (typep 1.0L0 'short-float) will be
- false, but (typep 1.0L0 'single-float) will be true. For output
- purposes all floating-point numbers are assumed to be of short or
- single format.
-
- o One format is single and serves also as short; the other is double
- and serves also as long. The data types short-float and single-float
- are considered to be identical, and the data types double-float and
- long-float are considered to be identical. An expression such as (eql
- 1.0s0 1.0d0) will be false, as will (eql 1.0f0 1.0d0); but (eql 1.0d0
- 1.0L0) will be true. Similarly, (typep 1.0L0 'short-float) will be
- false, but (typep 1.0L0 'double-float) will be true. For output
- purposes all floating-point numbers are assumed to be of single or
- double format.
-
- * If three internal formats are provided, then either of two
- correspondences may be used, depending on which is the more appropriate:
-
- o One format is short; another format is single; and the third format
- is double and serves also as long. Similar constraints apply.
-
- o One format is single and serves also as short; another is double;
- and the third format is long.
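-
- Which of these correspondences a given implementation uses can be discovered
- with eql, following the tests quoted in the rules above. The function
- shared-float-formats is a hypothetical helper; its result differs from one
- implementation to another, so no particular output is shown.
-

```lisp
;; shared-float-formats is a hypothetical helper, not part of Common Lisp.
(defun shared-float-formats ()
  "Report which adjacent float format names share an internal format here."
  (list :short=single  (eql 1.0s0 1.0f0)
        :single=double (eql 1.0f0 1.0d0)
        :double=long   (eql 1.0d0 1.0l0)))

;; The result is a property list of three booleans.
(shared-float-formats)
```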
-
- -------------------------------------------------------------------------------
- Implementation note: It is recommended that an implementation provide as many
- distinct floating-point formats as feasible, using table 2-1 as a guideline.
- Ideally, short-format floating-point numbers should have an ``immediate''
- representation that does not require heap allocation; single-format
- floating-point numbers should approximate IEEE proposed standard single-format
- floating-point numbers; and double-format floating-point numbers should
- approximate IEEE proposed standard double-format floating-point numbers
- [23,17,16].
- -------------------------------------------------------------------------------
-
-
- 2.1.4. Complex Numbers
-
- Complex numbers (type complex) are represented in Cartesian form, with a real
- part and an imaginary part, each of which is a non-complex number (integer,
- ratio, or floating-point number). It should be emphasized that the parts of a
- complex number are not necessarily floating-point numbers; in this, Common Lisp
- is like PL/I and differs from Fortran. However, both parts must be of the same
- type: either both are rational, or both are of the same floating-point format.
-
- Complex numbers may be notated by writing the characters #C followed by a list
- of the real and imaginary parts. If the two parts as notated are not of the
- same type, then they are converted according to the rules of floating-point
- contagion as described in chapter 12. (Indeed, #C(a b) is equivalent to
- #,(complex a b); see the description of the function complex.) For example:
-
- #C(3.0s1 2.0s-1) ;Real and imaginary parts are short format
- #C(5 -3) ;A Gaussian integer
- #C(5/3 7.0) ;Will be converted internally to #C(1.66666 7.0)
- #C(0 1) ;The imaginary unit, that is, i
-
- The type of a specific complex number is indicated by a list of the word
- complex and the type of the components; for example, a specialized
- representation for complex numbers with short floating-point parts would be of
- type (complex short-float). The type complex encompasses all complex
- representations.
-
- A complex number of type (complex rational), that is, one whose components are
- rational, can never have a zero imaginary part. If the result of a computation
- would be a complex rational with a zero imaginary part, the result is
- immediately converted to a non-complex rational number by taking the real part.
- This is called the rule of complex canonicalization. This rule does not apply
- to floating-point complex numbers; #C(5.0 0.0) and 5.0 are different.
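-
- A short sketch of complex canonicalization and of contagion between the parts:
-

```lisp
(complex 5 0)                                 ; => 5, by canonicalization
(* #C(0 1) #C(0 1))                           ; => -1
(complexp (complex 5.0 0.0))                  ; => true, no canonicalization
(eql #C(5.0 0.0) 5.0)                         ; => false, different objects
(= #C(5.0 0.0) 5.0)                           ; => true, numerically equal
(typep (realpart (complex 5/3 7.0)) 'float)   ; => true, 5/3 was converted
(typep #C(5 -3) '(complex rational))          ; => true
```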
-
-
- 2.2. Characters
-
- Characters are represented as data objects of type character.
-
- [old_change_begin]
- There are two subtypes of interest, called standard-char and string-char.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type
- string-char.
- [change_end]
-
- A character object can be notated by writing #\ followed by the character
- itself. For example, #\g means the character object for a lowercase g. This
- works well enough for printing characters. Non-printing characters have names,
- and can be notated by writing #\ and then the name; for example, #\Space (or
- #\SPACE or #\space or #\sPaCE) means the space character. The syntax for
- character names after #\ is the same as that for symbols. However, only
- character names that are known to the particular implementation may be used.
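-
- The notation can be exercised as follows. The functions name-char and
- char-name convert between character names and character objects:
-

```lisp
(char= #\Space #\SPACE #\space #\sPaCE)   ; => true, names ignore case
(char= #\g (char "g" 0))                  ; => true
(char= #\g #\G)                           ; => false, a single character
                                          ;    after #\ is taken literally
(name-char "sPaCE")                       ; => the space character
(char-name #\Space)                       ; => "Space"
```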
-
-
- 2.2.1. Standard Characters
-
- Common Lisp defines a standard character set (subtype standard-char) for two
- purposes. Common Lisp programs that are written in the standard character set
- can be read by any Common Lisp implementation; and Common Lisp programs that
- use only standard characters as data objects are most likely to be portable.
- The Common Lisp character set consists of a space character #\Space, a newline
- character #\Newline, and the following ninety-four non-blank printing
- characters or their equivalents:
-
- ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
- @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _
- ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
-
- The Common Lisp standard character set is apparently equivalent to the
- ninety-five standard ASCII printing characters plus a newline character.
- Nevertheless, Common Lisp is designed to be relatively independent of the ASCII
- character encoding. For example, the collating sequence is not specified except
- to say that digits must be properly ordered, the uppercase letters must be
- properly ordered, and the lowercase letters must be properly ordered (see char<
- for a precise specification). Other character encodings, particularly EBCDIC,
- should be easily accommodated (with a suitable mapping of printing characters).
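-
- The membership and ordering guarantees above can be checked with
- standard-char-p and char<. In this sketch the variable
- *printing-standard-chars* and the function properly-ordered-p are
- hypothetical names introduced for the illustration.
-

```lisp
;; The ninety-four non-blank printing characters, as a string.
(defparameter *printing-standard-chars*
  "!\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~")

(length *printing-standard-chars*)                    ; => 94
(every #'standard-char-p *printing-standard-chars*)   ; => true
(standard-char-p #\Space)                             ; => true
(standard-char-p #\Newline)                           ; => true

;; properly-ordered-p is a hypothetical helper.
(defun properly-ordered-p (chars)
  "Return true if the characters of the string CHARS are in increasing order."
  (apply #'char< (coerce chars 'list)))

;; Only these three orderings are guaranteed; how digits, uppercase
;; letters, and lowercase letters interleave is not specified here.
(properly-ordered-p "0123456789")                     ; => true
(properly-ordered-p "ABCDEFGHIJKLMNOPQRSTUVWXYZ")     ; => true
(properly-ordered-p "abcdefghijklmnopqrstuvwxyz")     ; => true
```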
-
- Of the ninety-four non-blank printing characters, the following are used in
- only limited ways in the syntax of Common Lisp programs:
-
- [ ] { } ? ! ^ _ ~ $ %
-
- [old_change_begin]
- All of these characters except ! and _ are used within format strings as
- formatting directives. Except for this, [, ], {, }, ?, and ! are not used in
- Common Lisp and are reserved to the user for syntactic extensions; ^ and _ are
- not yet used in Common Lisp but are part of the syntax of reserved tokens and
- are reserved to implementors; ~ is not yet used in Common Lisp and is reserved
- to implementors; and $ and % are normally regarded as alphabetic characters but
- are not used in the names of any standard Common Lisp functions, variables, or
- other entities.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in June 1989 (PRETTY-PRINT-INTERFACE) to add a format directive
- ~_ (see chapter 27).
- [change_end]
-
- The following characters are called semi-standard:
-
- #\Backspace #\Tab #\Linefeed #\Page #\Return #\Rubout
-
- Not all implementations of Common Lisp need to support them; but those
- implementations that use the standard ASCII character set should support them,
- treating them as corresponding respectively to the ASCII characters BS (octal
- code 010), HT (011), LF (012), FF (014), CR (015), and DEL (177). These
- characters are not members of the subtype standard-char unless synonymous with
- one of the standard characters specified above. For example, in a given
- implementation it might be sensible for the implementor to define #\Linefeed or
- #\Return to be synonymous with #\Newline, or #\Tab to be synonymous with
- #\Space.
-
-
- 2.2.2. Line Divisions
-
- The treatment of line divisions is one of the most difficult issues in
- designing portable software, simply because there is so little agreement among
- operating systems. Some use a single character to delimit lines; the
- recommended ASCII character for this purpose is the line feed character LF
- (also called the new line character, NL), but some systems use the carriage
- return character CR. Much more common is the two-character sequence CR followed
- by LF. Frequently line divisions have no representation as a character but are
- implicit in the structuring of a file into records, each record containing a
- line of text. A deck of punched cards has this structure, for example.
-
- Common Lisp provides an abstract interface by requiring that there be a single
- character, #\Newline, that within the language serves as a line delimiter. (The
- language C has a similar requirement.) An implementation of Common Lisp must
- translate between this internal single-character representation and whatever
- external representation(s) may be used.
-
- -------------------------------------------------------------------------------
- Implementation note: How the character called #\Newline is represented
- internally is not specified here, but it is strongly suggested that the ASCII
- LF character be used in Common Lisp implementations that use the ASCII
- character encoding. The ASCII CR character is a workable, but in most cases
- inferior, alternative.
- -------------------------------------------------------------------------------
-
- [change_begin]
- When the first edition was written it was not yet clear that UNIX would become
- so widely accepted. The decision to represent the line delimiter as a single
- character has proved to be a good one.
- [change_end]
-
- The requirement that a line division be represented as a single character has
- certain consequences. A character string written in the middle of a program in
- such a way as to span more than one line must contain exactly one character to
- represent each line division. Consider this code fragment:
-
- (setq a-string "This string
- contains
- forty-two characters.")
-
- Between g and c there must be exactly one character, #\Newline; a two-character
- sequence, such as #\Return and then #\Newline, is not acceptable, nor is the
- absence of a character. The same is true between s and f.
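-
- The count can be checked with a small sketch. It builds the same string with
- the format directive ~%, which produces #\Newline, so the result does not
- depend on how the source file itself is stored. The variable name *a-string*
- is illustrative.
-
```lisp
;; Build the three-line string with exactly one #\Newline per line division.
(defparameter *a-string*
  (format nil "This string~%contains~%forty-two characters."))

(length *a-string*)            ; 42
(count #\Newline *a-string*)   ; 2
(char *a-string* 11)           ; #\Newline, the character between g and c
```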
-
- When the character #\Newline is written to an output file, the Common Lisp
- implementation must take the appropriate action to produce a line division.
- This might involve writing out a record or translating #\Newline to a CR/LF
- sequence.
-
- -------------------------------------------------------------------------------
- Implementation note: If an implementation uses the ASCII character encoding,
- uses the CR/LF sequence externally to delimit lines, uses LF to represent
- #\Newline internally, and supports #\Return as a data object corresponding to
- the ASCII character CR, the question arises as to what action to take when the
- program writes out #\Return followed by #\Newline. It should first be noted
- that #\Return is not a standard Common Lisp character, and the action to be
- taken when #\Return is written out is therefore not defined by the Common Lisp
- language. A plausible approach is to buffer the #\Return character and suppress
- it if and only if the next character is #\Newline (the net effect is to
- generate a CR/LF sequence). Another plausible approach is simply to ignore the
- difficulty and declare that writing #\Return and then #\Newline results in the
- sequence CR/CR/LF in the output.
- -------------------------------------------------------------------------------
-
-
- 2.2.3. Non-standard Characters
-
- Any implementation may provide additional characters, whether printing
- characters or named characters. Some plausible examples:
-
- #\π #\α #\Break #\Home-Up #\Escape
-
- The use of such characters may render Common Lisp programs non-portable.
-
-
- [old_change_begin]
-
- 2.2.4. Character Attributes
-
- Every object of type character has three attributes: code, bits, and font. The
- code attribute is intended to distinguish among the printed glyphs and
- formatting functions for characters; it is a numerical encoding of the
- character proper. The bits attribute allows extra flags to be associated with a
- character. The font attribute permits a specification of the style of the
- glyphs (such as italics). Each of these attributes may be understood to be a
- non-negative integer.
-
- The font attribute may be notated in unsigned decimal notation between the #
- and the \. For example, #3\a means the letter a in font 3. This might mean the
- same thing as #\α if font 3 were used to represent Greek letters. Note that
- not all Common Lisp implementations provide for non-zero font attributes; see
- char-font-limit.
-
- The bits attribute may be notated by preceding the name of the character by the
- names or initials of the bits, separated by hyphens. The character itself may
- be written instead of the name, preceded if necessary by \. For example:
-
- #\Control-Meta-Return #\Meta-Control-Q
- #\Hyper-Space #\Meta-\a
- #\Control-A #\Meta-Hyper-\:
- #\C-M-Return #\Hyper-\π
-
- Note that not all Common Lisp implementations provide for non-zero bits
- attributes; see char-bits-limit.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to replace the notion of bits
- and font attributes with that of implementation-defined attributes.
- [change_end]
-
-
- [old_change_begin]
-
- 2.2.5. String Characters
-
- Any character whose bits and font attributes are zero may be contained in
- strings. All such characters together constitute a subtype of the characters;
- this subtype is called string-char.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type
- string-char. Two new subtypes of character are base-character, defined to be
- equivalent to the result of the function call
-
- (upgraded-array-element-type 'standard-char)
-
- and extended-character, defined to be equivalent to the type specifier
-
- (and character (not base-character))
-
- An implementation may support additional subtypes of character that may or may
- not be supertypes of base-character. In addition, an implementation may define
- base-character to be equivalent to character. The choice of any base characters
- that are not standard characters is implementation-defined. Only base
- characters can be elements of a base string. No upper bound is specified for
- the number of distinct characters of type base-character (that is
- implementation-dependent), but the lower bound is 96, the number of standard
- Common Lisp characters.
- [change_end]
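-
- As a rough check of the figure 96, the following sketch counts the standard
- characters by scanning every character code. The function name
- count-standard-characters is hypothetical; code-char, char-code-limit, and
- standard-char-p are standard.
-
```lisp
;; Hypothetical helper: scan all character codes and count the standard ones.
(defun count-standard-characters ()
  "Count the characters with codes below CHAR-CODE-LIMIT that are standard."
  (loop for code from 0 below char-code-limit
        for char = (code-char code)       ; may be NIL for unused codes
        count (and char (standard-char-p char))))

(count-standard-characters)   ; 96: 94 printing characters, #\Space, #\Newline
```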
-
-
- 2.3. Symbols
-
- Symbols are Lisp data objects that serve several purposes and have several
- interesting characteristics. Every object of type symbol has a name, called its
- print name. Given a symbol, one can obtain its name in the form of a string.
- Conversely, given the name of a symbol as a string, one can obtain the symbol
- itself. (More precisely, symbols are organized into packages, and all the
- symbols in a package are uniquely identified by name. See chapter 11.)
-
- Symbols have a component called the property list, or plist. By convention this
- is always a list whose even-numbered components (calling the first component
- zero) are symbols, here functioning as property names, and whose odd-numbered
- components are associated property values. Functions are provided for
- manipulating this property list; in effect, these allow a symbol to be treated
- as an extensible record structure.
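-
- A minimal sketch of the property list in use follows. The symbol and the
- property names are made up for illustration; get, remprop, and symbol-plist
- are the standard functions.
-
```lisp
;; A fresh, uninterned symbol so that no existing property list is disturbed.
(defparameter *widget* (make-symbol "WIDGET"))

(setf (get *widget* 'color) 'red)    ; add property COLOR with value RED
(setf (get *widget* 'size) 3)        ; add property SIZE with value 3

(get *widget* 'color)                ; RED
(get *widget* 'weight)               ; NIL, no such property
(get *widget* 'weight 'unknown)      ; UNKNOWN, the supplied default
(symbol-plist *widget*)              ; alternating names and values; the
                                     ; order is implementation-dependent
(remprop *widget* 'size)             ; remove the SIZE property
```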
-
- Symbols are also used to represent certain kinds of variables in Lisp programs,
- and there are functions for dealing with the values associated with symbols in
- this role.
-
- A symbol can be notated simply by writing its name. If its name is not empty,
- and if the name consists only of uppercase alphabetic, numeric, or certain
- pseudo-alphabetic special characters (but not delimiter characters such as
- parentheses or space), and if the name of the symbol cannot be mistaken for a
- number, then the symbol can be notated by the sequence of characters in its
- name. Any uppercase letters that appear in the (internal) name may be written
- in either case in the external notation (more on this below). For example:
-
- FROBBOZ ;The symbol whose name is FROBBOZ
- frobboz ;Another way to notate the same symbol
- fRObBoz ;Yet another way to notate it
- unwind-protect ;A symbol with a - in its name
- +$ ;The symbol named +$
- 1+ ;The symbol named 1+
- +1 ;This is the integer 1, not a symbol
- pascal_style ;This symbol has an underscore in its name
- b^2-4*a*c ;This is a single symbol!
- ; It has several special characters in its name
- file.rel.43 ;This symbol has periods in its name
- /usr/games/zork ;This symbol has slashes in its name
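-
- Several of these cases can be confirmed directly. The sketch assumes the
- default readtable, in which the reader converts lowercase letters to
- uppercase.
-
```lisp
(symbol-name 'frobboz)                 ; "FROBBOZ"
(eq 'frobboz 'fRObBoz)                 ; T, both notate the same symbol
(symbolp (read-from-string "1+"))      ; T, 1+ is a symbol
(integerp (read-from-string "+1"))     ; T, +1 is the integer 1
(symbol-name 'b^2-4*a*c)               ; "B^2-4*A*C", a single symbol
(eq (intern "FROBBOZ") 'frobboz)       ; T, from name to symbol in one package
```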
-
- In addition to letters and numbers, the following characters are normally
- considered to be alphabetic for the purposes of notating symbols:
-
- + - * / @ $ % ^ & _ = < > ~ .
-
- Some of these characters have conventional purposes for naming things; for
- example, symbols that name special variables generally have names beginning and
- ending with *. The last character listed above, the period, is considered
- alphabetic provided that a token does not consist entirely of periods. A single
- period standing by itself is used in the notation of conses and dotted lists; a
- token consisting of two or more periods is syntactically illegal. (The period
- also serves as the decimal point in the notation of numbers.)
-
- The following characters are also alphabetic by default but are explicitly
- reserved to the user for definition as reader macro characters (see section
- 22.1.3) or any other desired purpose and therefore should not be used routinely
- in names of symbols:
-
- ? ! [ ] { }
-
- A symbol may have uppercase letters, lowercase letters, or both in its print
- name. However, the Lisp reader normally converts lowercase letters to the
- corresponding uppercase letters when reading symbols. The net effect is that
- most of the time case makes no difference when notating symbols. Case does make
- a difference internally and when printing a symbol. Internally the symbols that
- name all standard Common Lisp functions, variables, and keywords have uppercase
- names; their names appear in lowercase in this book for readability. Typing
- such names with lowercase letters works because the function read will convert
- lowercase letters to the equivalent uppercase letters.
-
- [change_begin]
- X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to introduce readtable-case,
- which controls whether read will alter the case of letters read as part of the
- name of a symbol.
- [change_end]
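-
- A minimal sketch of readtable-case follows. The function name
- read-preserving-case is hypothetical; it works on a copy of the standard
- readtable so that the global readtable is left untouched.
-
```lisp
;; Hypothetical helper: read with letter case preserved.
(defun read-preserving-case (string)
  "Read one object from STRING without altering the case of letters."
  (let ((*readtable* (copy-readtable nil)))      ; copy of the standard readtable
    (setf (readtable-case *readtable*) :preserve)
    (read-from-string string)))

(symbol-name (read-preserving-case "fRObBoz"))   ; "fRObBoz"
(symbol-name (read-from-string "fRObBoz"))       ; "FROBBOZ" under the default
```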
-
- If a symbol cannot be simply notated by the characters of its name because the
- (internal) name contains special characters or lowercase letters, then there
- are two ``escape'' conventions for notating them. Writing a \ character before
- any character causes the character to be treated itself as an ordinary
- character for use in a symbol name; in particular, it suppresses internal
- conversion of lowercase letters to their uppercase equivalents. If any
- character in a notation is preceded by \, then that notation can never be
- interpreted as a number. For example:
-
- \( ;The symbol whose name is (
- \+1 ;The symbol whose name is +1
- +\1 ;Also the symbol whose name is +1
- \frobboz ;The symbol whose name is fROBBOZ
- 3.14159265\s0 ;The symbol whose name is 3.14159265s0
- 3.14159265\S0 ;A different symbol, whose name is 3.14159265S0
- 3.14159265s0 ;A short-format floating-point approximation to π
- APL\\360 ;The symbol whose name is APL\360
- apl\\360 ;Also the symbol whose name is APL\360
- \(b^2\)\ -\ 4*a*c ;The name is (B^2) - 4*A*C;
- ; it has parentheses and two spaces in it
- \(\b^2\)\ -\ 4*\a*\c ;The name is (b^2) - 4*a*c;
- ; the letters are explicitly lowercase
-
- It may be tedious to insert a \ before every delimiter character in the name of
- a symbol if there are many of them. An alternative convention is to surround
- the name of a symbol with vertical bars; these cause every character between
- them to be taken as part of the symbol's name, as if \ had been written before
- each one, excepting only | itself and \, which must nevertheless be preceded by
- \. For example:
-
- |"| ;The same as writing \"
- |(b^2) - 4*a*c| ;The name is (b^2) - 4*a*c
- |frobboz| ;The name is frobboz, not FROBBOZ
- |APL\360| ;The name is APL360, because the \ quotes the 3
- |APL\\360| ;The name is APL\360
- |apl\\360| ;The name is apl\360
- |\|\|| ;Same as \|\|: the name is ||
- |(B^2) - 4*A*C| ;The name is (B^2) - 4*A*C;
- ; it has parentheses and two spaces in it
- |(b^2) - 4*a*c| ;The name is (b^2) - 4*a*c
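-
- Both escape conventions can be exercised by reading the notation from a
- string and inspecting the resulting name. The helper name-of is hypothetical.
- Note that inside a Lisp string each \ must itself be written as \\, so the
- strings below contain one more level of escaping than the notations above.
-
```lisp
;; Hypothetical helper: read NOTATION and return the name of the symbol.
(defun name-of (notation)
  "Read NOTATION with the standard readtable and return the symbol's name."
  (let ((*readtable* (copy-readtable nil)))
    (symbol-name (read-from-string notation))))

(name-of "\\frobboz")        ; "fROBBOZ", only the f is protected
(name-of "\\+1")             ; "+1", a symbol and not the integer 1
(name-of "APL\\\\360")       ; the seven-character name APL\360
(name-of "|APL\\360|")       ; "APL360", the \ merely quotes the 3
(name-of "|frobboz|")        ; "frobboz"
(name-of "|\\|\\||")         ; "||"
```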
-
-
- 2.4. Lists and Conses
-
- A cons is a record structure containing two components called the car and the
- cdr. Conses are used primarily to represent lists.
-
- A list is recursively defined to be either the empty list or a cons whose cdr
- component is a list. A list is therefore a chain of conses linked by their cdr
- components and terminated by nil, the empty list. The car components of the
- conses are called the elements of the list. For each element of the list there
- is a cons. The empty list has no elements at all.
-
- A list is notated by writing the elements of the list in order, separated by
- blank space (space, tab, or return characters) and surrounded by parentheses.
-
- (a b c) ;A list of three symbols
- (2.0s0 (a 1) #\*) ;A list of three things: a short floating-point
- ; number, another list, and a character object
-
- The empty list nil therefore can be written as (), because it is a list with no
- elements.
-
- A dotted list is one whose last cons has for its cdr not nil but rather some
- other data object (which is also not a cons, or the first-mentioned cons would
- not be the last cons of the list). Such a list is called ``dotted'' because of
- the special notation used for it: the elements of the list are written between
- parentheses as before, but after the last element and before the right
- parenthesis are written a dot (surrounded by blank space) and then the cdr of
- the last cons. As a special case, a single cons is notated by writing the car
- and the cdr between parentheses and separated by a space-surrounded dot. For
- example:
-
- (a . 4) ;A cons whose car is a symbol
- ; and whose cdr is an integer
- (a b c . d) ;A dotted list with three elements whose last cons
- ; has the symbol d in its cdr
-
- -------------------------------------------------------------------------------
- Compatibility note: In MacLisp, the dot in dotted-list notation need not be
- surrounded by white space or other delimiters. The dot is required to be
- delimited in Common Lisp, as in Lisp Machine Lisp.
- -------------------------------------------------------------------------------
-
- It is legitimate to write something like (a b . (c d)); this means the same as
- (a b c d). The standard Lisp output routines will never print a list in the
- first form, however; they will avoid dot notation wherever possible.
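-
- The equivalences described here can be checked with a few forms. This is
- only a sketch; the printed result shown in the last comment assumes the
- default printer settings.
-
```lisp
(equal '(a b . (c d)) '(a b c d))          ; T, two notations for one list
(equal '() nil)                            ; T, the empty list is NIL
(equal (cons 'a 4) '(a . 4))               ; T, a single cons
(cdr (last '(a b c . d)))                  ; D, the cdr of the last cons
(prin1-to-string '(a . (b . (c . nil))))   ; "(A B C)", dots are avoided
```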
-
- Often the term list is used to refer either to true lists or to dotted lists.
- When the distinction is important, the term ``true list'' will be used to refer
- to a list terminated by nil. Most functions advertised to operate on lists
- expect to be given true lists. Throughout this book, unless otherwise
- specified, it is an error to pass a dotted list to a function that is specified
- to require a list as an argument.
-
- -------------------------------------------------------------------------------
- Implementation note: Implementors are encouraged to use the equivalent of the
- predicate endp wherever it is necessary to test for the end of a list. Whenever
- feasible, this test should explicitly signal an error if a list is found to be
- terminated by a non-nil atom. However, such an explicit error signal is not
- required, because some such tests occur in critical loops where efficiency is
- paramount. In such cases, the predicate atom may be used to test for the end of
- the list, quietly treating any non-nil list-terminating atom as if it were nil.
- -------------------------------------------------------------------------------
-
- Sometimes the term tree is used to refer to some cons and all the other conses
- transitively accessible to it through car and cdr links until non-conses are
- reached; these non-conses are called the leaves of the tree.
-
- Lists, dotted lists, and trees are not mutually exclusive data types; they are
- simply useful points of view about structures of conses. There are yet other
- terms, such as association list. None of these are true Lisp data types. Conses
- are a data type, and nil is the sole object of type null. The Lisp data type
- list is taken to mean the union of the cons and null data types, and therefore
- encompasses both true lists and dotted lists.
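-
- The difference between the type list and the notion of a true list can be
- sketched as follows. The predicate true-list-p is hypothetical (it is not a
- standard function) and assumes that the chain of conses is not circular.
-
```lisp
;; Hypothetical predicate: follow cdr links until a non-cons is reached.
(defun true-list-p (object)
  "True if OBJECT is a chain of conses terminated by NIL.
Assumes the chain is not circular."
  (loop
    (cond ((null object) (return t))        ; reached NIL: a true list
          ((atom object) (return nil))      ; reached some other atom: dotted
          (t (setq object (cdr object))))))

(true-list-p '(a b c))        ; T
(true-list-p '(a b c . d))    ; NIL
(typep '(a b c . d) 'list)    ; T, every cons is of type LIST
(typep nil 'list)             ; T
(typep nil 'null)             ; T
(typep 'd 'list)              ; NIL
```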
-
-
- 2.5. Arrays
-
- An array is an object with components arranged according to a Cartesian
- coordinate system. In general, these components may be any Lisp data objects.
-
- The number of dimensions of an array is called its rank (this terminology is
- borrowed from APL); the rank is a non-negative integer. Likewise, each
- dimension is itself a non-negative integer. The total number of elements in the
- array is the product of all the dimensions.
-
- An implementation of Common Lisp may impose a limit on the rank of an array,
- but this limit may not be smaller than 7. Therefore, any Common Lisp program
- may assume the use of arrays of rank 7 or less. (A program may determine the
- actual limit on array ranks for a given implementation by examining the
- constant array-rank-limit.)
-
- It is permissible for a dimension to be zero. In this case, the array has no
- elements, and any attempt to access an element is in error. However, other
- properties of the array, such as the dimensions themselves, may be used. If the
- rank is zero, then there are no dimensions, and the product of the dimensions
- is then by definition 1. A zero-rank array therefore has a single element.
-
- An array element is specified by a sequence of indices. The length of the
- sequence must equal the rank of the array. Each index must be a non-negative
- integer strictly less than the corresponding array dimension. Array indexing is
- therefore zero-origin, not one-origin as in (the default case of) Fortran.
-
- As an example, suppose that the variable foo names a 3-by-5 array. Then the
- first index may be 0, 1, or 2, and the second index may be 0, 1, 2, 3, or 4.
- One may refer to array elements using the function aref; for example, (aref foo
- 2 1) refers to element (2, 1) of the array. Note that aref takes a variable
- number of arguments: an array, and as many indices as the array has dimensions.
- A zero-rank array has no dimensions, and therefore aref would take such an
- array and no indices, and return the sole element of the array.
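-
- The 3-by-5 example and the zero-rank case can be sketched as follows. The
- variable names *foo* and *scalar* are illustrative.
-
```lisp
(defparameter *foo* (make-array '(3 5) :initial-element 0))

(array-rank *foo*)             ; 2
(array-dimensions *foo*)       ; (3 5)
(array-total-size *foo*)       ; 15, the product of the dimensions
(setf (aref *foo* 2 1) 'x)     ; store into element (2, 1)
(aref *foo* 2 1)               ; X

;; A zero-rank array: no dimensions, exactly one element, no indices.
(defparameter *scalar* (make-array '() :initial-element 42))

(array-rank *scalar*)          ; 0
(array-total-size *scalar*)    ; 1
(aref *scalar*)                ; 42
```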
-
- In general, arrays can be multidimensional, can share their contents with other
- array objects, and can have their size altered dynamically (either enlarging or
- shrinking) after creation. A one-dimensional array may also have a fill
- pointer.
-
- Multidimensional arrays store their components in row-major order; that is,
- internally a multidimensional array is stored as a one-dimensional array, with
- the multidimensional index sets ordered lexicographically, last index varying
- fastest. This is important in two situations: (1) when arrays with different
- dimensions share their contents, and (2) when accessing very large arrays in a
- virtual-memory implementation. (The first situation is a matter of semantics;
- the second, a matter of efficiency.)
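-
- Row-major order can be observed through array-row-major-index and through a
- vector displaced to a two-dimensional array. The following is a minimal
- sketch; the variable names are illustrative.
-
```lisp
(defparameter *grid* (make-array '(3 5)))

;; Fill each element with its own row-major position.
(dotimes (i 3)
  (dotimes (j 5)
    (setf (aref *grid* i j) (+ (* i 5) j))))

(array-row-major-index *grid* 2 1)     ; 11, that is (+ (* 2 5) 1)
(row-major-aref *grid* 11)             ; 11, the same element as (aref *grid* 2 1)

;; A vector displaced to the same storage sees the elements in that order.
(defparameter *flat* (make-array 15 :displaced-to *grid*))

(aref *flat* 11)                       ; 11
(aref *flat* 4)                        ; 4, element (0, 4): last index varies fastest
```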
-
- An array that is not displaced to another array, has no fill pointer, and is
- not to have its size adjusted dynamically after creation is called a simple
- array. The user may provide declarations that certain arrays will be simple.
- Some implementations can handle simple arrays in an especially efficient
- manner; for example, simple arrays may have a more compact representation than
- non-simple arrays.
-
- [change_begin]
- X3J13 voted in June 1989 (ADJUST-ARRAY-NOT-ADJUSTABLE) to clarify that if one
- or more of the :adjustable, :fill-pointer, and :displaced-to arguments is true
- when make-array is called, then whether the resulting array is simple is
- unspecified; but if all three arguments are false, then the resulting array is
- guaranteed to be simple.
- [change_end]
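-
- The distinction can be sketched as follows. The variable name *stack* is
- illustrative. Only the first result is guaranteed by the clarification above;
- whether the second array is simple is deliberately not tested.
-
```lisp
;; No :adjustable, :fill-pointer, or :displaced-to: guaranteed to be simple.
(typep (make-array 4) 'simple-array)            ; T

;; An adjustable vector with a fill pointer.  Whether it is also a simple
;; array is unspecified, so portable code should not depend on the answer.
(defparameter *stack*
  (make-array 2 :adjustable t :fill-pointer 0))

(vector-push-extend 'a *stack*)
(vector-push-extend 'b *stack*)
(vector-push-extend 'c *stack*)   ; grows the vector past its original size
(length *stack*)                  ; 3, as given by the fill pointer
(fill-pointer *stack*)            ; 3
(aref *stack* 2)                  ; C
```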
- -------------------------------------------------------------------------------
-
- * Vectors
- * Strings
- * Bit-Vectors
-
-
- 2.5.1. Vectors
-
- One-dimensional arrays are called vectors in Common Lisp and constitute the
- type vector (which is therefore a subtype of array). Vectors and lists are
- collectively considered to be sequences. They differ in that any component of a
- one-dimensional array can be accessed in constant time, whereas the average
- component access time for a list is linear in the length of the list; on the
- other hand, adding a new element to the front of a list takes constant time,
- whereas the same operation on an array takes time linear in the length of the
- array.
-
- A general vector (a one-dimensional array that can have any data object as an
- element but that has no additional paraphernalia) can be notated by notating
- the components in order, separated by whitespace and surrounded by #( and ).
- For example:
-
- #(a b c) ;A vector of length 3
- #() ;An empty vector
- #(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47)
- ;A vector containing the primes below 50
-
- Note that when the function read parses this syntax, it always constructs a
- simple general vector.
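-
- A short sketch of vector access follows; the variable name *primes* is
- illustrative.
-
```lisp
(defparameter *primes* #(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47))

(length *primes*)                 ; 15
(aref *primes* 0)                 ; 2, indexing is zero-origin
(svref *primes* 14)               ; 47, SVREF applies to simple general vectors
(typep *primes* 'simple-vector)   ; T, as constructed by the reader
(length #())                      ; 0

;; Vectors and lists are both sequences, so ELT works on either.
(elt *primes* 3)                  ; 7
(elt '(2 3 5 7) 3)                ; 7
```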
-
- -------------------------------------------------------------------------------
- Rationale: Many people have suggested that brackets be used to notate vectors,
- as [a b c] instead of #(a b c). This notation would be shorter, perhaps more
- readable, and certainly in accord with cultural conventions in other parts of
- computer science and mathematics. However, to preserve the usefulness of the
- user-definable macro-character feature of the function read, it is necessary to
- leave some characters to the user for this purpose. Experience in MacLisp has
- shown that users, especially implementors of languages for use in artificial
- intelligence research, often want to define special kinds of brackets.
- Therefore Common Lisp avoids using brackets and braces for any syntactic
- purpose.
- -------------------------------------------------------------------------------
-
- Implementations may provide certain specialized representations of arrays for
- efficiency in the case where all the components are of the same specialized
- (typically numeric) type. All implementations provide specialized arrays for
- the cases when the components are characters (or rather, a special subset of
- the characters); the one-dimensional instances of this specialization are
- called strings. All implementations are also required to provide specialized
- arrays of bits, that is, arrays of type (array bit); the one-dimensional
- instances of this specialization are called bit-vectors.
-
-
- 2.5.2. Strings
-
- [old_change_begin]
- A string is simply a vector of characters. More precisely, a string is a
- specialized vector whose elements are of type string-char.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type
- string-char and to redefine the type string to be the union of one or more
- specialized vector types, the types of whose elements are subtypes of the type
- character. Subtypes of string include simple-string, base-string, and
- simple-base-string.
-
- base-string == (vector base-character)
- simple-base-string == (simple-array base-character (*))
-
- An implementation may support other string subtypes as well. All Common Lisp
- functions that operate on strings treat all strings uniformly; note, however,
- that it is an error to attempt to insert an extended character into a base
- string.
- [change_end]
-
- The type string is therefore a subtype of the type vector.
-
- A string can be written as the sequence of characters contained in the string,
- preceded and followed by a " (double quote) character. Any " or \ character in
- the sequence must additionally have a \ character before it.
-
- For example:
-
- "Foo" ;A string with three characters in it
- "" ;An empty string
- "\"APL\\360?\" he cried." ;A string with twenty characters
- "|x| = |-x|" ;A ten-character string
-
- Notice that any vertical bar | in a string need not be preceded by a \.
- Similarly, any double quote in the name of a symbol written using vertical-bar
- notation need not be preceded by a \. The double-quote and vertical-bar
- notations are similar but distinct: double quotes indicate a character string
- containing the sequence of characters, whereas vertical bars indicate a symbol
- whose name is the contained sequence of characters.
-
- The characters contained by the double quotes, taken from left to right, occupy
- locations within the string with increasing indices. The leftmost character is
- string element number 0, the next one is element number 1, the next one is
- element number 2, and so on.
-
- Note that the function prin1 will print any character vector (not just a simple
- one) using this syntax, but the function read will always construct a simple
- string when it reads this syntax.
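-
- The character counts given in the examples above, and the zero-origin
- indexing, can be checked with a few forms. This is a sketch only.
-
```lisp
(length "Foo")                          ; 3
(length "")                             ; 0
(length "\"APL\\360?\" he cried.")      ; 20, each \ escape yields one character
(length "|x| = |-x|")                   ; 10, vertical bars need no escape
(char "Foo" 0)                          ; #\F, the leftmost character
(char "Foo" 2)                          ; #\o
(typep "Foo" 'simple-string)            ; T, as constructed by the reader
(subtypep 'string 'vector)              ; T and T

;; PRIN1 writes the escapes back out: two quotes plus two \ are added.
(length (prin1-to-string "say \"hi\""))   ; 12 for this 8-character string
```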
-
-
- 2.5.3. Bit-Vectors
-
- A bit-vector can be written as the sequence of bits contained in the bit-vector,
- preceded by #*; any delimiter character, such as whitespace, will terminate the
- bit-vector syntax. For example:
-
- #*10110 ;A five-bit bit-vector; bit 0 is a 1
- #* ;An empty bit-vector
-
- The bits notated following the #*, taken from left to right, occupy locations
- within the bit-vector with increasing indices. The leftmost notated bit is
- bit-vector element number 0, the next one is element number 1, and so on.
-
- The function prin1 will print any bit-vector (not just a simple one) using this
- syntax, but the function read will always construct a simple bit-vector when it
- reads this syntax.
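-
- A minimal sketch of the same points follows. The variable name *bits* is
- illustrative. Because the initial value of *print-array* is
- implementation-dependent, the sketch binds it explicitly before printing.
-
```lisp
(defparameter *bits* #*10110)

(length *bits*)                        ; 5
(bit *bits* 0)                         ; 1, the leftmost notated bit
(bit *bits* 1)                         ; 0
(typep *bits* 'simple-bit-vector)      ; T, as constructed by the reader
(length #*)                            ; 0, the empty bit-vector

(let ((*print-array* t))
  (prin1-to-string *bits*))            ; "#*10110"
```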
-
-
- 2.6. Hash Tables
-
- Hash tables provide an efficient way of mapping any Lisp object (a key) to an
- associated object. They are provided as primitives of Common Lisp because some
- implementations may need to use internal storage management strategies that
- would make it very difficult for the user to implement hash tables in a
- portable fashion. Hash tables are described in chapter 16.
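-
- As a brief sketch of the mapping from keys to associated objects, using the
- standard functions make-hash-table, gethash, and remhash (the table and its
- contents are made up for illustration):
-
```lisp
;; An EQUAL hash table, so that strings with the same contents are one key.
(defparameter *ages* (make-hash-table :test #'equal))

(setf (gethash "alice" *ages*) 30)
(setf (gethash "bob" *ages*) 25)

(gethash "alice" *ages*)      ; 30 and T
(gethash "carol" *ages*)      ; NIL and NIL, the key is not present
(hash-table-count *ages*)     ; 2
(remhash "bob" *ages*)        ; remove an entry
(hash-table-count *ages*)     ; 1
```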
-
-
- 2.7. Readtables
-
- A readtable is a data structure that maps characters into syntax types for the
- Lisp expression parser. In particular, for each character whose syntax type is
- macro character, a readtable indicates what its macro definition is. This
- mechanism lets the user reprogram the parser to a limited but useful extent. See
- section 22.1.5.
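-
- A minimal sketch of such reprogramming follows. It defines ! (one of the
- characters reserved to the user) as a macro character in a private copy of
- the standard readtable. The function name read-with-bang and the meaning
- given to ! are assumptions made purely for illustration.
-
```lisp
;; Hypothetical example: make !form read as (NOT form).
(defun read-with-bang (string)
  "Read STRING with ! defined as a prefix that wraps the next form in NOT."
  (let ((*readtable* (copy-readtable nil)))   ; leave the global readtable alone
    (set-macro-character
     #\!
     (lambda (stream char)
       (declare (ignore char))
       ;; Read the form that follows ! and wrap it.
       (list 'not (read stream t nil t))))
    (read-from-string string)))

(read-with-bang "(if !done (work))")   ; (IF (NOT DONE) (WORK))
```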
-
-
- 2.8. Packages
-
- Packages are collections of symbols that serve as name spaces. The parser
- recognizes symbols by looking up character sequences in the current package.
- Packages can be used to hide names internal to a module from other code.
- Mechanisms are provided for exporting symbols from a given package to the
- primary ``user'' package. See chapter 11.
-
-
- 2.9. Pathnames
-
- Pathnames are the means by which a Common Lisp program can interface to an
- external file system in a reasonably implementation-independent manner. See
- section 23.1.1.
-
-
- 2.10. Streams
-
- A stream is a source or sink of data, typically characters or bytes. Nearly all
- functions that perform I/O do so with respect to a specified stream. The
- function open takes a pathname and returns a stream connected to the file
- specified by the pathname. There are a number of standard streams that are used
- by default for various purposes. See chapter 21.
-
- [change_begin]
- X3J13 voted in January 1989 (STREAM-ACCESS) to introduce subtypes of type
- stream: broadcast-stream, concatenated-stream, echo-stream, synonym-stream,
- string-stream, file-stream, and two-way-stream are disjoint subtypes of stream.
- Note particularly that a synonym stream is always and only of type
- synonym-stream, regardless of the type of the stream for which it is a synonym.
-
- [change_end]
-
-
- 2.11. Random-States
-
- An object of type random-state is used to encapsulate state information used by
- the pseudo-random number generator. For more information about random-state
- objects, see section 12.9.
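-
- The following sketch shows that a random-state object fully encapsulates the
- generator's state: a copy of a state reproduces the same sequence. The
- function name draw is hypothetical.
-
```lisp
;; Hypothetical helper: draw N pseudo-random integers from STATE.
(defun draw (state &optional (n 5))
  "Collect N pseudo-random integers below 100 using STATE."
  (loop repeat n collect (random 100 state)))

(let* ((original (make-random-state t))          ; freshly seeded state
       (copy     (make-random-state original)))  ; independent copy of it
  (equal (draw original) (draw copy)))           ; T, same state, same sequence
```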
-
-
- 2.12. Structures
-
- Structures are instances of user-defined data types that have a fixed number of
- named components. They are analogous to records in Pascal. Structures are
- declared using the defstruct construct; defstruct automatically defines access
- and constructor functions for the new data type.
-
- Different structures may print out in different ways; the definition of a
- structure type may specify a print procedure to use for objects of that type
- (see the :print-function option to defstruct). The default notation for
- structures is
-
- #S(structure-name
- slot-name-1 slot-value-1
- slot-name-2 slot-value-2
- ...)
-
- where #S indicates structure syntax, structure-name is the name (a symbol) of
- the structure type, each slot-name is the name (also a symbol) of a component,
- and each corresponding slot-value is the representation of the Lisp object in
- that slot.
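-
- A minimal sketch of defstruct follows. The structure ship and its slots are
- illustrative names. The printed form in the last comment is what many
- implementations produce by default; the exact spelling of the slot names in
- the output may vary.
-
```lisp
(defstruct ship      ; SHIP and its slots are illustrative names
  name
  (speed 0))

;; DEFSTRUCT has defined MAKE-SHIP, SHIP-P, SHIP-NAME, and SHIP-SPEED.
(defparameter *ship* (make-ship :name "Enterprise" :speed 9))

(ship-name *ship*)            ; "Enterprise"
(ship-speed *ship*)           ; 9
(ship-p *ship*)               ; T
(setf (ship-speed *ship*) 5)  ; access functions work with SETF
(prin1-to-string *ship*)      ; typically "#S(SHIP :NAME \"Enterprise\" :SPEED 5)"
```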
-
-
- 2.13. Functions
-
- [old_change_begin]
- A function is anything that may be correctly given to the funcall or apply
- function, and is to be executed as code when arguments are supplied.
-
- A compiled-function is a compiled code object.
-
- A lambda-expression (a list whose car is the symbol lambda) may serve as a
- function. Depending on the implementation, it may be possible for other lists
- to serve as functions. For example, an implementation might choose to represent
- a ``lexical closure'' as a list whose car contains some special marker.
-
- A symbol may serve as a function; an attempt to invoke a symbol as a function
- causes the contents of the symbol's function cell to be used. See
- symbol-function and defun.
-
- The result of evaluating a function special form will always be a function.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in June 1988 (FUNCTION-TYPE) to revise these specifications. The
- type function is to be disjoint from cons and symbol, and so a list whose car
- is lambda is not, properly speaking, of type function, nor is any symbol.
- However, standard Common Lisp functions that accept functional arguments will
- accept a symbol or a list whose car is lambda and automatically coerce it to be
- a function; such standard functions include funcall, apply, and mapcar. Such
- functions do not, however, accept a lambda-expression as a functional argument;
- therefore one may not write
-
- (mapcar '(lambda (x y) (sqrt (* x y))) p q)
-
- but instead one must write something like
-
- (mapcar #'(lambda (x y) (sqrt (* x y))) p q)
-
- This change makes it impermissible to represent a lexical closure as a list
- whose car is some special marker.
-
- The value of a function special form will always be of type function.
- [change_end]
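-
- The revised distinction can be observed with functionp. This is a sketch
- only: the variable names and the sample argument lists are hypothetical, and
- the results noted in the comments assume an implementation that follows the
- June 1988 vote.
-
```lisp
;; A function object, obtained with the FUNCTION special form (#').
(defparameter *geometric-mean* #'(lambda (x y) (sqrt (* x y))))

(functionp *geometric-mean*)      ; true
(functionp 'car)                  ; false: a symbol is not of type FUNCTION
(functionp '(lambda (x) x))       ; false: neither is a list

;; A symbol is still accepted as a functional argument and coerced.
(defparameter *first-element* (funcall 'car '(a b c)))

;; The form the text recommends, applied to two sample lists.
(defparameter *means* (mapcar *geometric-mean* '(1 4) '(4 9)))
```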
-
-
- 2.14. Unreadable Data Objects
-
- Some objects may print in implementation-dependent ways. Such objects cannot
- necessarily be reliably reconstructed from a printed representation, and so
- they are usually printed in a format informative to the user but not acceptable
- to the read function: #<useful information>. The Lisp reader will signal an
- error on encountering #<.
-
- As a hypothetical example, an implementation might print
-
- #<stack-pointer si:rename-within-new-definition-maybe #o311037552>
-
- for an implementation-specific ``internal stack pointer'' data type whose
- printed representation includes the name of the type, some information about
- the stack slot pointed to, and the machine address (in octal) of the stack
- slot.
-
- [change_begin]
- See print-unreadable-object, a macro that prints an object using #< syntax.
- [change_end]
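-
- A sketch of how a program might produce such output for its own type follows.
- The structure stack-slot-ref, its slots, and its print function are
- hypothetical stand-ins for the internal stack pointer described above; only
- defstruct, print-unreadable-object, format, and handler-case are standard.
-
```lisp
;; Hypothetical type standing in for an "internal stack pointer".
(defstruct (stack-slot-ref (:print-function print-stack-slot-ref))
  name
  address)

;; Print function named in the DEFSTRUCT option. With :TYPE T the macro
;; writes the type name after #< and before the output of the body.
(defun print-stack-slot-ref (object stream depth)
  (declare (ignore depth))
  (print-unreadable-object (object stream :type t)
    (format stream "~A #o~O"
            (stack-slot-ref-name object)
            (stack-slot-ref-address object))))

(defparameter *printed*
  (prin1-to-string
   (make-stack-slot-ref :name 'some-slot :address #o311037552)))

;; Reading the text back signals an error, which is caught here.
(defparameter *read-result*
  (handler-case (read-from-string *printed*)
    (error () :unreadable)))
```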
-
-
- 2.15. Overlap, Inclusion, and Disjointness of Types
-
- The Common Lisp data type hierarchy is tangled and purposely left somewhat
- open-ended so that implementors may experiment with new data types as
- extensions to the language. This section explicitly states all the defined
- relationships between types, including subtype/supertype relationships,
- disjointness, and exhaustive partitioning. The user of Common Lisp should not
- depend on any relationships not explicitly stated here. For example, it is not
- valid to assume that a number that is neither complex nor rational must be a
- float, because implementations are permitted to provide yet other kinds of
- numbers.
-
- First we need some terminology. If x is a supertype of y, then any object of
- type y is also of type x, and y is said to be a subtype of x. If types x and y
- are disjoint, then no object (in any implementation) may be both of type x and
- of type y. Types a_1 through a_n are an exhaustive union of type x if each a_j
- is a subtype of x, and any object of type x is necessarily of at least one of
- the types a_j; a_1 through a_n are furthermore an exhaustive partition if they
- are also pairwise disjoint.
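-
- These definitions can be phrased in terms of subtypep. The two helper
- functions below are hypothetical (they are not part of Common Lisp) and are
- only a sketch: subtypep returns a second value saying whether its answer is
- definite, and the helpers ignore it, so a false result may also mean that the
- implementation could not decide.
-
```lisp
;; Hypothetical helper: two types are disjoint when their intersection
;; is empty, that is, when (AND TYPE-1 TYPE-2) is a subtype of NIL.
(defun types-disjoint-p (type-1 type-2)
  (values (subtypep `(and ,type-1 ,type-2) nil)))

;; Hypothetical helper: PARTS form an exhaustive union of WHOLE when each
;; part is a subtype of WHOLE and every object of WHOLE is in some part.
(defun exhaustive-union-p (whole parts)
  (and (every (lambda (part) (subtypep part whole)) parts)
       (values (subtypep whole `(or ,@parts)))))

(subtypep 'integer 'rational)             ; rational is a supertype of integer
(types-disjoint-p 'cons 'symbol)          ; true
(types-disjoint-p 'integer 'rational)     ; false
(exhaustive-union-p 'list '(cons null))   ; true
```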
-
- * The type t is a supertype of every type whatsoever. Every object is of
- type t.
-
- * The type nil is a subtype of every type whatsoever. No object is of type
- nil.
-
- [old_change_begin]
-
- * The types cons, symbol, array, number, and character are pairwise
- disjoint.
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in June 1988 (DATA-TYPES-HIERARCHY-UNDERSPECIFIED) to extend the
- preceding paragraph as follows.
-
- * The types cons, symbol, array, number, character, hash-table, readtable,
- package, pathname, stream, random-state, and any single other type created
- by defstruct or defclass are pairwise disjoint.
-
- The wording of the first edition was intended to allow implementors to use the
- defstruct facility to define the built-in types hash-table, readtable, package,
- pathname, stream, random-state. The change still permits this implementation
- strategy but forbids these built-in types from including, or being included in,
- other types (in the sense of the defstruct :include option).
-
- X3J13 voted in June 1988 (FUNCTION-TYPE) to specify that the type function is
- disjoint from the types cons, symbol, array, number, and character. The type
- compiled-function is a subtype of function; implementations are free to define
- other subtypes of function.
- [change_end]
-
- [old_change_begin]
-
- * The types rational, float, and complex are pairwise disjoint subtypes of
- number.
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (REAL-NUMBER-TYPE) to rewrite the preceding item as
- follows.
-
- * The types real and complex are pairwise disjoint subtypes of number.
-
- -------------------------------------------------------------------------------
- Rationale: It might be thought that real and complex should form an exhaustive
- partition of the type number. This is purposely avoided here in order to permit
- compatible experimentation with extensions to the Common Lisp number system.
- -------------------------------------------------------------------------------
-
- * The types rational and float are pairwise disjoint subtypes of real.
-
- -------------------------------------------------------------------------------
- Rationale: It might be thought that rational and float should form an
- exhaustive partition of the type real. This is purposely avoided here in order
- to permit compatible experimentation with extensions to the Common Lisp number
- system.
- -------------------------------------------------------------------------------
- [change_end]
-
- * The types integer and ratio are disjoint subtypes of rational.
-
- -------------------------------------------------------------------------------
- Rationale: It might be thought that integer and ratio should form an exhaustive
- partition of the type rational. This is purposely avoided here in order to
- permit compatible experimentation with extensions to the Common Lisp rational
- number system.
- -------------------------------------------------------------------------------
-
- [old_change_begin]
-
- * The types fixnum and bignum are disjoint subtypes of integer.
-
- -------------------------------------------------------------------------------
- Rationale: It might be thought that fixnum and bignum should form an exhaustive
- partition of the type integer. This is purposely avoided here in order to
- permit compatible experimentation with extensions to the Common Lisp integer
- number system, such as the idea of adding explicit representations of infinity
- or of positive and negative infinity.
- -------------------------------------------------------------------------------
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in January 1989 (FIXNUM-NON-PORTABLE) to specify that the types
- fixnum and bignum do in fact form an exhaustive partition of the type integer;
- more precisely, they voted to specify that the type bignum is by definition
- equivalent to (and integer (not fixnum)). This is consistent with the first
- edition text in section 2.1.1.
-
- I interpret this to mean that implementors could still experiment with such
- extensions as adding explicit representations of infinity, but such infinities
- would necessarily be of type bignum.
- [change_end]
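-
- The partition can be illustrated as follows. This is a sketch that assumes an
- implementation conforming to the January 1989 vote; the variable and function
- names are hypothetical, and the value of most-positive-fixnum differs from one
- implementation to another.
-
```lisp
(defparameter *largest-fixnum* most-positive-fixnum)
(defparameter *next-integer* (1+ most-positive-fixnum))

(typep *largest-fixnum* 'fixnum)                     ; true
(typep *next-integer* 'bignum)                       ; true
(typep *next-integer* '(and integer (not fixnum)))   ; true, the defining form

;; Every integer falls into exactly one of the two clauses, so ETYPECASE
;; never signals an error when given an integer.
(defun integer-kind (n)
  (check-type n integer)
  (etypecase n
    (fixnum :fixnum)
    (bignum :bignum)))
```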
-
- * The types short-float, single-float, double-float, and long-float are
- subtypes of float. Any two of them must be either disjoint or identical;
- if identical, then any other types between them in the above ordering must
- also be identical to them (for example, if single-float and long-float are
- identical types, then double-float must be identical to them also).
-
- * The type null is a subtype of symbol; the only object of type null is
- nil.
-
- * The types cons and null form an exhaustive partition of the type list.
-
- [old_change_begin]
-
- * The type standard-char is a subtype of string-char; string-char is a
- subtype of character.
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type
- string-char. The preceding item is replaced by the following.
-
- * The type standard-char is a subtype of base-character. The types
- base-character and extended-character form an exhaustive partition of
- character.
-
- [change_end]
-
- [old_change_begin]
-
- * The type string is a subtype of vector, for string means (vector
- string-char).
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type
- string-char. The preceding item is replaced by the following.
-
- * The type string is a subtype of vector; it is the union of all types
- (vector c) such that c is a subtype of character.
-
- [change_end]
-
- * The type bit-vector is a subtype of vector, for bit-vector means (vector
- bit).
-
- * The types (vector t), string, and bit-vector are disjoint.
-
- * The type vector is a subtype of array; for all types x, the type (vector
- x) is the same as the type (array x (*)).
-
- * The type simple-array is a subtype of array.
-
- [old_change_begin]
-
- * The types simple-vector, simple-string, and simple-bit-vector are
- disjoint subtypes of simple-array, for they respectively mean
- (simple-array t (*)), (simple-array string-char (*)), and (simple-array
- bit (*)).
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type
- string-char. The preceding item is replaced by the following.
-
- * The types simple-vector, simple-string, and simple-bit-vector are
- disjoint subtypes of simple-array, for they mean (simple-array t (*)), the
- union of all types (simple-array c (*)) such that c is a subtype of
- character, and (simple-array bit (*)), respectively.
-
- [change_end]
-
- * The type simple-vector is a subtype of vector and indeed is a subtype of
- (vector t).
-
- * The type simple-string is a subtype of string. (Note that although string
- is a subtype of vector, simple-string is not a subtype of simple-vector.)
-
- -------------------------------------------------------------------------------
- Rationale: The hypothetical name simple-general-vector would have been more
- accurate than simple-vector, but in this instance euphony and user convenience
- were deemed more important to the design of Common Lisp than a rigid symmetry.
- -------------------------------------------------------------------------------
-
- * The type simple-bit-vector is a subtype of bit-vector. (Note that
- although bit-vector is a subtype of vector, simple-bit-vector is not a
- subtype of simple-vector.)
-
- * The types vector and list are disjoint subtypes of sequence.
-
- * The types random-state, readtable, package, pathname, stream, and
- hash-table are pairwise disjoint.
-
- [change_begin]
- X3J13 voted in June 1988 (DATA-TYPES-HIERARCHY-UNDERSPECIFIED) to make
- random-state, readtable, package, pathname, stream, and hash-table pairwise
- disjoint from a number of other types as well; see note above.
-
- X3J13 voted in January 1989 (STREAM-ACCESS) to introduce subtypes of type
- stream.
-
- * The types two-way-stream, echo-stream, broadcast-stream, file-stream,
- synonym-stream, string-stream, and concatenated-stream are disjoint
- subtypes of stream.
-
- [change_end]
-
- * Any two types created by defstruct are disjoint unless one is a supertype
- of the other by virtue of the :include option.
-
- [old_change_begin]
-
- * An exhaustive union for the type common is formed by the types cons,
- symbol, (array x) where x is either t or a subtype of common, string,
- fixnum, bignum, ratio, short-float, single-float, double-float,
- long-float, (complex x) where x is a subtype of common, standard-char,
- hash-table, readtable, package, pathname, stream, random-state, and all
- types created by the user via defstruct. An implementation may not
- unilaterally add subtypes to common; however, future revisions to the
- Common Lisp standard may extend the definition of the common data type.
- Note that a type such as number or array may or may not be a subtype of
- common, depending on whether or not the given implementation has extended
- the set of objects of that type.
-
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (COMMON-TYPE) to remove the type common from the
- language.
- [change_end]
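-
- Several of the sequence relationships listed in this section can be checked
- at run time with typep. In the sketch below, the function memberships and the
- variable *sequence-types* are hypothetical helpers introduced only for
- illustration.
-
```lisp
(defparameter *sequence-types*
  '(sequence list vector string bit-vector
    simple-vector simple-string simple-bit-vector))

;; Hypothetical helper: return, in order, those TYPES that OBJECT belongs to.
(defun memberships (object types)
  (remove-if-not (lambda (type) (typep object type)) types))

(memberships "abc" *sequence-types*)
;; (SEQUENCE VECTOR STRING SIMPLE-STRING): a simple string is not a
;; simple-vector.

(memberships (vector 1 2 3) *sequence-types*)
;; (SEQUENCE VECTOR SIMPLE-VECTOR)

(memberships #*1011 *sequence-types*)
;; (SEQUENCE VECTOR BIT-VECTOR SIMPLE-BIT-VECTOR)

(memberships '() *sequence-types*)
;; (SEQUENCE LIST): the empty list is a list but not a vector.
```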
-
-
-