Celestin Apprentice 7

Text File | 1995-03-27 | 74.1 KB | 1,589 lines | [TEXT/ROSA]

Common Lisp the Language, 2nd Edition
-------------------------------------------------------------------------------

2. Data Types

Common Lisp provides a variety of types of data objects. It is important to note that in Lisp it is data objects that are typed, not variables. Any variable can have any Lisp object as its value. (It is possible to make an explicit declaration that a variable will in fact take on one of only a limited set of values. However, such a declaration may always be omitted, and the program will still run correctly. Such a declaration merely constitutes advice from the user that may be useful in gaining efficiency. See declare.)

In Common Lisp, a data type is a (possibly infinite) set of Lisp objects. Many Lisp objects belong to more than one such set, and so it doesn't always make sense to ask what is the type of an object; instead, one usually asks only whether an object belongs to a given type. The predicate typep may be used to ask whether an object belongs to a given type, and the function type-of returns a type to which a given object belongs.

The data types defined in Common Lisp are arranged into a hierarchy (actually a partial order) defined by the subset relationship. Certain sets of objects, such as the set of numbers or the set of strings, are interesting enough to deserve labels. Symbols are used for most such labels (here, and throughout this book, the word ``symbol'' refers to atomic symbols, one kind of Lisp object, elsewhere known as literal atoms). See chapter 4 for a complete description of type specifiers.

The set of all objects is specified by the symbol t. The empty data type, which contains no objects, is denoted by nil.

[old_change_begin]
A type called common encompasses all the data objects required by the Common Lisp language. A Common Lisp implementation is free to provide other data types that are not subtypes of common.
[old_change_end]

[change_begin]
X3J13 voted in March 1989 (COMMON-TYPE) to remove the type common (and the predicate commonp) from the language, on the grounds that it has not proved to be useful in practice and that it could be difficult to redefine in the face of other changes to the Common Lisp type system (such as the introduction of CLOS classes).
[change_end]

The following categories of Common Lisp objects are of particular interest: numbers, characters, symbols, lists, arrays, structures, and functions. There are others as well. Some of these categories have many subdivisions. There are also standard types defined to be the union of two or more of these categories. The categories listed above, while they are data types, are neither more nor less ``real'' than other data types; they simply constitute a particularly useful slice across the type hierarchy for expository purposes.

Here are brief descriptions of various Common Lisp data types. The remaining sections of this chapter go into more detail and also describe notations for objects of each type. Descriptions of Lisp functions that operate on data objects of each type appear in later chapters.

* Numbers are provided in various forms and representations. Common Lisp provides a true integer data type: any integer, positive or negative, has in principle a representation as a Common Lisp data object, subject only to total memory limitations (rather than machine word width). A true rational data type is provided: the quotient of two integers, if not an integer, is a ratio. Floating-point numbers of various ranges and precisions are also provided, as well as Cartesian complex numbers.

* Characters represent printed glyphs such as letters or text formatting operations. Strings are one-dimensional arrays of characters. Common Lisp provides for a rich character set, including ways to represent characters of various type styles.

* Symbols (sometimes called atomic symbols for emphasis or clarity) are named data objects. Lisp provides machinery for locating a symbol object, given its name (in the form of a string). Symbols have property lists, which in effect allow symbols to be treated as record structures with an extensible set of named components, each of which may be any Lisp object. Symbols also serve to name functions and variables within programs.

* Lists are sequences represented in the form of linked cells called conses. There is a special object (the symbol nil) that is the empty list. All other lists are built recursively by adding a new element to the front of an existing list. This is done by creating a new cons, which is an object having two components called the car and the cdr. The car may hold anything, and the cdr is made to point to the previously existing list. (Conses may actually be used completely generally as two-element record structures, but their most important use is to represent lists.)

* Arrays are dimensioned collections of objects. An array can have any non-negative number of dimensions and is indexed by a sequence of integers. A general array can have any Lisp object as a component; other types of arrays are specialized for efficiency and can hold only certain types of Lisp objects. It is possible for two arrays, possibly with differing dimension information, to share the same set of elements (such that modifying one array modifies the other also) by causing one to be displaced to the other. One-dimensional arrays of any kind are called vectors. One-dimensional arrays of characters are called strings. One-dimensional arrays of bits (that is, of integers whose values are 0 or 1) are called bit-vectors.

* Hash tables provide an efficient way of mapping any Lisp object (a key) to an associated object.

* Readtables are used to control the built-in expression parser read.

* Packages are collections of symbols that serve as name spaces. The parser recognizes symbols by looking up character sequences in the current package.

* Pathnames represent names of files in a fairly implementation-independent manner. They are used to interface to the external file system.

* Streams represent sources or sinks of data, typically characters or bytes. They are used to perform I/O, as well as for internal purposes such as parsing strings.

* Random-states are data structures used to encapsulate the state of the built-in random-number generator.

* Structures are user-defined record structures, objects that have named components. The defstruct facility is used to define new structure types. Some Common Lisp implementations may choose to implement certain system-supplied data types, such as bignums, readtables, streams, hash tables, and pathnames, as structures, but this fact will be invisible to the user.

[old_change_begin]
* Functions are objects that can be invoked as procedures; these may take arguments and return values. (All Lisp procedures can be construed to return values and therefore every procedure is a function.) Such objects include compiled-functions (compiled code objects). Some functions are represented as a list whose car is a particular symbol such as lambda. Symbols may also be used as functions.
[old_change_end]

[change_begin]
X3J13 voted in June 1988 (FUNCTION-TYPE) to specify that symbols are not of type function, but are automatically coerced to functions in certain situations (see section 2.13).

X3J13 voted in June 1988 (CONDITION-SYSTEM) to adopt the Common Lisp Condition System, thereby introducing a new category of data objects:

* Conditions are objects used to affect control flow in certain conventional ways by means of signals and handlers that intercept those signals. In particular, errors are signaled by raising particular conditions, and errors may be trapped by establishing handlers for those conditions.

X3J13 voted in June 1988 (CLOS) to adopt the Common Lisp Object System, thereby introducing additional categories of data objects:

* Classes determine the structure and behavior of other objects, their instances. Every Common Lisp data object belongs to some class. (In some ways the CLOS class system is a generalization of the system of type specifiers of the first edition of this book, but the class system augments the type system rather than supplanting it.)

* Methods are chunks of code that operate on arguments satisfying a particular pattern of classes. Methods are not functions; they are not invoked directly on arguments but instead are bundled into generic functions.

* Generic functions are functions that contain, among other information, a set of methods. When invoked, a generic function executes a subset of its methods. The subset chosen for execution depends in a specific way on the classes or identities of the arguments to which it is applied.
[change_end]

These categories are not always mutually exclusive. The required relationships among the various data types are explained in more detail in section 2.15.

-------------------------------------------------------------------------------

* Numbers
  o Integers
  o Ratios
  o Floating-Point Numbers
  o Complex Numbers
* Characters
  o Standard Characters
  o Line Divisions
  o Non-standard Characters
  o Character Attributes
  o String Characters
* Symbols
* Lists and Conses
* Arrays
  o Vectors
  o Strings
  o Bit-Vectors
* Hash Tables
* Readtables
* Packages
* Pathnames
* Streams
* Random-States
* Structures
* Functions
* Unreadable Data Objects
* Overlap, Inclusion, and Disjointness of Types

2.1. Numbers

Several kinds of numbers are defined in Common Lisp. They are divided into integers; ratios; floating-point numbers, with names provided for up to four different floating-point representations; and complex numbers.

[change_begin]
X3J13 voted in March 1989 (REAL-NUMBER-TYPE) to add the type real.
The number data type encompasses all kinds of numbers. For convenience, there are names for some subclasses of numbers as well. Integers and ratios are of type rational. Rational numbers and floating-point numbers are of type real. Real numbers and complex numbers are of type number.

Although the names of these types were chosen with the terminology of mathematics in mind, the correspondences are not always exact. Integers and ratios model the corresponding mathematical concepts directly. Numbers of type float may be used to approximate real numbers, both rational and irrational. The real type includes all Common Lisp numbers that represent mathematical real numbers, though there are mathematical real numbers (irrational numbers) that do not have an exact Common Lisp representation. Only real numbers may be ordered using the <, >, <=, and >= functions.

-------------------------------------------------------------------------------
Compatibility note: The Fortran 77 standard defines the term real datum to mean ``a processor approximation to the value of a real number.'' In practice the Fortran basic real type is the floating-point data type that Common Lisp calls single-float. The Fortran double precision type is Common Lisp's double-float. The Pascal real data type is an ``implementation-defined subset of the real numbers.'' In practice this is usually a floating-point type, often what Common Lisp calls double-float.

A translation of an algorithm written in Fortran or Pascal that uses real data usually will use some appropriate precision of Common Lisp's float type. Some algorithms may gain accuracy or flexibility by using Common Lisp's rational or real type instead.
-------------------------------------------------------------------------------
[change_end]

-------------------------------------------------------------------------------

* Integers
* Ratios
* Floating-Point Numbers
* Complex Numbers

2.1.1. Integers

The integer data type is intended to represent mathematical integers. Unlike most programming languages, Common Lisp in principle imposes no limit on the magnitude of an integer; storage is automatically allocated as necessary to represent large integers.

In every Common Lisp implementation there is a range of integers that are represented more efficiently than others; each such integer is called a fixnum, and an integer that is not a fixnum is called a bignum. Common Lisp is designed to hide this distinction as much as possible; the distinction between fixnums and bignums is visible to the user in only a few places where the efficiency of representation is important. Exactly which integers are fixnums is implementation-dependent; typically they will be those integers in the range -2^n to 2^n - 1, inclusive, for some n not less than 15. See most-positive-fixnum and most-negative-fixnum.

[change_begin]
X3J13 voted in January 1989 (FIXNUM-NON-PORTABLE) to specify that fixnum must be a supertype of the type (signed-byte 16), and additionally that the value of array-dimension-limit must be a fixnum (implying that the implementor should choose the range of fixnums to be large enough to accommodate the largest size of array to be supported).

-------------------------------------------------------------------------------
Rationale: This specification allows programmers to declare variables in portable code to be of type fixnum for efficiency. Fixnums are guaranteed to encompass at least the set of 16-bit signed integers (compare this to the data type short int in the C programming language). In addition, any valid array index must be a fixnum, and therefore variables used to hold array indices (such as a dotimes variable) may be declared fixnum in portable code.
-------------------------------------------------------------------------------
[change_end]

Integers are ordinarily written in decimal notation, as a sequence of decimal digits, optionally preceded by a sign and optionally followed by a decimal point. For example:

    0                               ;Zero
    -0                              ;This always means the same as 0
    +6                              ;The first perfect number
    28                              ;The second perfect number
    1024.                           ;Two to the tenth power
    -1                              ;e^(pi*i)
    15511210043330985984000000.     ;25 factorial (25!), probably a bignum

-------------------------------------------------------------------------------
Compatibility note: MacLisp and Lisp Machine Lisp normally assume that integers are written in octal (radix-8) notation unless a decimal point is present. Interlisp assumes integers are written in decimal notation and uses a trailing Q to indicate octal radix; however, a decimal point, even in trailing position, always indicates a floating-point number. This is of course consistent with Fortran. Ada does not permit trailing decimal points but instead requires them to be embedded. In Common Lisp, integers written as described above are always construed to be in decimal notation, whether or not the decimal point is present; allowing the decimal point to be present permits compatibility with MacLisp.
-------------------------------------------------------------------------------

Integers may be notated in radices other than ten. The notation #nnrddddd or #nnRddddd means the integer in radix-nn notation denoted by the digits ddddd. More precisely, one may write #, a non-empty sequence of decimal digits representing an unsigned decimal integer n, r (or R), an optional sign, and a sequence of radix-n digits, to indicate an integer written in radix n (which must be between 2 and 36, inclusive). Only legal digits for the specified radix may be used; for example, an octal number may contain only the digits 0 through 7. For digits above 9, letters of the alphabet of either case may be used in order.
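As an illustration of the radix notation just described, here is a minimal sketch that is not part of the original text. It relies only on the standard reader syntax and the standard function parse-integer; the particular numerals were chosen for this example, and their decimal values follow from ordinary positional arithmetic.

```lisp
;; The reader produces the same integer whatever radix it was written in.
(assert (= #2r1101 13))          ; 8 + 4 + 0 + 1
(assert (= #36rZ 35))            ; Z is the highest digit in radix 36
(assert (= #7r-16 -13))          ; the sign follows the radix marker: -(7 + 6)

;; Letters of either case may serve as digits above 9.
(assert (= #16rff #16rFF 255))

;; parse-integer accepts the same radices (2 through 36) at run time.
(assert (= (parse-integer "1101" :radix 2) 13))
(assert (= (parse-integer "-16" :radix 7) -13))
```

The reader forms are resolved when the code is read, while parse-integer does the same conversion on a string at run time, which is the form to reach for when the radix is not known in advance.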
Binary, octal, and hexadecimal radices are useful enough to warrant the special abbreviations #b for #2r, #o for #8r, and #x for #16r. For example:

    #2r11010101     ;Another way of writing 213 decimal
    #b11010101      ;Ditto
    #b+11010101     ;Ditto
    #o325           ;Ditto, in octal radix
    #xD5            ;Ditto, in hexadecimal radix
    #16r+D5         ;Ditto
    #o-300          ;Decimal -192, written in base 8
    #3r-21010       ;Same thing in base 3
    #25R-7H         ;Same thing in base 25
    #xACCEDED       ;181202413, in hexadecimal radix

2.1.2. Ratios

A ratio is a number representing the mathematical ratio of two integers. Integers and ratios collectively constitute the type rational. The canonical representation of a rational number is as an integer if its value is integral, and otherwise as the ratio of two integers, the numerator and denominator, whose greatest common divisor is 1, and of which the denominator is positive (and in fact greater than 1, or else the value would be integral). A ratio is notated with / as a separator, thus: 3/5. It is possible to notate ratios in non-canonical (unreduced) forms, such as 4/6, but the Lisp function prin1 always prints the canonical form for a ratio.

If any computation produces a result that is a ratio of two integers such that the denominator evenly divides the numerator, then the result is immediately converted to the equivalent integer. This is called the rule of rational canonicalization.

Rational numbers may be written as the possibly signed quotient of decimal numerals: an optional sign followed by two non-empty sequences of digits separated by a /. This syntax may be described as follows:

    ratio ::= [sign] {digit}+ / {digit}+

The second sequence may not consist entirely of zeros. For example:

    2/3                     ;This is in canonical form
    4/6                     ;A non-canonical form for the same number
    -17/23                  ;A not very interesting ratio
    -30517578125/32768      ;This is (-5/2)^15
    10/5                    ;The canonical form for this is 2

To notate rational numbers in radices other than ten, one uses the same radix specifiers (one of #nnR, #O, #B, or #X) as for integers. For example:

    #o-101/75               ;Octal notation for -65/61
    #3r120/21               ;Ternary notation for 15/7
    #Xbc/ad                 ;Hexadecimal notation for 188/173
    #xFADED/FACADE          ;Hexadecimal notation for 1027565/16435934

2.1.3. Floating-Point Numbers

Common Lisp allows an implementation to provide one or more kinds of floating-point number, which collectively make up the type float. Now a floating-point number is a (mathematical) rational number of the form s * f * b^(e-p), where s is +1 or -1, the sign; b is an integer greater than 1, the base or radix of the representation; p is a positive integer, the precision (in base-b digits) of the floating-point number; f is a positive integer between b^(p-1) and b^p - 1 (inclusive), the significand; and e is an integer, the exponent. The value of p and the range of e depend on the implementation and on the type of floating-point number within that implementation. In addition, there is a floating-point zero; depending on the implementation, there may also be a ``minus zero.'' If there is no minus zero, then 0.0 and -0.0 are both interpreted as simply a floating-point zero.

-------------------------------------------------------------------------------
Implementation note: The form of the above description should not be construed to require the internal representation to be in sign-magnitude form. Two's-complement and other representations are also acceptable. Note that the radix of the internal representation may be other than 2, as on the IBM 360 and 370, which use radix 16; see float-radix.
-------------------------------------------------------------------------------

Floating-point numbers may be provided in a variety of precisions and sizes, depending on the implementation. High-quality floating-point software tends to depend critically on the precise nature of the floating-point arithmetic and so may not always be completely portable. As an aid in writing programs that are moderately portable, however, certain definitions are made here:

* A short floating-point number (type short-float) is of the representation of smallest fixed precision provided by an implementation.

* A long floating-point number (type long-float) is of the representation of the largest fixed precision provided by an implementation.

* Intermediate between short and long formats are two others, arbitrarily called single and double (types single-float and double-float).

The precise definition of these categories is implementation-dependent. However, the rough intent is that short floating-point numbers be precise to at least four decimal places (but also have a space-efficient representation); single floating-point numbers, to at least seven decimal places; and double floating-point numbers, to at least fourteen decimal places. It is suggested that the precision (measured in bits, computed as p * log2(b)) and the exponent size (also measured in bits, computed as the base-2 logarithm of 1 plus the maximum exponent value) be at least as great as the values in table 2-1.

Floating-point numbers are written in either decimal fraction or computerized scientific notation: an optional sign, then a non-empty sequence of digits with an embedded decimal point, then an optional decimal exponent specification. If there is no exponent specifier, then the decimal point is required, and there must be digits after it. The exponent specifier consists of an exponent marker, an optional sign, and a non-empty sequence of digits. For preciseness, here is a modified-BNF description of floating-point notation.

    floating-point-number ::= [sign] {digit}* decimal-point {digit}* [exponent]
                            | [sign] {digit}+ [decimal-point {digit}*] exponent
    sign ::= + | -
    decimal-point ::= .
    digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    exponent ::= exponent-marker [sign] {digit}+
    exponent-marker ::= e | s | f | d | l | E | S | F | D | L

If no exponent specifier is present, or if the exponent marker e (or E) is used, then the precise format to be used is not specified. When such a representation is read and converted to an internal floating-point data object, the format specified by the variable *read-default-float-format* is used; the initial value of this variable is single-float.

The letters s, f, d, and l (or their respective uppercase equivalents) explicitly specify the use of short, single, double, and long format, respectively.

Examples of floating-point numbers:

    0.0                       ;Floating-point zero in default format
    0E0                       ;Also floating-point zero in default format
    -.0                       ;This may be a zero or a minus zero,
                              ; depending on the implementation
    0.                        ;The integer zero, not a floating-point zero!
    0.0s0                     ;A floating-point zero in short format
    0s0                       ;Also a floating-point zero in short format
    3.1415926535897932384d0   ;A double-format approximation to pi
    6.02E+23                  ;Avogadro's number, in default format
    602E+21                   ;Also Avogadro's number, in default format
    3.010299957f-1            ;log10(2), in single format
    -0.000000001s9            ;e^(pi*i) in short format, the hard way

[change_begin]
Notice of correction. The first edition unfortunately listed an incorrect value (3.1010299957f-1) for the base-10 logarithm of 2.
[change_end]

The internal format used for an external representation depends only on the exponent marker and not on the number of decimal digits in the external representation.

While Common Lisp provides terminology and notation sufficient to accommodate four distinct floating-point formats, not all implementations will have the means to support that many distinct formats.
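The following sketch, which is an illustration added here rather than part of the original text, checks that the exponent marker and not the digit count selects the format. It uses typep rather than type-of, since type-of may return any type to which the object belongs, and it holds whether or not the four format names denote distinct internal formats.

```lisp
;; The marker decides the format, however many digits are written.
(assert (typep 1.0d0 'double-float))
(assert (typep 3.1415926535897932384f0 'single-float))
(assert (typep 1s0 'short-float))
(assert (typep 1l0 'long-float))

;; With no marker, or with marker e, *read-default-float-format* applies.
(let ((*read-default-float-format* 'double-float))
  (assert (typep (read-from-string "1.0") 'double-float))
  (assert (typep (read-from-string "1e0") 'double-float)))

;; A trailing decimal point with no following digits denotes an integer.
(assert (integerp (read-from-string "0.")))
(assert (floatp (read-from-string "0.0")))
```

Binding *read-default-float-format* around read-from-string, rather than assigning it globally, keeps the change local to the forms being read.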
An implementation is therefore permitted to provide fewer than four distinct internal floating-point formats, in which case at least one of them will be ``shared'' by more than one of the external format names short, single, double, and long according to the following rules:

* If one internal format is provided, then it is considered to be single, but serves also as short, double, and long. The data types short-float, single-float, double-float, and long-float are considered to be identical. An expression such as (eql 1.0s0 1.0d0) will be true in such an implementation because the two numbers 1.0s0 and 1.0d0 will be converted into the same internal format and therefore be considered to have the same data type, despite the differing external syntax. Similarly, (typep 1.0L0 'short-float) will be true in such an implementation. For output purposes all floating-point numbers are assumed to be of single format and thus will print using the exponent letter E or F.

* If two internal formats are provided, then either of two correspondences may be used, depending on which is the more appropriate:

  o One format is short; the other is single and serves also as double and long. The data types single-float, double-float, and long-float are considered to be identical, but short-float is distinct. An expression such as (eql 1.0s0 1.0d0) will be false, but (eql 1.0f0 1.0d0) will be true. Similarly, (typep 1.0L0 'short-float) will be false, but (typep 1.0L0 'single-float) will be true. For output purposes all floating-point numbers are assumed to be of short or single format.

  o One format is single and serves also as short; the other is double and serves also as long. The data types short-float and single-float are considered to be identical, and the data types double-float and long-float are considered to be identical. An expression such as (eql 1.0s0 1.0d0) will be false, as will (eql 1.0f0 1.0d0); but (eql 1.0d0 1.0L0) will be true. Similarly, (typep 1.0L0 'short-float) will be false, but (typep 1.0L0 'double-float) will be true. For output purposes all floating-point numbers are assumed to be of single or double format.

* If three internal formats are provided, then either of two correspondences may be used, depending on which is the more appropriate:

  o One format is short; another format is single; and the third format is double and serves also as long. Similar constraints apply.

  o One format is single and serves also as short; another is double; and the third format is long.

-------------------------------------------------------------------------------
Implementation note: It is recommended that an implementation provide as many distinct floating-point formats as feasible, using table 2-1 as a guideline. Ideally, short-format floating-point numbers should have an ``immediate'' representation that does not require heap allocation; single-format floating-point numbers should approximate IEEE proposed standard single-format floating-point numbers; and double-format floating-point numbers should approximate IEEE proposed standard double-format floating-point numbers [23,17,16].
-------------------------------------------------------------------------------

2.1.4. Complex Numbers

Complex numbers (type complex) are represented in Cartesian form, with a real part and an imaginary part, each of which is a non-complex number (integer, ratio, or floating-point number). It should be emphasized that the parts of a complex number are not necessarily floating-point numbers; in this, Common Lisp is like PL/I and differs from Fortran. However, both parts must be of the same type: either both are rational, or both are of the same floating-point format.

Complex numbers may be notated by writing the characters #C followed by a list of the real and imaginary parts. If the two parts as notated are not of the same type, then they are converted according to the rules of floating-point contagion as described in chapter 12. (Indeed, #C(a b) is equivalent to #,(complex a b); see the description of the function complex.) For example:

    #C(3.0s1 2.0s-1)    ;Real and imaginary parts are short format
    #C(5 -3)            ;A Gaussian integer
    #C(5/3 7.0)         ;Will be converted internally to #C(1.66666 7.0)
    #C(0 1)             ;The imaginary unit, that is, i

The type of a specific complex number is indicated by a list of the word complex and the type of the components; for example, a specialized representation for complex numbers with short floating-point parts would be of type (complex short-float). The type complex encompasses all complex representations.

A complex number of type (complex rational), that is, one whose components are rational, can never have a zero imaginary part. If the result of a computation would be a complex rational with a zero imaginary part, the result is immediately converted to a non-complex rational number by taking the real part. This is called the rule of complex canonicalization. This rule does not apply to floating-point complex numbers; #C(5.0 0.0) and 5.0 are different.

2.2. Characters

Characters are represented as data objects of type character.

[old_change_begin]
There are two subtypes of interest, called standard-char and string-char.
[old_change_end]

[change_begin]
X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type string-char.
[change_end]

A character object can be notated by writing #\ followed by the character itself. For example, #\g means the character object for a lowercase g. This works well enough for printing characters. Non-printing characters have names, and can be notated by writing #\ and then the name; for example, #\Space (or #\SPACE or #\space or #\sPaCE) means the space character. The syntax for character names after #\ is the same as that for symbols.
However, only character names that are known to the particular implementation may be used. ------------------------------------------------------------------------------- * Standard Characters * Line Divisions * Non-standard Characters * Character Attributes * String Characters ------------------------------------------------------------------------------- 2.2.1. Standard Characters Common Lisp defines a standard character set (subtype standard-char) for two purposes. Common Lisp programs that are written in the standard character set can be read by any Common Lisp implementation; and Common Lisp programs that use only standard characters as data objects are most likely to be portable. The Common Lisp character set consists of a space character #\Space, a newline character #\Newline, and the following ninety-four non-blank printing characters or their equivalents: ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ The Common Lisp standard character set is apparently equivalent to the ninety-five standard ASCII printing characters plus a newline character. Nevertheless, Common Lisp is designed to be relatively independent of the ASCII character encoding. For example, the collating sequence is not specified except to say that digits must be properly ordered, the uppercase letters must be properly ordered, and the lowercase letters must be properly ordered (see char< for a precise specification). Other character encodings, particularly EBCDIC, should be easily accommodated (with a suitable mapping of printing characters). Of the ninety-four non-blank printing characters, the following are used in only limited ways in the syntax of Common Lisp programs: [ ] { } ? ! ^ _ ~ $ % [old_change_begin] All of these characters except ! and _ are used within format strings as formatting directives. Except for this, [, ], {, }, ?, and ! 
are not used in Common Lisp and are reserved to the user for syntactic extensions; ^ and _ are not yet used in Common Lisp but are part of the syntax of reserved tokens and are reserved to implementors; ~ is not yet used in Common Lisp and is reserved to implementors; and $ and % are normally regarded as alphabetic characters but are not used in the names of any standard Common Lisp functions, variables, or other entities. [old_change_end] [change_begin] X3J13 voted in June 1989 (PRETTY-PRINT-INTERFACE) to add a format directive ~_ (see chapter 27). [change_end] The following characters are called semi-standard: #\Backspace #\Tab #\Linefeed #\Page #\Return #\Rubout Not all implementations of Common Lisp need to support them; but those implementations that use the standard ASCII character set should support them, treating them as corresponding respectively to the ASCII characters BS (octal code 010), HT (011), LF (012), FF (014), CR (015), and DEL (177). These characters are not members of the subtype standard-char unless synonymous with one of the standard characters specified above. For example, in a given implementation it might be sensible for the implementor to define #\Linefeed or #\Return to be synonymous with #\Newline, or #\Tab to be synonymous with #\Space. 2.2.2. Line Divisions The treatment of line divisions is one of the most difficult issues in designing portable software, simply because there is so little agreement among operating systems. Some use a single character to delimit lines; the recommended ASCII character for this purpose is the line feed character LF (also called the new line character, NL), but some systems use the carriage return character CR. Much more common is the two-character sequence CR followed by LF. Frequently line divisions have no representation as a character but are implicit in the structuring of a file into records, each record containing a line of text. A deck of punched cards has this structure, for example. 
Common Lisp provides an abstract interface by requiring that there be a single character, #\Newline, that within the language serves as a line delimiter. (The language C has a similar requirement.) An implementation of Common Lisp must translate between this internal single-character representation and whatever external representation(s) may be used. ------------------------------------------------------------------------------- Implementation note: How the character called #\Newline is represented internally is not specified here, but it is strongly suggested that the ASCII LF character be used in Common Lisp implementations that use the ASCII character encoding. The ASCII CR character is a workable, but in most cases inferior, alternative. ------------------------------------------------------------------------------- [change_begin] When the first edition was written it was not yet clear that UNIX would become so widely accepted. The decision to represent the line delimiter as a single character has proved to be a good one. [change_end] The requirement that a line division be represented as a single character has certain consequences. A character string written in the middle of a program in such a way as to span more than one line must contain exactly one character to represent each line division. Consider this code fragment: (setq a-string "This string contains forty-two characters.") Between g and c there must be exactly one character, #\Newline; a two-character sequence, such as #\Return and then #\Newline, is not acceptable, nor is the absence of a character. The same is true between s and f. When the character #\Newline is written to an output file, the Common Lisp implementation must take the appropriate action to produce a line division. This might involve writing out a record or translating #\Newline to a CR/LF sequence. 
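The single-character rule can be checked directly. The following is a minimal sketch that assumes a conforming reader; the variable name *a-string* is introduced here only for illustration and is not part of the original fragment.

```lisp
;; The literal spans three source lines, so it must contain exactly two
;; #\Newline characters, whatever line-ending convention the file uses
;; externally.
(defparameter *a-string* "This string
contains
forty-two characters.")

(length *a-string*)             ; 42
(count #\Newline *a-string*)    ; 2
(char *a-string* 11)            ; #\Newline, the character between g and c
```

If the reader kept a CR/LF pair for each line division, the length would be 44 rather than 42, which is exactly what the requirement rules out.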
------------------------------------------------------------------------------- Implementation note: If an implementation uses the ASCII character encoding, uses the CR/LF sequence externally to delimit lines, uses LF to represent #\Newline internally, and supports #\Return as a data object corresponding to the ASCII character CR, the question arises as to what action to take when the program writes out #\Return followed by #\Newline. It should first be noted that #\Return is not a standard Common Lisp character, and the action to be taken when #\Return is written out is therefore not defined by the Common Lisp language. A plausible approach is to buffer the #\Return character and suppress it if and only if the next character is #\Newline (the net effect is to generate a CR/LF sequence). Another plausible approach is simply to ignore the difficulty and declare that writing #\Return and then #\Newline results in the sequence CR/CR/LF in the output. ------------------------------------------------------------------------------- 2.2.3. Non-standard Characters Any implementation may provide additional characters, whether printing characters or named characters. Some plausible examples: #\π #\α #\Break #\Home-Up #\Escape The use of such characters may render Common Lisp programs non-portable. [old_change_begin] 2.2.4. Character Attributes Every object of type character has three attributes: code, bits, and font. The code attribute is intended to distinguish among the printed glyphs and formatting functions for characters; it is a numerical encoding of the character proper. The bits attribute allows extra flags to be associated with a character. The font attribute permits a specification of the style of the glyphs (such as italics). Each of these attributes may be understood to be a non-negative integer. The font attribute may be notated in unsigned decimal notation between the # and the \. For example, #3\a means the letter a in font 3. 
This might mean the same thing as #\α if font 3 were used to represent Greek letters. Note that not all Common Lisp implementations provide for non-zero font attributes; see char-font-limit. The bits attribute may be notated by preceding the name of the character by the names or initials of the bits, separated by hyphens. The character itself may be written instead of the name, preceded if necessary by \. For example: #\Control-Meta-Return #\Meta-Control-Q #\Hyper-Space #\Meta-\a #\Control-A #\Meta-Hyper-\: #\C-M-Return #\Hyper-\π Note that not all Common Lisp implementations provide for non-zero bits attributes; see char-bits-limit. [old_change_end] [change_begin] X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to replace the notion of bits and font attributes with that of implementation-defined attributes. [change_end] [old_change_begin] 2.2.5. String Characters Any character whose bits and font attributes are zero may be contained in strings. All such characters together constitute a subtype of the characters; this subtype is called string-char. [old_change_end] [change_begin] X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type string-char. Two new subtypes of character are base-character, defined to be equivalent to the result of the function call (upgraded-array-element-type 'standard-char) and extended-character, defined to be equivalent to the type specifier (and character (not base-character)) An implementation may support additional subtypes of character that may or may not be supertypes of base-character. In addition, an implementation may define base-character to be equivalent to character. The choice of any base characters that are not standard characters is implementation-defined. Only base characters can be elements of a base string. No upper bound is specified for the number of distinct characters of type base-character (that is implementation-dependent), but the lower bound is 96, the number of standard Common Lisp characters. 
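The figure of 96 standard characters can be confirmed with a short sketch. It assumes that the implementation's character codes agree with ASCII for codes 0 through 127, which Common Lisp does not require; the function name count-standard-chars-below is hypothetical.

```lisp
;; Assumption: char codes 0..127 follow ASCII in this implementation.
(defun count-standard-chars-below (limit)
  "Count the codes below LIMIT whose character satisfies STANDARD-CHAR-P."
  (loop for code from 0 below limit
        for ch = (code-char code)   ; CODE-CHAR may return NIL
        count (and ch (standard-char-p ch))))

(count-standard-chars-below 128)   ; 96 on an ASCII-based implementation
(standard-char-p #\Space)          ; true
(standard-char-p #\Newline)        ; true
```

The count is the ninety-four non-blank printing characters plus #\Space and #\Newline. On an implementation with a different encoding the same function still runs, but the limit of 128 would have to be chosen differently.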
[change_end] 2.3. Symbols Symbols are Lisp data objects that serve several purposes and have several interesting characteristics. Every object of type symbol has a name, called its print name. Given a symbol, one can obtain its name in the form of a string. Conversely, given the name of a symbol as a string, one can obtain the symbol itself. (More precisely, symbols are organized into packages, and all the symbols in a package are uniquely identified by name. See chapter 11.) Symbols have a component called the property list, or plist. By convention this is always a list whose even-numbered components (calling the first component zero) are symbols, here functioning as property names, and whose odd-numbered components are associated property values. Functions are provided for manipulating this property list; in effect, these allow a symbol to be treated as an extensible record structure. Symbols are also used to represent certain kinds of variables in Lisp programs, and there are functions for dealing with the values associated with symbols in this role. A symbol can be notated simply by writing its name. If its name is not empty, and if the name consists only of uppercase alphabetic, numeric, or certain pseudo-alphabetic special characters (but not delimiter characters such as parentheses or space), and if the name of the symbol cannot be mistaken for a number, then the symbol can be notated by the sequence of characters in its name. Any uppercase letters that appear in the (internal) name may be written in either case in the external notation (more on this below). For example: FROBBOZ ;The symbol whose name is FROBBOZ frobboz ;Another way to notate the same symbol fRObBoz ;Yet another way to notate it unwind-protect ;A symbol with a - in its name +$ ;The symbol named +$ 1+ ;The symbol named 1+ +1 ;This is the integer 1, not a symbol pascal_style ;This symbol has an underscore in its name b^2-4*a*c ;This is a single symbol! 
; It has several special characters in its name file.rel.43 ;This symbol has periods in its name /usr/games/zork ;This symbol has slashes in its name In addition to letters and numbers, the following characters are normally considered to be alphabetic for the purposes of notating symbols: + - * / @ $ % ^ & _ = < > ~ . Some of these characters have conventional purposes for naming things; for example, symbols that name special variables generally have names beginning and ending with *. The last character listed above, the period, is considered alphabetic provided that a token does not consist entirely of periods. A single period standing by itself is used in the notation of conses and dotted lists; a token consisting of two or more periods is syntactically illegal. (The period also serves as the decimal point in the notation of numbers.) The following characters are also alphabetic by default but are explicitly reserved to the user for definition as reader macro characters (see section 22.1.3) or any other desired purpose and therefore should not be used routinely in names of symbols: ? ! [ ] { } A symbol may have uppercase letters, lowercase letters, or both in its print name. However, the Lisp reader normally converts lowercase letters to the corresponding uppercase letters when reading symbols. The net effect is that most of the time case makes no difference when notating symbols. Case does make a difference internally and when printing a symbol. Internally the symbols that name all standard Common Lisp functions, variables, and keywords have uppercase names; their names appear in lowercase in this book for readability. Typing such names with lowercase letters works because the function read will convert lowercase letters to the equivalent uppercase letters. [change_begin] X3J13 voted in June 1989 (READ-CASE-SENSITIVITY) to introduce readtable-case, which controls whether read will alter the case of letters read as part of the name of a symbol. 
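The case-folding and number-versus-symbol rules above can be observed by reading tokens from strings. This is a sketch only: read-one is a hypothetical helper, and it wraps the call in with-standard-io-syntax so that the standard readtable (which folds to uppercase) is in effect.

```lisp
(defun read-one (text)
  "Read a single object from the string TEXT under standard syntax."
  (with-standard-io-syntax
    (values (read-from-string text))))

(symbol-name (read-one "frobboz"))               ; "FROBBOZ"
(eq (read-one "fRObBoz") (read-one "FROBBOZ"))   ; true: the same symbol
(symbolp (read-one "1+"))                        ; true: 1+ is a symbol
(integerp (read-one "+1"))                       ; true: +1 is the integer 1
(symbolp (read-one "b^2-4*a*c"))                 ; true: a single symbol
(symbolp (read-one "file.rel.43"))               ; true: periods are alphabetic
```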
[change_end] If a symbol cannot be simply notated by the characters of its name because the (internal) name contains special characters or lowercase letters, then there are two ``escape'' conventions for notating them. Writing a \ character before any character causes the character to be treated itself as an ordinary character for use in a symbol name; in particular, it suppresses internal conversion of lowercase letters to their uppercase equivalents. If any character in a notation is preceded by \, then that notation can never be interpreted as a number. For example: \( ;The symbol whose name is ( \+1 ;The symbol whose name is +1 +\1 ;Also the symbol whose name is +1 \frobboz ;The symbol whose name is fROBBOZ 3.14159265\s0 ;The symbol whose name is 3.14159265s0 3.14159265\S0 ;A different symbol, whose name is 3.14159265S0 3.14159265s0 ;A short-format floating-point approximation to π APL\\360 ;The symbol whose name is APL\360 apl\\360 ;Also the symbol whose name is APL\360 \(b^2\)\ -\ 4*a*c ;The name is (B^2) - 4*A*C; ; it has parentheses and two spaces in it \(\b^2\)\ -\ 4*\a*\c ;The name is (b^2) - 4*a*c; ; the letters are explicitly lowercase It may be tedious to insert a \ before every delimiter character in the name of a symbol if there are many of them. An alternative convention is to surround the name of a symbol with vertical bars; these cause every character between them to be taken as part of the symbol's name, as if \ had been written before each one, excepting only | itself and \, which must nevertheless be preceded by \. For example: |"| ;The same as writing \" |(b^2) - 4*a*c| ;The name is (b^2) - 4*a*c |frobboz| ;The name is frobboz, not FROBBOZ |APL\360| ;The name is APL360, because the \ quotes the 3 |APL\\360| ;The name is APL\360 |apl\\360| ;The name is apl\360 |\|\|| ;Same as \|\|: the name is || |(B^2) - 4*A*C| ;The name is (B^2) - 4*A*C; ; it has parentheses and two spaces in it |(b^2) - 4*a*c| ;The name is (b^2) - 4*a*c 2.4. 
Lists and Conses A cons is a record structure containing two components called the car and the cdr. Conses are used primarily to represent lists. A list is recursively defined to be either the empty list or a cons whose cdr component is a list. A list is therefore a chain of conses linked by their cdr components and terminated by nil, the empty list. The car components of the conses are called the elements of the list. For each element of the list there is a cons. The empty list has no elements at all. A list is notated by writing the elements of the list in order, separated by blank space (space, tab, or return characters) and surrounded by parentheses. (a b c) ;A list of three symbols (2.0s0 (a 1) #\*) ;A list of three things: a short floating-point ; number, another list, and a character object The empty list nil therefore can be written as (), because it is a list with no elements. A dotted list is one whose last cons does not have nil for its cdr, rather some other data object (which is also not a cons, or the first-mentioned cons would not be the last cons of the list). Such a list is called ``dotted'' because of the special notation used for it: the elements of the list are written between parentheses as before, but after the last element and before the right parenthesis are written a dot (surrounded by blank space) and then the cdr of the last cons. As a special case, a single cons is notated by writing the car and the cdr between parentheses and separated by a space-surrounded dot. For example: (a . 4) ;A cons whose car is a symbol ; and whose cdr is an integer (a b c . d) ;A dotted list with three elements whose last cons ; has the symbol d in its cdr ------------------------------------------------------------------------------- Compatibility note: In MacLisp, the dot in dotted-list notation need not be surrounded by white space or other delimiters. The dot is required to be delimited in Common Lisp, as in Lisp Machine Lisp. 
------------------------------------------------------------------------------- It is legitimate to write something like (a b . (c d)); this means the same as (a b c d). The standard Lisp output routines will never print a list in the first form, however; they will avoid dot notation wherever possible. Often the term list is used to refer either to true lists or to dotted lists. When the distinction is important, the term ``true list'' will be used to refer to a list terminated by nil. Most functions advertised to operate on lists expect to be given true lists. Throughout this book, unless otherwise specified, it is an error to pass a dotted list to a function that is specified to require a list as an argument. ------------------------------------------------------------------------------- Implementation note: Implementors are encouraged to use the equivalent of the predicate endp wherever it is necessary to test for the end of a list. Whenever feasible, this test should explicitly signal an error if a list is found to be terminated by a non-nil atom. However, such an explicit error signal is not required, because some such tests occur in important loops where efficiency is important. In such cases, the predicate atom may be used to test for the end of the list, quietly treating any non-nil list-terminating atom as if it were nil. ------------------------------------------------------------------------------- Sometimes the term tree is used to refer to some cons and all the other conses transitively accessible to it through car and cdr links until non-conses are reached; these non-conses are called the leaves of the tree. Lists, dotted lists, and trees are not mutually exclusive data types; they are simply useful points of view about structures of conses. There are yet other terms, such as association list. None of these are true Lisp data types. Conses are a data type, and nil is the sole object of type null. 
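The list notations described in this section can be tried out with quoted constants. The sketch below uses only standard functions, except for true-list-p, a hypothetical helper that walks the cdr chain and assumes its argument is not circular.

```lisp
(equal '(a b . (c d)) '(a b c d))   ; true: two notations for the same list
(car '(a . 4))                      ; A
(cdr '(a . 4))                      ; 4
(cdr '(a b c))                      ; (B C)
(eq '() nil)                        ; true: () is the empty list nil

(defun true-list-p (object)
  "Return true if OBJECT is a chain of conses terminated by NIL."
  (loop while (consp object)
        do (setf object (cdr object)))
  (null object))

(true-list-p '(a b c))       ; true
(true-list-p '(a b c . d))   ; false: the last cdr is the symbol D
(true-list-p '())            ; true: the empty list has no conses
```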
The Lisp data type list is taken to mean the union of the cons and null data types, and therefore encompasses both true lists and dotted lists. 2.5. Arrays An array is an object with components arranged according to a Cartesian coordinate system. In general, these components may be any Lisp data objects. The number of dimensions of an array is called its rank (this terminology is borrowed from APL); the rank is a non-negative integer. Likewise, each dimension is itself a non-negative integer. The total number of elements in the array is the product of all the dimensions. An implementation of Common Lisp may impose a limit on the rank of an array, but this limit may not be smaller than 7. Therefore, any Common Lisp program may assume the use of arrays of rank 7 or less. (A program may determine the actual limit on array ranks for a given implementation by examining the constant array-rank-limit.) It is permissible for a dimension to be zero. In this case, the array has no elements, and any attempt to access an element is in error. However, other properties of the array, such as the dimensions themselves, may be used. If the rank is zero, then there are no dimensions, and the product of the dimensions is then by definition 1. A zero-rank array therefore has a single element. An array element is specified by a sequence of indices. The length of the sequence must equal the rank of the array. Each index must be a non-negative integer strictly less than the corresponding array dimension. Array indexing is therefore zero-origin, not one-origin as in (the default case of) Fortran. As an example, suppose that the variable foo names a 3-by-5 array. Then the first index may be 0, 1, or 2, and the second index may be 0, 1, 2, 3, or 4. One may refer to array elements using the function aref; for example, (aref foo 2 1) refers to element (2, 1) of the array. Note that aref takes a variable number of arguments: an array, and as many indices as the array has dimensions. 
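The 3-by-5 example can be written out as follows. This is a sketch: the variable names *foo*, *scalar*, and *empty* and the element values are chosen here for illustration only.

```lisp
;; A 3-by-5 array; valid indices are 0..2 and 0..4.
(defparameter *foo* (make-array '(3 5) :initial-element 0))

(array-rank *foo*)           ; 2
(array-dimensions *foo*)     ; (3 5)
(array-total-size *foo*)     ; 15, the product of the dimensions
(setf (aref *foo* 2 1) 'x)   ; store into element (2, 1)
(aref *foo* 2 1)             ; X

;; A zero-rank array: no dimensions, and the empty product is 1.
(defparameter *scalar* (make-array '() :initial-element 42))
(array-rank *scalar*)        ; 0
(array-total-size *scalar*)  ; 1

;; A dimension may be zero; such an array has no elements at all.
(defparameter *empty* (make-array '(0 5)))
(array-total-size *empty*)   ; 0
(array-dimensions *empty*)   ; (0 5): the dimensions are still available
```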
A zero-rank array has no dimensions, and therefore aref would take such an array and no indices, and return the sole element of the array. In general, arrays can be multidimensional, can share their contents with other array objects, and can have their size altered dynamically (either enlarging or shrinking) after creation. A one-dimensional array may also have a fill pointer. Multidimensional arrays store their components in row-major order; that is, internally a multidimensional array is stored as a one-dimensional array, with the multidimensional index sets ordered lexicographically, last index varying fastest. This is important in two situations: (1) when arrays with different dimensions share their contents, and (2) when accessing very large arrays in a virtual-memory implementation. (The first situation is a matter of semantics; the second, a matter of efficiency.) An array that is not displaced to another array, has no fill pointer, and is not to have its size adjusted dynamically after creation is called a simple array. The user may provide declarations that certain arrays will be simple. Some implementations can handle simple arrays in an especially efficient manner; for example, simple arrays may have a more compact representation than non-simple arrays. [change_begin] X3J13 voted in June 1989 (ADJUST-ARRAY-NOT-ADJUSTABLE) to clarify that if one or more of the :adjustable, :fill-pointer, and :displaced-to arguments is true when make-array is called, then whether the resulting array is simple is unspecified; but if all three arguments are false, then the resulting array is guaranteed to be simple. [change_end] ------------------------------------------------------------------------------- * Vectors * Strings * Bit-Vectors 2.5.1. Vectors One-dimensional arrays are called vectors in Common Lisp and constitute the type vector (which is therefore a subtype of array). Vectors and lists are collectively considered to be sequences. 
They differ in that any component of a one-dimensional array can be accessed in constant time, whereas the average component access time for a list is linear in the length of the list; on the other hand, adding a new element to the front of a list takes constant time, whereas the same operation on an array takes time linear in the length of the array. A general vector (a one-dimensional array that can have any data object as an element but that has no additional paraphernalia) can be notated by notating the components in order, separated by whitespace and surrounded by #( and ). For example: #(a b c) ;A vector of length 3 #() ;An empty vector #(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47) ;A vector containing the primes below 50 Note that when the function read parses this syntax, it always constructs a simple general vector. ------------------------------------------------------------------------------- Rationale: Many people have suggested that brackets be used to notate vectors, as [a b c] instead of #(a b c). This notation would be shorter, perhaps more readable, and certainly in accord with cultural conventions in other parts of computer science and mathematics. However, to preserve the usefulness of the user-definable macro-character feature of the function read, it is necessary to leave some characters to the user for this purpose. Experience in MacLisp has shown that users, especially implementors of languages for use in artificial intelligence research, often want to define special kinds of brackets. Therefore Common Lisp avoids using brackets and braces for any syntactic purpose. ------------------------------------------------------------------------------- Implementations may provide certain specialized representations of arrays for efficiency in the case where all the components are of the same specialized (typically numeric) type. 
All implementations provide specialized arrays for the cases when the components are characters (or rather, a special subset of the characters); the one-dimensional instances of this specialization are called strings. All implementations are also required to provide specialized arrays of bits, that is, arrays of type (array bit); the one-dimensional instances of this specialization are called bit-vectors. 2.5.2. Strings [old_change_begin] A string is simply a vector of characters. More precisely, a string is a specialized vector whose elements are of type string-char. [old_change_end] [change_begin] X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type string-char and to redefine the type string to be the union of one or more specialized vector types, the types of whose elements are subtypes of the type character. Subtypes of string include simple-string, base-string, and simple-base-string. base-string == (vector base-character) simple-base-string == (simple-array base-character (*)) An implementation may support other string subtypes as well. All Common Lisp functions that operate on strings treat all strings uniformly; note, however, that it is an error to attempt to insert an extended character into a base string. [change_end] The type string is therefore a subtype of the type vector. A string can be written as the sequence of characters contained in the string, preceded and followed by a " (double quote) character. Any " or \ character in the sequence must additionally have a \ character before it. For example: "Foo" ;A string with three characters in it "" ;An empty string "\"APL\\360?\" he cried." ;A string with twenty characters "|x| = |-x|" ;A ten-character string Notice that any vertical bar | in a string need not be preceded by a \. Similarly, any double quote in the name of a symbol written using vertical-bar notation need not be preceded by a \. 
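The character counts given for the example strings can be verified with length, remembering that each \ in the written form escapes the character after it and is not itself stored. The variable name *cried* is introduced here only for illustration.

```lisp
(defparameter *cried* "\"APL\\360?\" he cried.")

(length "Foo")          ; 3
(length "")             ; 0
(length *cried*)        ; 20
(length "|x| = |-x|")   ; 10: vertical bars need no escape inside a string
(char *cried* 0)        ; the double-quote character
(char *cried* 4)        ; the single backslash stored in the string
(typep "Foo" 'vector)   ; true: string is a subtype of vector
```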
The double-quote and vertical-bar notations are similar but distinct: double quotes indicate a character string containing the sequence of characters, whereas vertical bars indicate a symbol whose name is the contained sequence of characters. The characters contained by the double quotes, taken from left to right, occupy locations within the string with increasing indices. The leftmost character is string element number 0, the next one is element number 1, the next one is element number 2, and so on. Note that the function prin1 will print any character vector (not just a simple one) using this syntax, but the function read will always construct a simple string when it reads this syntax. 2.5.3. Bit-Vectors A bit-vector can be written as the sequence of bits contained in the bit-vector, preceded by #*; any delimiter character, such as whitespace, will terminate the bit-vector syntax. For example: #*10110 ;A five-bit bit-vector; bit 0 is a 1 #* ;An empty bit-vector The bits notated following the #*, taken from left to right, occupy locations within the bit-vector with increasing indices. The leftmost notated bit is bit-vector element number 0, the next one is element number 1, and so on. The function prin1 will print any bit-vector (not just a simple one) using this syntax, but the function read will always construct a simple bit-vector when it reads this syntax. 2.6. Hash Tables Hash tables provide an efficient way of mapping any Lisp object (a key) to an associated object. They are provided as primitives of Common Lisp because some implementations may need to use internal storage management strategies that would make it very difficult for the user to implement hash tables in a portable fashion. Hash tables are described in chapter 16. 2.7. Readtables A readtable is a data structure that maps characters into syntax types for the Lisp expression parser. In particular, a readtable indicates for each character with syntax macro character what its macro definition is. 
This is a mechanism by which the user may reprogram the parser to a limited but useful extent. See section 22.1.5. 2.8. Packages Packages are collections of symbols that serve as name spaces. The parser recognizes symbols by looking up character sequences in the current package. Packages can be used to hide names internal to a module from other code. Mechanisms are provided for exporting symbols from a given package to the primary ``user'' package. See chapter 11. 2.9. Pathnames Pathnames are the means by which a Common Lisp program can interface to an external file system in a reasonably implementation-independent manner. See section 23.1.1. 2.10. Streams A stream is a source or sink of data, typically characters or bytes. Nearly all functions that perform I/O do so with respect to a specified stream. The function open takes a pathname and returns a stream connected to the file specified by the pathname. There are a number of standard streams that are used by default for various purposes. See chapter 21. [change_begin] X3J13 voted in January 1989 (STREAM-ACCESS) to introduce subtypes of type stream: broadcast-stream, concatenated-stream, echo-stream, synonym-stream, string-stream, file-stream, and two-way-stream are disjoint subtypes of stream. Note particularly that a synonym stream is always and only of type synonym-stream, regardless of the type of the stream for which it is a synonym. [change_end] 2.11. Random-States An object of type random-state is used to encapsulate state information used by the pseudo-random number generator. For more information about random-state objects, see section 12.9. 2.12. Structures Structures are instances of user-defined data types that have a fixed number of named components. They are analogous to records in Pascal. Structures are declared using the defstruct construct; defstruct automatically defines access and constructor functions for the new data type. 
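As a minimal sketch of what defstruct provides, consider a two-slot structure. The type name point and the slot names x and y are hypothetical, chosen only for illustration.

```lisp
;; Defining POINT also defines the constructor MAKE-POINT, the
;; predicate POINT-P, and the accessors POINT-X and POINT-Y.
(defstruct point
  (x 0)
  (y 0))

(defparameter *p* (make-point :x 3 :y 4))

(point-x *p*)             ; 3
(point-p *p*)             ; true
(setf (point-y *p*) 10)   ; accessors work with setf
(point-y *p*)             ; 10
(typep *p* 'point)        ; true: the structure name is also a type
```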
Different structures may print out in different ways; the definition of a structure type may specify a print procedure to use for objects of that type (see the :print-function option to defstruct). The default notation for structures is #S(structure-name slot-name-1 slot-value-1 slot-name-2 slot-value-2 ...) where #S indicates structure syntax, structure-name is the name (a symbol) of the structure type, each slot-name is the name (also a symbol) of a component, and each corresponding slot-value is the representation of the Lisp object in that slot. 2.13. Functions [old_change_begin] A function is anything that may be correctly given to the funcall or apply function, and is to be executed as code when arguments are supplied. A compiled-function is a compiled code object. A lambda-expression (a list whose car is the symbol lambda) may serve as a function. Depending on the implementation, it may be possible for other lists to serve as functions. For example, an implementation might choose to represent a ``lexical closure'' as a list whose car contains some special marker. A symbol may serve as a function; an attempt to invoke a symbol as a function causes the contents of the symbol's function cell to be used. See symbol-function and defun. The result of evaluating a function special form will always be a function. [old_change_end] [change_begin] X3J13 voted in June 1988 (FUNCTION-TYPE) to revise these specifications. The type function is to be disjoint from cons and symbol, and so a list whose car is lambda is not, properly speaking, of type function, nor is any symbol. However, standard Common Lisp functions that accept functional arguments will accept a symbol or a list whose car is lambda and automatically coerce it to be a function; such standard functions include funcall, apply, and mapcar. 
Such functions do not, however, accept a lambda-expression as a functional argument; therefore one may not write (mapcar '(lambda (x y) (sqrt (* x y))) p q) but instead one must write something like (mapcar #'(lambda (x y) (sqrt (* x y))) p q) This change makes it impermissible to represent a lexical closure as a list whose car is some special marker. The value of a function special form will always be of type function. [change_end] 2.14. Unreadable Data Objects Some objects may print in implementation-dependent ways. Such objects cannot necessarily be reliably reconstructed from a printed representation, and so they are usually printed in a format informative to the user but not acceptable to the read function: #<useful information>. The Lisp reader will signal an error on encountering #<. As a hypothetical example, an implementation might print #<stack-pointer si:rename-within-new-definition-maybe #o311037552> for an implementation-specific ``internal stack pointer'' data type whose printed representation includes the name of the type, some information about the stack slot pointed to, and the machine address (in octal) of the stack slot. [change_begin] See print-unreadable-object, a macro that prints an object using #< syntax. [change_end] 2.15. Overlap, Inclusion, and Disjointness of Types The Common Lisp data type hierarchy is tangled and purposely left somewhat open-ended so that implementors may experiment with new data types as extensions to the language. This section explicitly states all the defined relationships between types, including subtype/supertype relationships, disjointness, and exhaustive partitioning. The user of Common Lisp should not depend on any relationships not explicitly stated here. For example, it is not valid to assume that because a number is not complex and not rational that it must be a float, because implementations are permitted to provide yet other kinds of numbers. First we need some terminology. 
If x is a supertype of y, then any object of type y is also of type x, and y is said to be a subtype of x. If types x and y are disjoint, then no object (in any implementation) may be both of type x and of type y. Types a_1 through a_n are an exhaustive union of type x if each a_j is a subtype of x, and any object of type x is necessarily of at least one of the types a_j; a_1 through a_n are furthermore an exhaustive partition if they are also pairwise disjoint. * The type t is a supertype of every type whatsoever. Every object is of type t. * The type nil is a subtype of every type whatsoever. No object is of type nil. [old_change_begin] * The types cons, symbol, array, number, and character are pairwise disjoint. [old_change_end] [change_begin] X3J13 voted in June 1988 (DATA-TYPES-HIERARCHY-UNDERSPECIFIED) to extend the preceding paragraph as follows. * The types cons, symbol, array, number, character, hash-table, readtable, package, pathname, stream, random-state, and any single other type created by defstruct or defclass are pairwise disjoint. The wording of the first edition was intended to allow implementors to use the defstruct facility to define the built-in types hash-table, readtable, package, pathname, stream, random-state. The change still permits this implementation strategy but forbids these built-in types from including, or being included in, other types (in the sense of the defstruct :include option). X3J13 voted in June 1988 (FUNCTION-TYPE) to specify that the type function is disjoint from the types cons, symbol, array, number, and character. The type compiled-function is a subtype of function; implementations are free to define other subtypes of function. [change_end] [old_change_begin] * The types rational, float, and complex are pairwise disjoint subtypes of number. [old_change_end] [change_begin] X3J13 voted in March 1989 (REAL-NUMBER-TYPE) to rewrite the preceding item as follows. * The types real and complex are pairwise disjoint subtypes of number. 
------------------------------------------------------------------------------- Rationale: It might be thought that real and complex should form an exhaustive partition of the type number. This is purposely avoided here in order to permit compatible experimentation with extensions to the Common Lisp number system. ------------------------------------------------------------------------------- * The types rational and float are pairwise disjoint subtypes of real. ------------------------------------------------------------------------------- Rationale: It might be thought that rational and float should form an exhaustive partition of the type real. This is purposely avoided here in order to permit compatible experimentation with extensions to the Common Lisp number system. ------------------------------------------------------------------------------- [change_end] * The types integer and ratio are disjoint subtypes of rational. ------------------------------------------------------------------------------- Rationale: It might be thought that integer and ratio should form an exhaustive partition of the type rational. This is purposely avoided here in order to permit compatible experimentation with extensions to the Common Lisp rational number system. ------------------------------------------------------------------------------- [old_change_begin] * The types fixnum and bignum are disjoint subtypes of integer. ------------------------------------------------------------------------------- Rationale: It might be thought that fixnum and bignum should form an exhaustive partition of the type integer. This is purposely avoided here in order to permit compatible experimentation with extensions to the Common Lisp integer number system, such as the idea of adding explicit representations of infinity or of positive and negative infinity. 
------------------------------------------------------------------------------- [old_change_end] [change_begin] X3J13 voted in January 1989 (FIXNUM-NON-PORTABLE) to specify that the types fixnum and bignum do in fact form an exhaustive partition of the type integer; more precisely, they voted to specify that the type bignum is by definition equivalent to (and integer (not fixnum)). This is consistent with the first edition text in section 2.1.1. I interpret this to mean that implementors could still experiment with such extensions as adding explicit representations of infinity, but such infinities would necessarily be of type bignum. [change_end] * The types short-float, single-float, double-float, and long-float are subtypes of float. Any two of them must be either disjoint or identical; if identical, then any other types between them in the above ordering must also be identical to them (for example, if single-float and long-float are identical types, then double-float must be identical to them also). * The type null is a subtype of symbol; the only object of type null is nil. * The types cons and null form an exhaustive partition of the type list. [old_change_begin] * The type standard-char is a subtype of string-char; string-char is a subtype of character. [old_change_end] [change_begin] X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type string-char. The preceding item is replaced by the following. * The type standard-char is a subtype of base-character. The types base-character and extended-character form an exhaustive partition of character. [change_end] [old_change_begin] * The type string is a subtype of vector, for string means (vector string-char). [old_change_end] [change_begin] X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type string-char. The preceding item is replaced by the following. * The type string is a subtype of vector; it is the union of all types (vector c) such that c is a subtype of character. 
[change_end] * The type bit-vector is a subtype of vector, for bit-vector means (vector bit). * The types (vector t), string, and bit-vector are disjoint. * The type vector is a subtype of array; for all types x, the type (vector x) is the same as the type (array x (*)). * The type simple-array is a subtype of array. [old_change_begin] * The types simple-vector, simple-string, and simple-bit-vector are disjoint subtypes of simple-array, for they respectively mean (simple-array t (*)), (simple-array string-char (*)), and (simple-array bit (*)). [old_change_end] [change_begin] X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to remove the type string-char. The preceding item is replaced by the following. * The types simple-vector, simple-string, and simple-bit-vector are disjoint subtypes of simple-array, for they mean (simple-array t (*)), the union of all types (simple-array c (*)) such that c is a subtype of character, and (simple-array bit (*)), respectively. [change_end] * The type simple-vector is a subtype of vector and indeed is a subtype of (vector t). * The type simple-string is a subtype of string. (Note that although string is a subtype of vector, simple-string is not a subtype of simple-vector.) ------------------------------------------------------------------------------- Rationale: The hypothetical name simple-general-vector would have been more accurate than simple-vector, but in this instance euphony and user convenience were deemed more important to the design of Common Lisp than a rigid symmetry. ------------------------------------------------------------------------------- * The type simple-bit-vector is a subtype of bit-vector. (Note that although bit-vector is a subtype of vector, simple-bit-vector is not a subtype of simple-vector.) * The types vector and list are disjoint subtypes of sequence. * The types random-state, readtable, package, pathname, stream, and hash-table are pairwise disjoint. 
[change_begin] X3J13 voted in June 1988 (DATA-TYPES-HIERARCHY-UNDERSPECIFIED) to make random-state, readtable, package, pathname, stream, and hash-table pairwise disjoint from a number of other types as well; see note above. X3J13 voted in January 1989 (STREAM-ACCESS) to introduce subtypes of type stream. * The types two-way-stream, echo-stream, broadcast-stream, file-stream, synonym-stream, string-stream, and concatenated-stream are disjoint subtypes of stream. [change_end] * Any two types created by defstruct are disjoint unless one is a supertype of the other by virtue of the :include option. [old_change_begin] * An exhaustive union for the type common is formed by the types cons, symbol, (array x) where x is either t or a subtype of common, string, fixnum, bignum, ratio, short-float, single-float, double-float, long-float, (complex x) where x is a subtype of common, standard-char, hash-table, readtable, package, pathname, stream, random-state, and all types created by the user via defstruct. An implementation may not unilaterally add subtypes to common; however, future revisions to the Common Lisp standard may extend the definition of the common data type. Note that a type such as number or array may or may not be a subtype of common, depending on whether or not the given implementation has extended the set of objects of that type. [old_change_end] [change_begin] X3J13 voted in March 1989 (COMMON-TYPE) to remove the type common from the language. [change_end]
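Several of the relationships listed in this section can be spot-checked at a Lisp prompt. The following is a minimal sketch, not part of the original text, and it assumes an implementation that follows the revised (X3J13) definitions given above. It uses the standard functions subtypep and typep; the helpers known-subtype-p and classify-number are hypothetical names introduced only for this illustration. Note that subtypep returns two values (the answer, and whether that answer is certain), so the helper treats an uncertain answer as "not known to be a subtype."

```lisp
;; Hypothetical helper (not part of Common Lisp): true only when SUBTYPEP
;; can say for certain that SUB is a subtype of SUPER.
(defun known-subtype-p (sub super)
  (multiple-value-bind (answer certain) (subtypep sub super)
    (and answer certain)))

;; Subtype relationships stated in this section.
(known-subtype-p 'integer 'rational)              ; true
(known-subtype-p 'rational 'real)                 ; true
(known-subtype-p 'null 'symbol)                   ; true
(known-subtype-p 'simple-string 'string)          ; true
(known-subtype-p 'simple-bit-vector 'bit-vector)  ; true

;; simple-string is not a subtype of simple-vector, even though
;; string is a subtype of vector.
(known-subtype-p 'simple-string 'simple-vector)   ; false

;; cons and null partition list: NIL is a list but not a cons.
(typep nil 'list)                                 ; true
(typep nil 'cons)                                 ; false

;; bignum is equivalent to (and integer (not fixnum)).
(typep (1+ most-positive-fixnum) 'bignum)                     ; true
(typep (1+ most-positive-fixnum) '(and integer (not fixnum))) ; true

;; Hypothetical helper: dispatch on numeric type without assuming that
;; rational, float, and complex exhaust the type number. The NUMBER
;; clause catches any additional kinds an implementation may provide.
(defun classify-number (n)
  (typecase n
    (rational :rational)
    (float    :float)
    (complex  :complex)
    (number   :other-number)
    (t        :not-a-number)))

(classify-number 1/2)      ; :RATIONAL
(classify-number 1.5)      ; :FLOAT
(classify-number #c(1 2))  ; :COMPLEX
```

The explicit number clause in classify-number reflects the caution stated at the start of this section: code should rely only on the relationships listed here, and should not treat float as the only remaining possibility once rational and complex have been ruled out.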