- The following document is a draft of the corresponding chapter of the
- version of the Ada Reference Manual produced in response to the ANSI
- Canvass. It is given a limited circulation to Ada implementers and to
- other groups contributing comments (according to the conventions defined in
- RRM.comments). This draft should not be referred to in any publication.
-
-
-
- ANSI-RM-02-v23 - Draft Chapter
-
- 2 Lexical Elements
- version 23
-
- 83-02-11
-
-
- This revision has addressed all comments up to #5795
-
-
- 2. Lexical Elements
-
-
-
- The text of a program consists of the texts of one or more compilations.
- The text of a compilation is a sequence of lexical elements, each composed
- of characters; the rules of composition are given in this chapter.
- Pragmas, which provide certain information for the compiler, are also
- described in this chapter.
-
- References: character 2.1, compilation 10.1, lexical element 2.2, pragma
- 2.8
-
-
-
-
- 2.1 Character Set
-
-
- The only characters allowed in the text of a program are the graphic
- characters and format effectors. Each graphic character corresponds to a
- unique code of the ISO seven-bit coded character set (ISO standard 646),
- and is represented (visually) by a graphical symbol. Some graphic
- characters are represented by different graphical symbols in alternative
- national representations of the ISO character set. The description of the
- language definition in this standard reference manual uses the ASCII
- graphical symbols, the ANSI graphical representation of the ISO character
- set.
-
- graphic_character ::= basic_graphic_character
-    | lower_case_letter | other_special_character
-
- basic_graphic_character ::=
-      upper_case_letter | digit
-    | special_character | space_character
-
- basic_character ::=
-      basic_graphic_character | format_effector
-
- The basic character set is sufficient for writing any program. The
- characters included in each of the categories of basic graphic characters
- are defined as follows:
-
- (a) upper case letters
- A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
-
- (b) digits
- 0 1 2 3 4 5 6 7 8 9
-
- (c) special characters
- " # & ' ( ) * + , - . / : ; < = > _ |
-
- (d) the space character
-
- Format effectors are the ISO (and ASCII) characters called horizontal
- tabulation, vertical tabulation, carriage return, line feed, and form feed.
-
- The characters included in each of the remaining categories of graphic
- characters are defined as follows:
-
- (e) lower case letters
- a b c d e f g h i j k l m n o p q r s t u v w x y z
-
- (f) other special characters
- ! $ % ? @ [ \ ] ^ ` { } ~
-
- Allowable replacements for the special characters vertical bar (|), sharp
- (#), and quotation (") are defined in section 2.10.
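-
- As an illustration only, the categories above can be modelled as sets in
- Python. This is a sketch and not part of the language definition; all
- names in it are hypothetical. It counts the tilde among the other special
- characters, in agreement with the table of names given in the Notes and
- with the 95 graphic characters mentioned in section 2.5.
-
```python
import string

# Basic graphic characters, categories (a) to (d).
UPPER_CASE_LETTERS = set(string.ascii_uppercase)
DIGITS = set(string.digits)
SPECIAL_CHARACTERS = set("\"#&'()*+,-./:;<=>_|")
SPACE_CHARACTER = {" "}

# Remaining graphic characters, categories (e) and (f).
LOWER_CASE_LETTERS = set(string.ascii_lowercase)
OTHER_SPECIAL_CHARACTERS = set("!$%?@[\\]^`{}~")

# Horizontal tab, vertical tab, carriage return, line feed, form feed.
FORMAT_EFFECTORS = set("\t\v\r\n\f")

BASIC_GRAPHIC_CHARACTERS = (
    UPPER_CASE_LETTERS | DIGITS | SPECIAL_CHARACTERS | SPACE_CHARACTER
)
GRAPHIC_CHARACTERS = (
    BASIC_GRAPHIC_CHARACTERS | LOWER_CASE_LETTERS | OTHER_SPECIAL_CHARACTERS
)
BASIC_CHARACTERS = BASIC_GRAPHIC_CHARACTERS | FORMAT_EFFECTORS


def is_allowed_in_program_text(ch):
    """True if ch is a graphic character or a format effector."""
    return ch in GRAPHIC_CHARACTERS or ch in FORMAT_EFFECTORS
```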
-
- Notes:
-
- The ISO character that corresponds to the sharp graphical symbol in the
- ASCII representation appears as a pound sterling symbol in the French,
- German, and United Kingdom standard national representations. In any case,
- the font design of graphical symbols (for example, whether they are in
- italic or bold typeface) is not part of the ISO standard.
-
- The meanings of the acronyms used in this section are as follows: ANSI
- stands for American National Standards Institute, ASCII stands for American
- Standard Code for Information Interchange, and ISO stands for International
- Organization for Standardization.
-
- The following names are used when referring to special characters and other
- special characters:
-
-   symbol   name                  symbol   name
-
-     "      quotation               >      greater than
-     #      sharp                   _      underline
-     &      ampersand               |      vertical bar
-     '      apostrophe              !      exclamation mark
-     (      left parenthesis        $      dollar
-     )      right parenthesis       %      percent
-     *      star, multiply          ?      question mark
-     +      plus                    @      commercial at
-     ,      comma                   [      left square bracket
-     -      hyphen, minus           \      back-slash
-     .      dot, point, period      ]      right square bracket
-     /      slash, divide           ^      circumflex
-     :      colon                   `      grave accent
-     ;      semicolon               {      left brace
-     <      less than               }      right brace
-     =      equal                   ~      tilde
-
-
-
-
- 2.2 Lexical Elements, Separators, and Delimiters
-
-
- The text of a program consists of the texts of one or more compilations.
- The text of each compilation is a sequence of separate lexical elements.
- Each lexical element is either a delimiter, an identifier (which may be a
- reserved word), a numeric literal, a character literal, a string literal,
- or a comment. The effect of a program depends only on the particular
- sequences of lexical elements that form its compilations, excluding the
- comments, if any.
-
- In some cases an explicit separator is required to separate adjacent
- lexical elements (namely, when without separation, interpretation as a
- single lexical element is possible). A separator is any of a space
- character, a format effector, or the end of a line. A space character is a
- separator except within a comment, a string literal, or a space character
- literal. Format effectors other than horizontal tabulation are always
- separators. Horizontal tabulation is a separator except within a comment.
-
- The end of a line is always a separator. The language does not define what
- causes the end of a line. However if, for a given implementation, the end
- of a line is signified by one or more characters, then these characters
- must be format effectors other than horizontal tabulation. In any case, a
- sequence of one or more format effectors other than horizontal tabulation
- must cause at least one end of line.
-
- One or more separators are allowed between any two adjacent lexical
- elements, before the first of each compilation, or after the last. At
- least one separator is required between an identifier or a numeric literal
- and an adjacent identifier or numeric literal.
-
- A delimiter is either one of the following special characters (in the basic
- character set)
-
- & ' ( ) * + , - . / : ; < = > |
-
- or one of the following compound delimiters each composed of two adjacent
- special characters
-
- => .. ** := /= >= <= << >> <>
-
- Each of the special characters listed for single character delimiters is a
- single delimiter except if this character is used as a character of a
- compound delimiter, or as a character of a comment, string literal,
- character literal, or numeric literal.
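-
- The rule above amounts to preferring a compound delimiter whenever two
- adjacent special characters form one. A minimal Python sketch of that
- choice follows; the function name is hypothetical, and the caller is
- assumed to have already excluded positions inside comments and literals.
-
```python
SINGLE_DELIMITERS = set("&'()*+,-./:;<=>|")
COMPOUND_DELIMITERS = {
    "=>", "..", "**", ":=", "/=", ">=", "<=", "<<", ">>", "<>",
}


def delimiter_at(text, pos):
    """Return the delimiter starting at text[pos], or None if there is none.

    Assumes pos is not inside a comment, string literal, character literal,
    or numeric literal; those cases are outside this sketch.
    """
    pair = text[pos:pos + 2]
    if pair in COMPOUND_DELIMITERS:
        return pair
    if pair == "--":
        return None  # two adjacent hyphens start a comment, not a delimiter
    if text[pos:pos + 1] in SINGLE_DELIMITERS:
        return text[pos]
    return None
```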
-
- The remaining forms of lexical element are described in other sections of
- this chapter.
-
- Notes:
-
- Each lexical element must fit on one line, since the end of a line is a
- separator. The quotation, sharp, and underline characters, likewise two
- adjacent hyphens, are not delimiters, but may form part of other lexical
- elements.
-
- The following names are used when referring to compound delimiters:
-
-   delimiter   name
-
-      =>       arrow
-      ..       double dot
-      **       double star, exponentiate
-      :=       assignment (pronounced: "becomes")
-      /=       inequality (pronounced: "not equal")
-      >=       greater than or equal
-      <=       less than or equal
-      <<       left label bracket
-      >>       right label bracket
-      <>       box
-
- References: character literal 2.5, comment 2.7, compilation 10.1, format
- effector 2.1, identifier 2.3, numeric literal 2.4, reserved word 2.9, space
- character 2.1, special character 2.1, string literal 2.6
-
-
-
-
- 2.3 Identifiers
-
-
- Identifiers are used as names and also as reserved words.
-
- identifier ::=
-      letter {[underline] letter_or_digit}
-
- letter_or_digit ::= letter | digit
-
- letter ::= upper_case_letter | lower_case_letter
-
- All characters of an identifier are significant, including any underline
- character inserted between a letter or digit and an adjacent letter or
- digit. Identifiers differing only in the use of corresponding upper and
- lower case letters are considered as the same.
-
- Examples:
-
- COUNT X get_symbol Ethelyn Marion
-
- SNOBOL_4 X1 PageCount STORE_NEXT_ITEM
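-
- The syntax rule and the rule on letter case can be checked mechanically.
- The following Python sketch is illustrative only; its names are
- hypothetical, and it does not distinguish reserved words (see 2.9).
-
```python
import re

# letter {[underline] letter_or_digit}
IDENTIFIER = re.compile(r"[A-Za-z](?:_?[A-Za-z0-9])*")


def is_identifier(text):
    """True if text has the form of an identifier."""
    return IDENTIFIER.fullmatch(text) is not None


def same_identifier(left, right):
    """True if both are identifiers differing at most in letter case."""
    return (
        is_identifier(left)
        and is_identifier(right)
        and left.upper() == right.upper()
    )
```
-
- Under this sketch, PageCount and PAGECOUNT denote the same identifier,
- whereas PageCount and Page_Count do not, since the underline is
- significant.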
-
- Note:
-
- No space is allowed within an identifier since a space is a separator.
-
- References: digit 2.1, lower case letter 2.1, name 4.1, reserved word 2.9,
- separator 2.2, space character 2.1, upper case letter 2.1
-
-
-
-
- 2.4 Numeric Literals
-
-
- There are two classes of numeric literals: real literals and integer
- literals. A real literal is a numeric literal that includes a point; an
- integer literal is a numeric literal without a point. Real literals are
- the literals of the type universal_real. Integer literals are the literals
- of the type universal_integer.
-
- numeric_literal ::= decimal_literal | based_literal
-
-
- References: literal 4.2, universal_integer type 3.5.4, universal_real type
- 3.5.6
-
-
-
-
- 2.4.1 Decimal Literals
-
-
- A decimal literal is a numeric literal expressed in the conventional
- decimal notation (that is, the base is implicitly ten).
-
- decimal_literal ::= integer [.integer] [exponent]
-
- integer ::= digit {[underline] digit}
-
- exponent ::= E [+] integer | E - integer
-
- An underline character inserted between adjacent digits of a decimal
- literal does not affect the value of this numeric literal. The letter E of
- the exponent, if any, can be written either in lower case or in upper case,
- with the same meaning.
-
- An exponent indicates the power of ten by which the value of the decimal
- literal without the exponent is to be multiplied to obtain the value of the
- decimal literal with the exponent. An exponent for an integer literal must
- not have a minus sign.
-
- Examples:
-
- 12 0 1E6 123_456 -- integer literals
-
- 12.0 0.0 0.456 3.14159_26 -- real literals
-
- 1.34E-12 1.0E+6 -- real literals with exponent
-
-
- Notes:
-
- Leading zeros are allowed. No space is allowed in a numeric literal, not
- even between constituents of the exponent, since a space is a separator. A
- zero exponent is allowed for an integer literal.
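-
- The rules for decimal literals can be summarized by a small evaluator.
- The Python sketch below is an illustration under stated assumptions, not
- a definitive implementation: the function name is hypothetical, and exact
- rational arithmetic (fractions.Fraction) is used so that no rounding is
- introduced.
-
```python
import re
from fractions import Fraction

INTEGER = r"[0-9](?:_?[0-9])*"
# integer [.integer] [exponent]
DECIMAL_LITERAL = re.compile(
    rf"({INTEGER})(?:\.({INTEGER}))?(?:[Ee]([+-]?)({INTEGER}))?"
)


def decimal_literal_value(text):
    """Return (class, value) where class is 'integer' or 'real'."""
    match = DECIMAL_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError("not a decimal literal: " + repr(text))
    whole, fraction, sign, exponent = match.groups()
    is_real = fraction is not None
    if sign == "-" and not is_real:
        raise ValueError("exponent of an integer literal has a minus sign")
    # Underlines do not affect the value.
    value = Fraction(int(whole.replace("_", "")))
    if is_real:
        digits = fraction.replace("_", "")
        value += Fraction(int(digits), 10 ** len(digits))
    if exponent is not None:
        power = int(exponent.replace("_", ""))
        value *= Fraction(10) ** (-power if sign == "-" else power)
    return ("real" if is_real else "integer", value)
```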
-
- References: digit 2.1, lower case letter 2.1, numeric literal 2.4,
- separator 2.2, space character 2.1, upper case letter 2.1
-
-
-
-
- 2.4.2 Based Literals
-
-
- A based literal is a numeric literal expressed in a form that specifies the
- base explicitly. The base must be at least two and at most sixteen.
-
- based_literal ::=
-      base # based_integer [.based_integer] # [exponent]
-
- base ::= integer
-
- based_integer ::=
-      extended_digit {[underline] extended_digit}
-
- extended_digit ::= digit | letter
-
- An underline character inserted between adjacent digits of a based literal
- does not affect the value of this numeric literal. The base and the
- exponent, if any, are in decimal notation. The only letters allowed as
- extended digits are the letters A through F for the digits ten through
- fifteen. A letter in a based literal (either an extended digit or the
- letter E of an exponent) can be written either in lower case or in upper
- case, with the same meaning.
-
- The conventional meaning of based notation is assumed; in particular the
- value of each extended digit of a based literal must be less than the base.
- An exponent indicates the power of the base by which the value of the based
- literal without the exponent is to be multiplied to obtain the value of the
- based literal with the exponent.
-
- Examples:
-
- 2#1111_1111# 16#FF# 016#0FF# -- integer literals of value 255
- 16#E#E1 2#1110_0000# -- integer literals of value 224
- 16#F.FF#E+2 2#1.1111_1111_111#E11 -- real literals of value 4095.0
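-
- The values stated in the examples above can be reproduced with the
- following Python sketch. It is an illustration only: the function names
- are hypothetical, and exact rational arithmetic is assumed in place of
- the universal types.
-
```python
import re
from fractions import Fraction

INTEGER = r"[0-9](?:_?[0-9])*"
BASED_INTEGER = r"[0-9A-Fa-f](?:_?[0-9A-Fa-f])*"
# base # based_integer [.based_integer] # [exponent]
BASED_LITERAL = re.compile(
    rf"({INTEGER})#({BASED_INTEGER})(?:\.({BASED_INTEGER}))?#"
    rf"(?:[Ee]([+-]?)({INTEGER}))?"
)


def digits_value(digits, base):
    """Value of a based_integer; each extended digit must be below base."""
    value = 0
    for ch in digits.replace("_", ""):
        digit = int(ch, 16)
        if digit >= base:
            raise ValueError("extended digit not less than the base")
        value = value * base + digit
    return value


def based_literal_value(text):
    """Return the exact value of a based literal as a Fraction."""
    match = BASED_LITERAL.fullmatch(text)
    if match is None:
        raise ValueError("not a based literal: " + repr(text))
    base_text, whole, fraction, sign, exponent = match.groups()
    base = int(base_text.replace("_", ""))  # the base is in decimal notation
    if not 2 <= base <= 16:
        raise ValueError("base must be at least two and at most sixteen")
    value = Fraction(digits_value(whole, base))
    if fraction is not None:
        places = len(fraction.replace("_", ""))
        value += Fraction(digits_value(fraction, base), base ** places)
    elif sign == "-":
        raise ValueError("exponent of an integer literal has a minus sign")
    if exponent is not None:
        power = int(exponent.replace("_", ""))  # a power of the base
        value *= Fraction(base) ** (-power if sign == "-" else power)
    return value
```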
-
- References: digit 2.1, exponent 2.4.1, letter 2.3, lower case letter 2.1,
- numeric literal 2.4, upper case letter 2.1
-
-
-
-
- 2.5 Character Literals
-
-
- A character literal is formed by enclosing one of the 95 graphic
- characters (including the space) between two apostrophe characters. A
- character literal has a value that belongs to a character type.
-
- character_literal ::= 'graphic_character'
-
- Examples:
-
- 'A' '*' ''' ' '
-
- References: character type 3.5.2, graphic character 2.1, literal 4.2,
- space character 2.1
-
-
-
-
- 2.6 String Literals
-
-
- A string literal is formed by a sequence of graphic characters (possibly
- none) enclosed between two quotation characters used as string brackets.
-
- string_literal ::= "{graphic_character}"
-
- A string literal has a value that is a sequence of character values
- corresponding to the graphic characters of the string literal apart from
- the quotation character itself. If a quotation character value is to be
- represented in the sequence of character values, then a pair of adjacent
- quotation characters must be written at the corresponding place within the
- string literal. (This means that a string literal that includes two
- adjacent quotation characters is never interpreted as two adjacent string
- literals.)
-
- The length of a string literal is the number of character values in the
- sequence represented. (Each doubled quotation character is counted as a
- single character.)
-
- Examples:
-
- "Message of the day:"
-
- "" -- an empty string literal
- " " "A" """" -- three string literals of length 1
-
- "Characters such as $, %, and } are allowed in string literals"
-
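- The correspondence between a string literal and its value, including the
- treatment of doubled quotation characters, can be sketched in Python as
- follows. The function name is hypothetical, and the percent replacement
- of section 2.10 is not handled.
-
```python
def string_literal_value(literal):
    """Return the sequence of character values denoted by a string literal."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("a string literal is enclosed in quotation characters")
    body = literal[1:-1]
    values = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == '"':
            # A quotation character inside the literal must be doubled;
            # the pair stands for a single quotation character value.
            if body[i + 1:i + 2] != '"':
                raise ValueError("unpaired quotation character")
            i += 2
        else:
            if not " " <= ch <= "~":
                raise ValueError("only graphic characters are allowed")
            i += 1
        values.append(ch)
    return "".join(values)
```
-
- The length of the literal is then the length of the returned sequence, so
- that each of the three literals " ", "A", and """" has length 1.
-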
- Note:
-
- A string literal must fit on one line since it is a lexical element (see
- 2.2). Longer sequences of graphic character values can be obtained by
- catenation of string literals. Similarly catenation of constants declared
- in the package ASCII can be used to obtain sequences of character values
- that include nongraphic character values (the so-called control
- characters). Examples of such uses of catenation are given below:
-
- "FIRST PART OF A SEQUENCE OF CHARACTERS " &
- "THAT CONTINUES ON THE NEXT LINE"
-
- "sequence that includes the" & ASCII.ACK & "control character"
-
- References: ascii predefined package C, catenation operation 4.5.3,
- character value 3.5.2, constant 3.2.1, declaration 3.1, end of a line 2.2,
- graphic character 2.1, lexical element 2.2
-
-
-
-
- 2.7 Comments
-
-
- A comment starts with two adjacent hyphens and extends up to the end of the
- line. A comment can appear on any line of a program. The presence or
- absence of comments has no influence on whether a program is legal or
- illegal. Furthermore, comments do not influence the effect of a program;
- their sole purpose is the enlightenment of the human reader.
-
- Examples:
-
- -- the last sentence above echoes the Algol 68 report
-
- end; -- processing of LINE is complete
-
- -- a long comment may be split onto
- -- two or more consecutive lines
-
- ---------------- the first two hyphens start the comment
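-
- Since comments do not influence the effect of a program, a tool may
- discard them line by line. The Python sketch below is one possible way to
- do so and rests on stated assumptions: the function name is hypothetical,
- string literals use quotation characters (not the percent replacement of
- section 2.10), and the only character literal given special treatment is
- the one that encloses a quotation character.
-
```python
def strip_comment(line):
    """Return line without its comment, if any."""
    i = 0
    in_string = False
    while i < len(line):
        if in_string:
            # A doubled quotation closes and immediately reopens the
            # string, which leaves the scan in the right state.
            if line[i] == '"':
                in_string = False
            i += 1
        elif line.startswith("'\"'", i):
            i += 3  # character literal enclosing a quotation character
        elif line[i] == '"':
            in_string = True
            i += 1
        elif line.startswith("--", i):
            return line[:i]  # two adjacent hyphens start the comment
        else:
            i += 1
    return line
```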
-
- Note:
-
- Horizontal tabulation can be used in comments, after the double hyphen,
- and is equivalent to one or more spaces (see 2.2).
-
- References: end of a line 2.2, illegal 1.6, legal 1.6, space character 2.1
-
-
-
-
- 2.8 Pragmas
-
-
- A pragma is used to convey information to the compiler. A pragma starts
- with the reserved word pragma followed by an identifier that is the name of
- the pragma.
-
- pragma ::=
-      pragma identifier [(argument_association {, argument_association})];
-
- argument_association ::=
-      [argument_identifier =>] name
-    | [argument_identifier =>] expression
-
- Pragmas are only allowed at the following places in a program:
-
- - After a semicolon delimiter, but not within a formal part or
- discriminant part.
-
- - At any place where the syntax rules allow a construct defined by a
- syntactic category whose name ends with "declaration", "statement",
- "clause", or "alternative", or one of the syntactic categories variant
- and exception handler; but not in place of such a construct. Also at
- any place where a compilation unit would be allowed.
-
- Additional restrictions exist for the placement of specific pragmas.
-
- Some pragmas have arguments. Argument associations can be either
- positional or named as for parameter associations of subprogram calls (see
- 6.4). Named associations are, however, only possible if the argument
- identifiers are defined. A name given in an argument must be either a name
- visible at the place of the pragma or an identifier specific to the pragma.
-
- The pragmas defined by the language are described in Annex B: they must be
- supported by every implementation. In addition, an implementation may
- provide implementation-defined pragmas, which must then be described in
- Appendix F. An implementation is not allowed to define pragmas whose
- presence or absence influences the legality of the text outside such
- pragmas. Consequently, the legality of a program does not depend on the
- presence or absence of implementation-defined pragmas.
-
- A pragma that is not language-defined has no effect if its identifier is
- not recognized by the (current) implementation. Furthermore, a pragma
- (whether language-defined or implementation-defined) has no effect if its
- placement or its arguments do not correspond to what is allowed for the
- pragma. The region of text over which a pragma has an effect depends on
- the pragma.
-
- Examples:
-
- pragma LIST(OFF);
- pragma OPTIMIZE(TIME);
- pragma INLINE(SETMASK);
- pragma SUPPRESS(RANGE_CHECK, ON => INDEX);
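-
- The shape of the examples above can be recognized with the following
- Python sketch. It is deliberately restricted: the function name is
- hypothetical, and only arguments that are simple identifiers are
- accepted, whereas the syntax allows any name or expression.
-
```python
import re

IDENT = r"[A-Za-z](?:_?[A-Za-z0-9])*"
# pragma identifier [(argument_association {, argument_association})];
PRAGMA = re.compile(
    rf"\s*pragma\s+({IDENT})\s*(?:\(([^()]*)\))?\s*;\s*", re.IGNORECASE
)
# [argument_identifier =>] name, with the name limited to an identifier
ARGUMENT = re.compile(rf"\s*(?:({IDENT})\s*=>\s*)?({IDENT})\s*")


def parse_simple_pragma(text):
    """Return (pragma name, [(argument identifier or None, argument)])."""
    match = PRAGMA.fullmatch(text)
    if match is None:
        raise ValueError("not a pragma of the supported form")
    name, argument_text = match.groups()
    arguments = []
    if argument_text is not None:
        for part in argument_text.split(","):
            association = ARGUMENT.fullmatch(part)
            if association is None:
                raise ValueError("unsupported argument: " + repr(part))
            arguments.append(association.groups())
    return name.upper(), arguments
```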
-
- Note:
-
- It is recommended (but not required) that implementations issue warnings
- for pragmas that are not recognized and therefore ignored.
-
- References: compilation unit 10.1, delimiter 2.2, discriminant part 3.7.1,
- exception handler 11.2, expression 4.4, formal part 6.1, identifier 2.3,
- implementation-defined pragma F, language-defined pragma B, legal 1.6, name
- 4.1, reserved word 2.9, statement 5, static expression 4.9, variant 3.7.3,
- visibility 8.3
-
- Categories ending with "declaration" comprise: basic declaration 3.1,
- component declaration 3.7, entry declaration 9.5, generic parameter
- declaration 12.1
-
- Categories ending with "clause" comprise: alignment clause 13.4, component
- clause 13.4, context clause 10.1.1, representation clause 13.1, use clause
- 8.4, with clause 10.1.1
-
- Categories ending with "alternative" comprise: accept alternative 9.7.1,
- case statement alternative 5.4, delay alternative 9.7.1, select alternative
- 9.7.1, selective wait alternative 9.7.1, terminate alternative 9.7.1
-
-
-
-
- 2.9 Reserved Words
-
-
- The identifiers listed below are called reserved words and are reserved for
- special significance in the language. For readability of this manual, the
- reserved words appear in lower case boldface.
-
-
- abort       declare     generic     of          select
- abs         delay       goto        or          separate
- accept      delta                   others      subtype
- access      digits      if          out
- all         do          in                      task
- and                     is          package     terminate
- array                               pragma      then
- at          else                    private     type
-             elsif       limited     procedure
-             end         loop
- begin       entry                   raise       use
- body        exception               range
-             exit        mod         record      when
-                                     rem         while
-                         new         renames     with
- case        for         not         return
- constant    function    null        reverse     xor
-
- A reserved word must not be used as a declared identifier.
-
- Notes:
-
- Reserved words differing only in the use of corresponding upper and lower
- case letters are considered as the same (see 2.3). In some attributes the
- identifier that appears after the apostrophe is identical to some reserved
- word.
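-
- A check for reserved words that ignores letter case, as required by the
- note above, might look as follows in Python. This is a sketch; the names
- are hypothetical, and the set simply transcribes the list of this
- section.
-
```python
RESERVED_WORDS = frozenset("""
    abort abs accept access all and array at
    begin body
    case constant
    declare delay delta digits do
    else elsif end entry exception exit
    for function
    generic goto
    if in is
    limited loop
    mod
    new not null
    of or others out
    package pragma private procedure
    raise range record rem renames return reverse
    select separate subtype
    task terminate then type
    use
    when while with
    xor
""".split())


def is_reserved_word(identifier):
    """True if identifier is a reserved word, whatever its letter case."""
    return identifier.lower() in RESERVED_WORDS
```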
-
- References: attribute 4.1.4, declaration 3.1, identifier 2.3, lower case
- letter 2.1, upper case letter 2.1
-
-
-
-
- 2.10 Allowable Replacements of Characters
-
-
- The following replacements are allowed for the vertical bar, sharp, and
- quotation basic characters:
-
- - A vertical bar character (|) can be replaced by an exclamation mark
- (!) where used as a delimiter.
-
- - The sharp characters (#) of a based literal can be replaced by colons
- (:) provided that the replacement is done for both occurrences.
-
- - The quotation characters (") used as string brackets at both ends of a
- string literal can be replaced by percent characters (%) provided that
- the enclosed sequence of characters contains no quotation character,
- and provided that both string brackets are replaced. Any percent
- character within the sequence of characters must then be doubled and
- each such doubled percent character is interpreted as a single percent
- character value.
-
- These replacements do not change the meaning of the program.
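-
- Because the replacements do not change the meaning, a tool may map the
- replaced forms back to the basic forms before further processing. The
- Python sketch below illustrates this for string literals and based
- literals; the function names are hypothetical, and each function is
- assumed to receive a single, already isolated lexical element.
-
```python
def normalize_string_literal(literal):
    """Rewrite a %...% string literal with quotation string brackets."""
    if len(literal) < 2 or literal[0] != "%" or literal[-1] != "%":
        return literal  # not written with percent string brackets
    body = literal[1:-1]
    if '"' in body:
        raise ValueError("quotation character inside a %...% string literal")
    result = []
    i = 0
    while i < len(body):
        if body[i] == "%":
            # A percent inside the literal must be doubled; the pair
            # stands for a single percent character value.
            if body[i + 1:i + 2] != "%":
                raise ValueError("percent character is not doubled")
            i += 2
        else:
            i += 1
        result.append(body[i - 1])
    return '"' + "".join(result) + '"'


def normalize_based_literal(literal):
    """Rewrite a based literal written with colons to use sharps."""
    if literal.count(":") == 2 and "#" not in literal:
        return literal.replace(":", "#")
    return literal  # already uses sharps, or is not a based literal
```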
-
- Notes:
-
- It is recommended that use of the replacements for the vertical bar, sharp,
- and quotation characters be restricted to cases where the corresponding
- graphical symbols are not available. Note that the vertical bar appears as
- a broken bar on some equipment; replacement is not recommended in this
- case.
-
- The rules given for identifiers and numeric literals are such that lower
- case and upper case letters can be used indifferently; these lexical
- elements can thus be written using only characters of the basic character
- set. If a string literal of the predefined type STRING contains characters
- that are not in the basic character set, the same sequence of character
- values can be obtained by catenating string literals that contain only
- characters of the basic character set with suitable character constants
- declared in the predefined package ASCII. Thus the string literal "AB$CD"
- could be replaced by "AB" & ASCII.DOLLAR & "CD". Similarly, the string
- literal "ABcd" with lower case letters could be replaced by "AB" &
- ASCII.LC_C & ASCII.LC_D.
-
- References: ascii predefined package C, based literal 2.4.2, basic
- character 2.1, catenation operation 4.5.3, character value 3.5.2, delimiter
- 2.2, graphic character 2.1, graphical symbol 2.1, identifier 2.3, lexical
- element 2.2, lower case letter 2.1, numeric literal 2.4, string bracket
- 2.6, string literal 2.6, upper case letter 2.1