- SGMLS(1) UNIX System V SGMLS(1)
-
-
-
- NAME
- sgmls - a validating SGML parser
-
- An SGML System Conforming to
- International Standard ISO 8879 -
- Standard Generalized Markup Language
-
- SYNOPSIS
- sgmls [ -deglprsuv ] [ -cfile ] [ -iname ] [ filename... ]
-
- DESCRIPTION
- Sgmls parses and validates the SGML document entity in
- filename... and prints on the standard output a simple ASCII
- representation of its Element Structure Information Set.
- (This is the information set which a structure-controlled
- conforming SGML application should act upon.) Note that the
- document entity may be spread amongst several files; for
- example, the SGML declaration, document type declaration and
- document instance set could each be in a separate file. If
- no filenames are specified, then sgmls will read the
- document entity from the standard input. A filename of -
- can also be used to refer to the standard input.
-
- The following options are available:
-
- -cfile
- Write a report of capacity usage to file. The report
- is in the format of a RACT result. RACT is the
- Reference Application for Capacity Testing defined in
- the Proposed American National Standard Conformance
- Testing for Standard Generalized Markup Language (SGML)
- Systems (X3.190-199X), Draft July 1991.
-
- -d Warn about duplicate entity declarations.
-
- -e Describe open entities in error messages. Error
- messages always include the position of the most
- recently opened external entity.
-
- -g Show the generic identifiers (GIs) of open elements in
- error messages.
-
- -iname
- Pretend that
-
- <!ENTITY % name "INCLUDE">
-
- occurs at the start of the document type declaration
- subset in the SGML document entity. Since repeated
- definitions of an entity are ignored, this definition
- will take precedence over any other definitions of this
- entity in the document type declaration. Multiple -i
- options are allowed. If the SGML declaration replaces
- the reserved name INCLUDE then the new reserved name
- will be the replacement text of the entity. Typically
- the document type declaration will contain
-
- <!ENTITY % name "IGNORE">
-
- and will use %name; in the status keyword specification
- of a marked section declaration. In this case the
- effect of the option will be to cause the marked
- section not to be ignored.
-
- -l Output L commands giving the current line number and
- filename.
-
- -p Parse only the prolog. Sgmls will exit after parsing
- the document type declaration. Implies -s.
-
- -r Warn about defaulted references.
-
- -s Suppress output. Error messages will still be printed.
-
- -u Warn about undefined elements: elements used in the DTD
- but not defined. Also warn about undefined short
- reference maps.
-
- -v Print the version number.
-
- Entity Manager
- An external entity resides in one or more files. The entity
- manager component of sgmls maps a sequence of files into an
- entity in three sequential stages:
-
- 1. each carriage return character is turned into a non-
- SGML character;
-
- 2. each newline character is turned into a record end
- character, and at the same time a record start
- character is inserted at the beginning of each line;
-
- 3. the files are concatenated.
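
The three stages above can be sketched in Python as follows. This is only an illustration, not the sgmls implementation. It assumes the reference concrete syntax, where record start (RS) is character 10 and record end (RE) is character 13. The NON_SGML placeholder and the treatment of a final line without a trailing newline are assumptions made for this sketch.

```python
RS = "\n"          # record start: character 10 in the reference concrete syntax
RE = "\r"          # record end: character 13 in the reference concrete syntax
NON_SGML = "\x00"  # hypothetical stand-in for "a non-SGML character"


def files_to_entity(file_contents):
    """Map a sequence of file contents (strings) to a single entity string."""
    parts = []
    for text in file_contents:
        # Stage 1: each carriage return becomes a non-SGML character.
        text = text.replace("\r", NON_SGML)
        # Stage 2: each newline becomes RE, and RS is inserted at the
        # beginning of each line.
        pieces = text.split("\n")
        tail = pieces.pop()  # text after the final newline (may be empty)
        parts.extend(RS + piece + RE for piece in pieces)
        if tail:
            # Assumption: an unterminated last line still gets an RS.
            parts.append(RS + tail)
    # Stage 3: the files are concatenated.
    return "".join(parts)
```

Doing stage 1 first matters in this sketch: a carriage return that was already in the file must not be confused with the RE characters produced in stage 2.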
-
- A system identifier is interpreted as a list of filenames
- separated by colons. A filename of - can be used to refer
- to the standard input. If no system identifier is supplied,
- then the entity manager will attempt to generate a filename
- using the public identifier (if there is one) and other
- information available to it. Notation identifiers are not
- subject to this treatment. This process is controlled by
- the environment variable SGML_PATH; this contains a colon-
- separated list of filename templates. A filename template
- is a filename that may contain substitution fields; a
- substitution field is a % character followed by a single
- letter that indicates the value of the substitution. If
- SGML_PATH uses the %S field (the value of which is the
- system identifier), then the entity manager will also use
- SGML_PATH to generate a filename when a system identifier
- that does not contain any colons is supplied. The value of
- a substitution can either be a string or it can be null.
- The entity manager transforms the list of filename templates
- into a list of filenames by substituting for each
- substitution field and discarding any template that
- contained a substitution field whose value was null. It
- then uses the first resulting filename that exists and is
- readable. Substitution values are transformed before being
- used for substitution: firstly, any names that were subject
- to upper case substitution are folded to lower case;
- secondly, space characters are mapped to underscores and
- slashes are mapped to percents. The value of the %S field
- is not transformed. The values of substitution fields are
- as follows:
-
- %% A single %.
-
- %D The entity's data content notation. This substitution
- will succeed only for external data entities.
-
- %N The entity, notation or document type name.
-
- %P The public identifier if there was a public identifier,
- otherwise null.
-
- %S The system identifier if there was a system identifier,
- otherwise null.
-
- %X (This is provided mainly for compatibility with
- ARCSGML.) A three-letter string chosen as follows:
-
-                              No public    With public identifier
-                              identifier   Device        Device
-                                           independent   dependent
- Data or subdocument entity   nsd          pns           vns
- General SGML text entity     gml          pge           vge
- Parameter entity             spe          ppe           vpe
- Document type definition     dtd          pdt           vdt
- Link process definition      lpd          plp           vlp
-
- The device dependent version is selected if the public
- text class allows a public text display version but no
- public text display version was specified.
-
- %Y The type of thing for which the filename is being
- generated:
-
- SGML subdocument entity    sgml
- Data entity                data
- General text entity        text
- Parameter entity           parm
- Document type definition   dtd
- Link process definition    lpd
-
- The value of the following substitution fields will be null
- unless a valid formal public identifier was supplied.
-
- %A Null if the text identifier in the formal public
- identifier contains an unavailable text indicator,
- otherwise the empty string.
-
- %C The public text class, mapped to lower case.
-
- %E The public text designating sequence (escape sequence)
- if the public text class is CHARSET, otherwise null.
-
- %I The empty string if the owner identifier in the formal
- public identifier is an ISO owner identifier, otherwise
- null.
-
- %L The public text language, mapped to lower case, unless
- the public text class is CHARSET, in which case null.
-
- %O The owner identifier (with the +// or -// prefix
- stripped).
-
- %R The empty string if the owner identifier in the formal
- public identifier is a registered owner identifier,
- otherwise null.
-
- %T The public text description.
-
- %U The empty string if the owner identifier in the formal
- public identifier is an unregistered owner identifier,
- otherwise null.
-
- %V The public text display version. This substitution
- will be null if the public text class does not allow a
- display version or if no version was specified. If an
- empty version was specified, a value of default will be
- used.
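
The template expansion described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the sgmls code. Field values are passed in a dict keyed by the field letter, with None standing for null (an empty string is a valid non-null value). The lower-case folding of names subject to upper case substitution is left out because it needs information the sketch does not model. All function names and the paths in the usage example are hypothetical.

```python
import os


def transform(value):
    """Map spaces to underscores and slashes to percents."""
    return value.replace(" ", "_").replace("/", "%")


def expand_template(template, fields):
    """Expand one filename template; return None if any field used is null."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            letter = template[i + 1]
            i += 2
            if letter == "%":
                out.append("%")  # %% is a single %
                continue
            value = fields.get(letter)
            if value is None:
                return None  # null value: the whole template is discarded
            # The value of the %S field is not transformed.
            out.append(value if letter == "S" else transform(value))
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def candidate_filenames(sgml_path, fields):
    """Expand every template in a colon-separated SGML_PATH value."""
    expanded = (expand_template(t, fields) for t in sgml_path.split(":"))
    return [name for name in expanded if name is not None]


def first_readable(names):
    """Return the first candidate that exists and is readable, else None."""
    for name in names:
        if os.access(name, os.R_OK):
            return name
    return None
```

For example, with the hypothetical value `/usr/lib/sgml/%P:%S:/usr/lib/sgml/%N.%Y` and no system identifier, the `%S` template is discarded and two candidates remain:

```python
fields = {"N": "doc", "Y": "dtd", "S": None,
          "P": "-//Acme//DTD My Doc//EN"}
names = candidate_filenames("/usr/lib/sgml/%P:%S:/usr/lib/sgml/%N.%Y", fields)
# names[0] is "/usr/lib/sgml/-%%Acme%%DTD_My_Doc%%EN"
# names[1] is "/usr/lib/sgml/doc.dtd"
```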
-
- System declaration
- The system declaration for sgmls is as follows:
-
-                      SYSTEM "ISO 8879:1986"
-
-                             CHARSET
- BASESET   "ISO 646-1983//CHARSET
-            International Reference Version (IRV)//ESC 2/5 4/0"
- DESCSET   0 128 0
- CAPACITY  PUBLIC   "ISO 8879:1986//CAPACITY Reference//EN"
-
-                             FEATURES
- MINIMIZE  DATATAG  NO   OMITTAG   YES    RANK      NO   SHORTTAG  YES
- LINK      SIMPLE   NO   IMPLICIT  NO     EXPLICIT  NO
- OTHER     CONCUR   NO   SUBDOC    YES 1  FORMAL    YES
- SCOPE     DOCUMENT
- SYNTAX    PUBLIC   "ISO 8879:1986//SYNTAX Reference//EN"
- SYNTAX    PUBLIC   "ISO 8879:1986//SYNTAX Core//EN"
-
-                             VALIDATE
-           GENERAL  YES  MODEL     YES    EXCLUDE   YES  CAPACITY  YES
-           NONSGML  YES  SGML      YES    FORMAL    YES
-
-                               SDIF
-           PACK     NO   UNPACK    NO
-
- The memory usage of sgmls is not a function of the capacity
- points used by a document; however, sgmls can handle
- capacities significantly greater than the reference capacity
- set.
-
- In some environments, higher values may be supported for the
- SUBDOC parameter.
-
- Documents that do not use optional features are also
- supported. For example, if FORMAL NO is specified in the
- SGML declaration, public identifiers will not be required to
- be valid formal public identifiers.
-
- Certain parts of the concrete syntax may be changed:
-
- The shunned character numbers can be changed.
-
- Eight bit characters can be assigned to LCNMSTRT,
- UCNMSTRT, LCNMCHAR and UCNMCHAR. Declaring this
- requires that the syntax reference character set be
- declared like this:
-
- BASESET  "ISO Registration Number 100//CHARSET
-           ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
- DESCSET  0 256 0
-
- Uppercase substitution can be performed or not
- performed both for entity names and for other names.
-
- Either short reference delimiters assigned by the
- reference delimiter set or no short reference
- delimiters are supported.
-
- The reserved names can be changed.
-
- The quantity set can be increased within certain limits
- subject to there being sufficient memory available.
- The upper limit on NAMELEN is 239. The upper limits on
- ATTCNT, ATTSPLEN, BSEQLEN, ENTLVL, LITLEN, PILEN,
- TAGLEN, and TAGLVL are more than thirty times greater
- than the reference limits. The upper limit on GRPCNT,
- GRPGTCNT, and GRPLVL is 253. NORMSEP cannot be
- changed. DTAGLEN and DTEMPLEN are irrelevant since sgmls
- does not support the DATATAG feature.
-
- SGML declaration
- The SGML declaration may be omitted; in that case the
- following declaration will be implied:
-
-                      <!SGML "ISO 8879:1986"
-
-                             CHARSET
- BASESET   "ISO 646-1983//CHARSET
-            International Reference Version (IRV)//ESC 2/5 4/0"
- DESCSET     0  9 UNUSED
-             9  2  9
-            11  2 UNUSED
-            13  1 13
-            14 18 UNUSED
-            32 95 32
-           127  1 UNUSED
- CAPACITY  PUBLIC   "ISO 8879:1986//CAPACITY Reference//EN"
- SCOPE     DOCUMENT
- SYNTAX    PUBLIC   "ISO 8879:1986//SYNTAX Reference//EN"
-
-                             FEATURES
- MINIMIZE  DATATAG  NO   OMITTAG   YES           RANK      NO   SHORTTAG  YES
- LINK      SIMPLE   NO   IMPLICIT  NO            EXPLICIT  NO
- OTHER     CONCUR   NO   SUBDOC    YES 99999999  FORMAL    YES
-
-                           APPINFO NONE>
-
- with the exception that characters 128 through 254 will be
- assigned to DATACHAR. When exporting documents that use
- characters in this range, an accurate description of the
- upper half of the document character set should be added to
- this declaration. For ISO Latin-1, an appropriate
- description would be:
-
- BASESET  "ISO Registration Number 100//CHARSET
-           ECMA-94 Right Part of Latin Alphabet Nr. 1//ESC 2/13 4/1"
- DESCSET  128 32 UNUSED
-          160 95 32
-          255  1 UNUSED
-
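
As a sanity check on the arithmetic in the two DESCSET parameters above, the following Python sketch verifies that each list of (start, count, target) triples covers a contiguous run of character numbers. The function name is hypothetical, None stands for UNUSED, and the contiguity check is only a convenience for reading these particular tables.

```python
def covered_range(descset):
    """Return (first, last) character numbers covered by contiguous triples."""
    first = descset[0][0]
    expected = first
    for start, count, _target in descset:
        if start != expected:
            raise ValueError(f"gap or overlap at character {start}")
        expected = start + count
    return first, expected - 1


# Triples copied from the implied declaration (None stands for UNUSED).
implied = [(0, 9, None), (9, 2, 9), (11, 2, None), (13, 1, 13),
           (14, 18, None), (32, 95, 32), (127, 1, None)]

# Triples copied from the ISO Latin-1 description of the upper half.
latin1_upper = [(128, 32, None), (160, 95, 32), (255, 1, None)]
```

Here `covered_range(implied)` gives characters 0 through 127 and `covered_range(latin1_upper)` gives 128 through 255, so together the two descriptions account for all 256 character numbers.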
- Output format
- The output is a series of lines. Lines can be arbitrarily
- long. Each line consists of an initial command character
- and one or more arguments. Arguments are separated by a
- single space, but when a command takes a fixed number of
- arguments the last argument can contain spaces. There is no
- space between the command character and the first argument.
- Arguments can contain the following escape sequences.
-
- \\ A \.
-
- \n A record end character.
-
- \| Internal SDATA entities are bracketed by these.
-
- \nnn The character whose code is nnn octal.
-
- A record start character will be represented by \012. Most
- applications will need to ignore \012 and translate \n into
- newline.
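
A minimal Python sketch of decoding these escape sequences, following the advice above to ignore \012 and translate \n into newline. The function name and the sdata_marker parameter are hypothetical, and the sketch assumes octal escapes always have exactly three digits.

```python
import re

# One backslash followed by: a backslash, "n", "|", or three octal digits.
_ESCAPE = re.compile(r"\\(\\|n|\||[0-7]{3})")


def unescape(arg, sdata_marker=""):
    """Decode the escape sequences in one argument of an output line."""
    def replace(match):
        token = match.group(1)
        if token == "\\":
            return "\\"            # \\ is a single backslash
        if token == "n":
            return "\n"            # record end, translated to newline
        if token == "|":
            return sdata_marker    # bracket around an internal SDATA entity
        if token == "012":
            return ""              # record start, ignored
        return chr(int(token, 8))  # \nnn is the character with octal code nnn
    return _ESCAPE.sub(replace, arg)
```

Passing a visible `sdata_marker` is useful when the application needs to find SDATA entity text in order to replace it with something system-specific.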
-
- The possible command characters and arguments are as
- follows:
-
- (gi The start of an element whose generic identifier is gi.
- Any attributes for this element will have been
- specified with A commands.
-
- )gi The end of an element whose generic identifier is gi.
-
- -data
- Data.
-
- &name
- A reference to an external data entity name; name will
- have been defined using an E command.
-
- ?pi A processing instruction with data pi.
-
- Aname val
- The next element to start has an attribute name with
- value val which takes one of the following forms:
-
- IMPLIED
- The value of the attribute is implied.
-
- CDATA data
- The attribute is character data. This is used for
- attributes whose declared value is CDATA.
-
- NOTATION nname
- The attribute is a notation name; nname will have
- been defined using an N command. This is used for
- attributes whose declared value is NOTATION.
-
- ENTITY name...
- The attribute is a list of general entity names.
- Each entity name will have been defined using an
- I, E or S command. This is used for attributes
- whose declared value is ENTITY or ENTITIES.
-
- TOKEN token...
- The attribute is a list of tokens. This is used
- for attributes whose declared value is anything
- else.
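
The forms of an attribute value listed above can be told apart by the keyword that follows the attribute name. The Python sketch below parses the text after the A command character. The function name and the returned tuple shape are assumptions for illustration, and escape sequences in the value are left undecoded.

```python
def parse_attribute(args):
    """Parse 'name IMPLIED' or 'name KIND value' into (name, kind, value)."""
    name, _, rest = args.partition(" ")
    kind, _, value = rest.partition(" ")
    if kind == "IMPLIED":
        return name, kind, None
    if kind in ("CDATA", "NOTATION"):
        # CDATA may contain spaces, so the rest of the line is kept whole.
        return name, kind, value
    if kind in ("ENTITY", "TOKEN"):
        # Both forms carry a space-separated list.
        return name, kind, value.split()
    raise ValueError(f"unrecognised attribute value form: {kind!r}")
```

For instance, `parse_attribute("REFS ENTITY fig1 fig2")` yields the name `REFS`, the kind `ENTITY`, and the list `["fig1", "fig2"]`.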
-
- Dename name val
- This is the same as the A command, except that it
- specifies a data attribute for an external entity named
- ename. Any D commands will come after the E command
- that defines the entity to which they apply, but before
- any & or A commands that reference the entity.
-
- Nnname
- Define a notation nname. This command will be preceded
- by a p command if the notation was declared with a
- public identifier, and by an s command if the notation
- was declared with a system identifier. A notation will
- only be defined if it is to be referenced in an E
- command or in an A command for an attribute with a
- declared value of NOTATION.
-
- Eename typ nname
- Define an external data entity named ename with type
- typ (CDATA, NDATA or SDATA) and notation nname. This
- command will be preceded by one or more f commands
- giving the filenames generated by the entity manager
- from the system and public identifiers, by a p command
- if a public identifier was declared for the entity, and
- by an s command if a system identifier was declared for
- the entity. nname will have been defined using an N
- command. Data attributes may be specified for the
- entity using D commands. An external data entity will
- only be defined if it is to be referenced in a &
- command or in an A command for an attribute whose
- declared value is ENTITY or ENTITIES.
-
- Iename typ text
- Define an internal data entity named ename with type
- typ (CDATA or SDATA) and entity text text. An internal
- data entity will only be defined if it is referenced in
- an A command for an attribute whose declared value is
- ENTITY or ENTITIES.
-
- Sename
- Define a subdocument entity named ename. This command
- will be preceded by one or more f commands giving the
- filenames generated by the entity manager from the
- system and public identifiers, by a p command if a
- public identifier was declared for the entity, and by
- an s command if a system identifier was declared for the
- entity. A subdocument entity will only be defined if
- it is referenced in a { command or in an A command for
- an attribute whose declared value is ENTITY or
- ENTITIES.
-
- ssysid
- This command applies to the next E, S or N command and
- specifies the associated system identifier.
-
- ppubid
- This command applies to the next E, S or N command and
- specifies the associated public identifier.
-
- ffilename
- This command applies to the next E or S command and
- specifies an associated filename. There will be more
- than one f command for a single E or S command if the
- system identifier used a colon.
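
Because s, p and f commands apply to the next E, S or N command, a consumer has to hold them until that command arrives. The Python sketch below shows one way to do this. The function name and the dict layout are hypothetical, and the sketch does not depend on the relative order in which sgmls emits the s, p and f commands.

```python
def collect_definitions(lines):
    """Attach preceding s, p and f commands to the next E, S or N command."""
    sysid, pubid, files = None, None, []
    definitions = []
    for line in lines:
        cmd, arg = line[:1], line[1:]
        if cmd == "s":
            sysid = arg
        elif cmd == "p":
            pubid = arg
        elif cmd == "f":
            files.append(arg)
        elif cmd in ("E", "S", "N"):
            definitions.append({
                "command": cmd,
                "args": arg.split(" "),
                "sysid": sysid,
                "pubid": pubid,
                "files": files,
            })
            # Identifiers and filenames apply to one definition only.
            sysid, pubid, files = None, None, []
    return definitions
```

A usage example with made-up input lines (not captured from a real run):

```python
sample = [
    "p-//Example//NOTATION Encapsulated PostScript//EN",
    "NEPS",
    "spart1.eps:part2.eps",
    "fpart1.eps",
    "fpart2.eps",
    "Efig1 NDATA EPS",
]
notation, entity = collect_definitions(sample)
# entity["files"] holds both filenames, since the system identifier used a colon.
```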
-
- {ename
- The start of the SGML subdocument entity ename; ename
- will have been defined using an S command.
-
- }ename
- The end of the SGML subdocument entity ename.
-
- Llineno file
- Llineno
- Set the current line number and filename. The file
- argument will be omitted if only the line number has
- changed. This will be output only if the -l option has
- been given.
-
- #text
- An APPINFO parameter of text was specified in the SGML
- declaration. This is not strictly part of the ESIS,
- but a structure-controlled application is permitted to
- act on it. No # command will be output if APPINFO NONE
- was specified. A # command will occur at most once,
- and may be preceded only by a single L command.
-
- C This command indicates that the document was a
- conforming SGML document. If this command is output,
- it will be the last command. An SGML document is not
- conforming if it references a subdocument entity that
- is not conforming.
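
Putting the command list together, the Python sketch below splits each output line into a command character and its arguments, then turns a few of the commands into simple events. The argument counts are read off the command descriptions in this section. The names parse_line and events, the event tuples, and the sample input are all hypothetical, and escape sequences are left undecoded.

```python
# Maximum number of arguments per command character. The last argument may
# contain spaces, so splitting stops once that many arguments are found.
ARG_COUNTS = {
    "(": 1, ")": 1, "-": 1, "&": 1, "?": 1, "A": 2, "D": 3, "N": 1,
    "E": 3, "I": 3, "S": 1, "s": 1, "p": 1, "f": 1, "{": 1, "}": 1,
    "L": 2, "#": 1, "C": 0,
}


def parse_line(line):
    """Split one output line into (command character, list of arguments)."""
    cmd, rest = line[:1], line[1:]
    count = ARG_COUNTS[cmd]  # raises KeyError for an unknown command
    if count == 0:
        return cmd, []
    return cmd, rest.split(" ", count - 1)


def events(lines):
    """Yield simple events for element structure, data and conformance."""
    pending = {}  # attributes announced by A commands for the next element
    for raw in lines:
        cmd, args = parse_line(raw.rstrip("\n"))
        if cmd == "A":
            pending[args[0]] = args[1]
        elif cmd == "(":
            yield ("start", args[0], pending)
            pending = {}
        elif cmd == ")":
            yield ("end", args[0])
        elif cmd == "-":
            yield ("data", args[0])
        elif cmd == "C":
            yield ("conforming",)
```

Attributes are buffered until the `(` command because A commands precede the start of the element they belong to. Commands that this sketch does not act on, such as L or #, are parsed and then skipped.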
-
- BUGS
- Some non-SGML characters in literals are counted as two
- characters for the purposes of quantity and capacity
- calculations.
-
- SEE ALSO
- The SGML Handbook, Charles F. Goldfarb
- ISO 8879 (Standard Generalized Markup Language),
- International Organization for Standardization
-
- ORIGIN
- ARCSGML was written by Charles F. Goldfarb.
-
- Sgmls was derived from ARCSGML by James Clark
- (jjc@jclark.com), to whom bugs should be reported.
-
- (printed 7/3/94)