9. XML = SGML with Minor Changes

XML was invented to enable the delivery of SGML information over the Web. XML overcomes the limitations of SGML for Web delivery while providing all of its benefits.
XML is different from SGML in many ways, but only a few of the differences are significant from a business manager's perspective. The SGML capabilities that were dropped from XML are those that are irrelevant to the delivery of structured information over the Web. However, some of these capabilities are important, if not crucial, to the creation of structured information.
It's possible that subsequent revisions of XML will restore some or all of the omitted SGML capabilities that are crucial for information creation. In the meantime, continuing to use SGML will insulate you from changes in XML.
The following paragraphs explain the significant differences between XML and SGML and their implications.
No DTD required
In order to process SGML data, a processing application requires both the DTD and the data. In contrast, XML does not require a DTD in order to process the data.
To eliminate the requirement for a DTD, XML data contains embedded cues to the data's structure. These embedded cues represent minor changes to the SGML data format.
XML-enabled Web browsers are just one example of an XML processing application. Another XML processing application might be a banking system front-end that can receive XML-based financial transactions and convert them into deposit and withdrawal instructions. Eliminating the DTD benefits processing applications such as these in two ways: it reduces the network bandwidth used up by downloading the DTD, and it simplifies the construction and reduces the size of the applications, because they don't have to interpret a DTD.
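As a rough illustration of DTD-less processing, the banking front-end described above might be sketched in Python as follows. The message format, the element names (transactions, deposit, withdrawal), and the attribute names are all invented for this example; the only point is that the standard library parser reads the data without consulting any DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical transaction message: element and attribute names are
# assumptions made for this sketch, not part of any real banking format.
MESSAGE = """<transactions>
  <deposit account="12345" amount="250.00"/>
  <withdrawal account="12345" amount="40.00"/>
</transactions>"""


def to_instructions(xml_text):
    """Convert a transaction document into (kind, account, amount) tuples."""
    root = ET.fromstring(xml_text)  # parsed from the tags alone, no DTD involved
    instructions = []
    for element in root:
        if element.tag in ("deposit", "withdrawal"):
            instructions.append(
                (element.tag, element.get("account"), float(element.get("amount")))
            )
    return instructions


instructions = to_instructions(MESSAGE)
```

Note that nothing here guarantees that every transaction carries an account and an amount. Without a DTD, that kind of consistency has to be checked by the application itself.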
Eliminating the requirement for a DTD does not mean that it's easier to create XML applications than it is to create SGML applications, unless regular structure doesn't matter. For certain types of information, such as informal communications or one-of-a-kind document types, working without a DTD may indeed be an improvement. But for most if not all information that's currently in SGML, which is typically information with a regular structure created within a formal process, DTDs remain crucial.
In other words, to obtain all of the benefits you traditionally associate with SGML (reuse, interchange, and automation), you'll still want to use a DTD when authoring XML in order to ensure the absolute data consistency you need to achieve those benefits. And that means that SGML and XML are going to involve similar levels of effort to implement. For enterprise-critical applications, XML's contribution will be to simplify the delivery of structured text and documents over the Web.
Well-formedness
Although XML can be delivered without a DTD, XML must still be "well formed." To be well formed, a document must comply with various rules. For example, a well-formed XML document must contain at least one element, all elements must be properly nested and have balanced start and end tags, and there must be declarations for any entities used (other than the five predefined ones, such as &amp; and &lt;). This imposes a fairly simple requirement on an XML processing application that does not handle DTD-based document validation.
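A well-formedness check of this kind can be sketched with Python's standard library parser, which does not validate against a DTD and raises ParseError when a document breaks the rules. The sample documents and the entity name chapter1 are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Start and end tags balance and nest properly.
balanced = is_well_formed("<doc><p>text</p></doc>")
# The <p> element is never closed.
unbalanced = is_well_formed("<doc><p>text</doc>")
# The entity chapter1 (a hypothetical name) is used but never declared.
undeclared = is_well_formed("<doc>&chapter1;</doc>")
```

Only the first document passes; the other two are rejected without any DTD being involved.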
Exceptions
Inclusions and exclusions allow you to specify exceptions in your content model. For example, you can use exclusions to enable paragraphs to contain appendix references except when those paragraphs appear in the appendix. This is important because many processing applications may be unable to deal with unexpected constructs. For example, what does a print rendering engine do if it encounters a footnote within a paragraph within a footnote? The lack of support in XML for exceptions is one of the chief reasons that many of the existing industry interchange DTDs aren't being quickly replaced by XML.
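In SGML, the footnote case could be ruled out in the DTD itself with an exclusion, along the lines of <!ELEMENT footnote - - (para+) -(footnote)>. One possible workaround in XML is to move the check into application code. The sketch below assumes hypothetical element names (para, footnote) and is one way of doing it, not a standard mechanism.

```python
import xml.etree.ElementTree as ET


def violates_exclusion(element, excluded_tag, inside=False):
    """Return True if excluded_tag occurs anywhere inside another excluded_tag."""
    if element.tag == excluded_tag:
        if inside:
            return True  # e.g. a footnote nested at any depth inside a footnote
        inside = True
    return any(violates_exclusion(child, excluded_tag, inside) for child in element)


# Hypothetical documents; the element names are assumptions for this sketch.
NESTED = (
    "<para>text<footnote><para>note"
    "<footnote><para>nested</para></footnote>"
    "</para></footnote></para>"
)
FLAT = "<para>text<footnote><para>note</para></footnote></para>"

nested_is_bad = violates_exclusion(ET.fromstring(NESTED), "footnote")
flat_is_bad = violates_exclusion(ET.fromstring(FLAT), "footnote")
```

The drawback is that every processing application has to carry a check like this, whereas an SGML exclusion states the rule once, in the DTD.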
AND content models
There is no support for AND (&) content models in XML. That means that an XML DTD cannot let authors insert elements in any order while still requiring that all of the elements be used. For example, the lack of "AND" means that you cannot define a title page that allows a title, optional subtitle, and author(s) in any order.
The lack of an AND will have a large effect on some industry exchange DTDs, which are often loose in their enforcement of sequence while remaining strict in their enforcement of completeness. Industry-wide DTDs often leave order up to local implementers by using (A&B&C), in the expectation that local DTDs will be derived from the exchange DTD and that each of these will choose one order. Without an AND, industry-wide DTDs must either loosen their content models to ((A|B|C)+) or tighten them to one definite order (A,B,C).
AND models always have an equivalent that can be programmatically generated, but the equivalent can be too large to be practical.
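To make the size problem concrete, here is a minimal sketch of such a generator. It assumes that every element in the AND group is required and occurs exactly once, so optional members such as a subtitle are not handled, and the function name is invented for this example. The output is factored so that each alternative begins with a different element.

```python
from math import factorial


def expand_and_group(names):
    """Rewrite an AND group such as (A&B&C) using only sequence and choice."""
    if len(names) == 1:
        return names[0]
    branches = []
    for i, first in enumerate(names):
        rest = names[:i] + names[i + 1:]
        # Pick one element to come first, then allow the rest in any order.
        branches.append("(" + first + "," + expand_and_group(rest) + ")")
    return "(" + "|".join(branches) + ")"


model = expand_and_group(["A", "B", "C"])
# model == "((A,((B,C)|(C,B)))|(B,((A,C)|(C,A)))|(C,((A,B)|(B,A))))"

# The number of distinct orderings the expansion must cover grows as n!.
orderings_for_ten = factorial(10)  # 3628800
```

Three elements already produce six orderings, and a group of ten would need more than three and a half million, which is why the generated equivalent quickly stops being practical.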
SDATA internal entities
If you have small system-specific chunks of information, such as mathematical symbols or other symbols specific to your application, SGML permits you to define them with SDATA internal entities. Although these were designed to be system-specific, many SGML tools support a common set. XML does not support this capability.
