1. General questions
What is XML?
XML is the Extensible Markup Language (extensible because it is not a
fixed format like HTML). It is designed to enable the use of SGML on the
World Wide Web.
XML is not a single, predefined markup
language: it's a metalanguage -- a language for describing other
languages -- which lets you design your own markup. (A predefined
markup language like HTML defines a way to describe information in one
specific class of documents: XML lets you define your own customized
markup languages for different classes of document.) It can do this
because it's written in SGML, the international standard metalanguage
for markup.
What is XML for?
XML is designed "to make it easy and straightforward to use SGML on the
Web: easy to define document types, easy to author and manage
SGML-defined documents, and easy to transmit and share them across the
Web."
It defines "an extremely simple dialect of SGML which is completely
described in the XML Specification. The goal is to enable generic SGML to
be served, received, and processed on the Web in the way that is now
possible with HTML."
"For this reason, XML has been designed for ease of implementation, and
for interoperability with both SGML and HTML" [quotes from the XML spec].
What is SGML?
ISO standards are governed by the
International Organization for Standardization in Geneva, Switzerland,
and voted into or out of existence by representatives from every country's
national standards body.
- If you have a
query about an international standard, you should contact your national
standards body for the name of your country's representative on the
relevant ISO committee or working group.
- If you have a query about your
country's representation in Geneva or about the conduct of your
national standards body, you should contact the relevant government
department in your country, or speak to your public
representative.
The representation of countries at the ISO
is not a matter for this FAQ. Please do not submit queries to the
maintainer about how or why your ISO representatives have or have not
voted.
What is HTML?
Aren't XML, SGML, and HTML all the
same thing?
HTML is just one of these document types, the one most frequently used
on the Web. It defines a simple,
fixed type of document with markup designed for a common class of
office or technical report, with headings, paragraphs, lists,
illustrations, etc, and some provision
for hypertext and multimedia.
XML is an abbreviated version of SGML, to make it easier for you
to define your own document types, and to make it easier for programmers
to write programs to handle them. It omits the more complex and
less-used parts of SGML in return for the benefits of being easier to
write applications for, easier to understand, and more suited to delivery
and interoperability over the Web. But it is still SGML, and XML files
may still be parsed and validated the same as any other SGML file (see
the question on
XML software).
Programmers may find it useful to think of XML as being SGML-- rather
than HTML++.
What is the difference between SGML/XML and
C or C++?
C and C++ (and others like
Fortran, or Pascal, or Basic, or Java or dozens more) are
programming languages with which you specify
calculations, actions, and decisions to be carried out:
    do when @front(@date,6) is equal "01-Apr"
        print "April Fool!\n"
    else
        print @days(@datesub("25-Dec",@date)),\
              " shopping days to Christmas\n"
    done
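The fragment above is in an illustrative notation rather than a real language. As a rough sketch of the same decision in Python (the function name `seasonal_message` is invented here, and the handling of dates after 25 December is an assumption the fragment does not cover):

```python
from datetime import date


def seasonal_message(today: date) -> str:
    """Mirror the fragment: a greeting on 1 April, otherwise a countdown."""
    if (today.month, today.day) == (4, 1):
        return "April Fool!"
    christmas = date(today.year, 12, 25)
    if today > christmas:
        # Assumption: after 25 December, count towards next year's Christmas.
        christmas = date(today.year + 1, 12, 25)
    return f"{(christmas - today).days} shopping days to Christmas"


if __name__ == "__main__":
    print(seasonal_message(date.today()))
```

The point is unchanged: a programming language expresses calculations and decisions, which is a different job from describing information.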
SGML and XML are markup specification
languages with which you can design ways of
describing information, usually for storage,
transmission, or processing by a program:
<p>It was the week after <event class="festival">Christmas</event>
but <name class="person">Max</name>'s mind was still running on the
prank he had played on <name class="person">Louise</name> the previous
<name class="month">April</name>.</p>
On its own, a file of SGML or XML text (including HTML) doesn't
do anything: you have to have a program to do
something with it.
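As one hedged illustration of "a program to do something with it", the sketch below reads the sample paragraph with Python's standard `xml.etree.ElementTree` module and pulls out the marked-up names. The helper `names_by_class` is invented for this example; any XML parser in any language could do the same job.

```python
import xml.etree.ElementTree as ET
from typing import List

# The sample paragraph from the text above, unchanged.
SAMPLE = """<p>It was the week after <event class="festival">Christmas</event>
but <name class="person">Max</name>'s mind was still running on the
prank he had played on <name class="person">Louise</name> the previous
<name class="month">April</name>.</p>"""


def names_by_class(xml_text: str, wanted: str) -> List[str]:
    """Return the text of every <name> element whose class attribute matches."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("name") if el.get("class") == wanted]


if __name__ == "__main__":
    print(names_by_class(SAMPLE, "person"))  # ['Max', 'Louise']
    print(names_by_class(SAMPLE, "month"))   # ['April']
```

The markup itself only says which words are people and which are months; deciding what to do with that information is left entirely to the program.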
Who is responsible for XML?
XML is a project of the
World Wide Web Consortium (W3C),
and the development of the specification is being supervised by their
XML Working Group. A Special Interest Group of co-opted contributors and
experts from various fields contributed comments and reviews by email.
XML is a public format: it is not a proprietary development of
any company.
The v1.0 specification was
accepted by the W3C as a Recommendation on February 10, 1998.
Why is XML such an important
development?
It removes two constraints which are holding back Web
developments:
- dependence on a single, inflexible
document type (HTML);
- the complexity of full SGML,
whose syntax allows many powerful but hard-to-program options.
XML simplifies the levels of optionality in SGML, and allows the
development of user-defined document types on the Web.
How can XML make SGML simpler and still
let you define your own document types?
To make SGML simpler, XML redefines some of SGML's internal values and
parameters, and removes a large number of the more complex and sometimes
less-used features which made it harder to write processing programs
(see http://www.w3.org/TR/NOTE-sgml-xml-971215).
Why not just carry on extending HTML?
HTML is already overburdened
with dozens of interesting but often incompatible inventions from
different manufacturers, because it provides only one way of describing
your information.
XML will allow groups of people or organizations to create their
own customized markup languages for exchanging information in their
domain (music, chemistry, electronics, hill-walking, finance, surfing,
petroleum geology, linguistics, cooking, knitting, stellar cartography,
history, engineering, rabbit-keeping, mathematics, etc).
HTML is at the limit of its usefulness as a way of describing
information, and while it will continue to play an important role for
the content it currently represents, many new applications require a
more robust and flexible infrastructure.
Why do we need all this SGML stuff? Why
not just use Word or Notes?
Information on a network which connects
many different types of computer has to be usable on all of them.
Public information cannot afford to be restricted to one make or model
or manufacturer, or to cede control of its data format to private
hands. It is also helpful for such information to be in a form that
can be reused in many different ways, as this can minimize wasted time
and effort. Proprietary data formats, no matter how well documented or
publicized, are simply not an option: their control still resides in
private hands and they can be changed or withdrawn arbitrarily without
notice.
SGML is
the international standard for defining this kind of application, but
those who need an alternative based on different software for other
purposes are entirely free to implement similar services using such a
system, especially if they are for private use.
Where do I find more information about
XML?
The items listed below are the ones I have been told about: please mail
me if you come across others.
- The annual XML Conference is run by the Graphic Communications
Association. XML'99 is being held in Philadelphia on December 5-9 and, as
last year, consists of two conferences in one: the XML Conference '99 and
Markup Technologies '99.
- SGML/XML Asia/Pacific is in Sydney on October 18-21.
There are lists of books, articles, and software for XML in Robin Cover's
SGML and XML pages. That site should always be your first port of call:
please look there before using the form in this FAQ to ask about
software or documentation.
Where can I discuss implementation
and development of XML?
Please Read
The Fine Documentation which you will be sent when you join a
mailing list, as it contains important information, particularly about
what to do when your email address changes.
There is a mailing list called xml-dev for
those committed to developing components for XML. You can subscribe by
sending a 1-line mail message to majordomo@ic.ac.uk saying:
    subscribe xml-dev your@email.address
(substituting your correct email address). To unsubscribe, send a 1-line
message to the same address saying:
    unsubscribe xml-dev your@email.address
The list is hypermailed for online reference at
http://www.lists.ic.ac.uk/hypermail/xml-dev/.
Note that this list is for those people
actively involved in developing resources for XML. It is
not
for general information about XML (see this FAQ and
other sources) or for general
discussion about SGML implementation and resources (see below).
There is a general-purpose mailing list
called XML-L for public discussions: to subscribe, send a
1-line mail message to LISTSERV@listserv.heanet.ie saying:
    subscribe XML-L forename surname
(substituting your own forename and surname). To unsubscribe, send a
1-line message to the same address saying:
    unsubscribe XML-L
(Note that LISTSERV lists like XML-L don't need you to give
your email address: they read it from your email headers.) You can
access XML-L and its archives, as well as subscribe and unsubscribe
interactively, from
http://listserv.heanet.ie/xml-l.html.
Please note that there is a lot of inaccurate and misleading
information published in print and on the Web about
subscribing to mailing lists. The information given here is
correct - use it.
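For anyone who would rather script these requests, here is a minimal sketch using Python's standard `email.message.EmailMessage` class. It only builds the 1-line messages described above and does not send anything. The helper `list_command` and the placeholder sender address are assumptions made for illustration; the list addresses and command wording are the ones given in this FAQ.

```python
from email.message import EmailMessage


def list_command(server: str, sender: str, command: str) -> EmailMessage:
    """Build a 1-line command message for a mailing-list server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = server
    # No Subject header is set: the command goes in the body only.
    msg.set_content(command)
    return msg


if __name__ == "__main__":
    me = "your@email.address"  # placeholder: substitute your own address

    # xml-dev (Majordomo): the address is part of the command.
    print(list_command("majordomo@ic.ac.uk", me, f"subscribe xml-dev {me}"))

    # XML-L (LISTSERV): give your name; the address is read from the headers.
    print(list_command("LISTSERV@listserv.heanet.ie", me,
                       "subscribe XML-L Forename Surname"))
```

Handing the resulting message to a mail client or SMTP library is left out on purpose, since that depends on your own mail setup.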
There are mailing lists being set up in other languages:
-
Gianni Rubagotti writes: "A new Italian mailing
list about XML is born: to subscribe, send a mail message without a
subject line but with text saying subscribe XML-IT
to majordomo@ananas.usr.dsi.unimi.it. Send
discussion messages to:
xml-it@ananas.usr.dsi.unimi.it (only
subscribers may send messages). Everyone, Italian or not, who wants
to debate about XML in our tongue is welcome."
-
JP Theberge writes: "A French mailing list about
XML has been created. To subscribe, send
subscribe to
xml-request@trisome.com. Then post to
xml@trisome.com."
The Usenet newsgroup
comp.text.xml is for
discussions of XML. If this is not available on your local news server, ask your
Internet Provider to add it, or use a Web interface like
DejaNews.