XML

At first glance XML looks quite similar to HTML in that it is made up of text, tags and attributes. Upon closer inspection, though, the two turn out to be quite different. While HTML concerns itself only with how data should be displayed, XML lets a sense of what the data means be built into the document. For example, you might mark up an address in HTML like this:

<TABLE>
      <TR>
            <TD>484</TD><TD>St Kilda Road</TD>
      </TR>
      <TR>
            <TD>Melbourne</TD>
      </TR>
      <TR>
            <TD>VIC</TD><TD>3000</TD>
      </TR>
</TABLE>

However, in an XML document it might look like this:

<ADDRESS>
      <NUMBER>484</NUMBER>
      <STREET>St Kilda Road</STREET>
      <CITY>Melbourne</CITY>
      <STATE>VIC</STATE>
      <PCODE>3000</PCODE>
</ADDRESS>

Notice how the XML adds structure and meaning to the data. While in the HTML "St Kilda Road" is just some text in a table, the XML specifies that it is a STREET and is part of an ADDRESS.
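
To see why that matters, here is a small sketch in Python using the standard library's xml.etree.ElementTree module. Because the tags say what each value is, a program can ask for the STREET by name instead of guessing which table cell it lives in. The names ADDRESS_XML and address_fields are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The ADDRESS example from above, embedded as a string.
ADDRESS_XML = """
<ADDRESS>
      <NUMBER>484</NUMBER>
      <STREET>St Kilda Road</STREET>
      <CITY>Melbourne</CITY>
      <STATE>VIC</STATE>
      <PCODE>3000</PCODE>
</ADDRESS>
"""


def address_fields(xml_text):
    """Return a dict mapping each child tag of the root element to its text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


fields = address_fields(ADDRESS_XML)
print(fields["STREET"])  # prints: St Kilda Road
```

Doing the same with the HTML table would mean knowing that the street happens to be the second cell of the first row, which is a fact about layout rather than about the data.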

Of course there are many different structures and meanings that can be applied to the same data. For example, if we weren't really interested in the above data as an address but wanted to perform some sort of syntactic analysis on it, we might specify it in another piece of XML as

<SENTENCE>
      <NUMERAL>484</NUMERAL>
      <NOUN type="proper">
            <ABBREVIATION>St</ABBREVIATION>
            <NAME>Kilda</NAME>
      </NOUN>
      <NOUN>Road</NOUN>
      <NOUN type="proper">Melbourne</NOUN>
      <ABBREVIATION>VIC</ABBREVIATION>
      <NUMERAL>3000</NUMERAL>
</SENTENCE>

in which case we don't see it as an ADDRESS but as NUMERALS, NOUNS and ABBREVIATIONS grouped into a SENTENCE.
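
A program working with this second markup asks different questions of the same text. As a rough sketch, again in Python with xml.etree.ElementTree, this picks out the proper nouns. SENTENCE_XML and proper_nouns are names invented for the example.

```python
import xml.etree.ElementTree as ET

# The SENTENCE example from above, embedded as a string.
SENTENCE_XML = """
<SENTENCE>
      <NUMERAL>484</NUMERAL>
      <NOUN type="proper">
            <ABBREVIATION>St</ABBREVIATION>
            <NAME>Kilda</NAME>
      </NOUN>
      <NOUN>Road</NOUN>
      <NOUN type="proper">Melbourne</NOUN>
      <ABBREVIATION>VIC</ABBREVIATION>
      <NUMERAL>3000</NUMERAL>
</SENTENCE>
"""


def proper_nouns(xml_text):
    """Return the text of every top-level NOUN whose type attribute is 'proper'."""
    root = ET.fromstring(xml_text)
    nouns = []
    for noun in root.findall("NOUN[@type='proper']"):
        # A NOUN may hold nested elements, so gather all the text inside it
        # and collapse the indentation whitespace into single spaces.
        nouns.append(" ".join("".join(noun.itertext()).split()))
    return nouns


print(proper_nouns(SENTENCE_XML))  # prints: ['St Kilda', 'Melbourne']
```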

You can do this sort of thing because XML is eXtensible. Unlike HTML, which has a fixed set of tags, XML lets you create new tags to confer whatever meaning and structure you wish on your data. In fact, if you think about it, the HTML fragment shown first is also XML; its tags just happen to be designed to specify the structure for displaying arbitrary text. Actually, all HTML documents could be thought of as XML documents if it weren't for the fact that XML is a bit stricter. Specifically:

1) All XML must be well formed.

HTML is very forgiving when it comes to syntax (which has led to a lot of very sloppy HTML being produced) but XML isn't. In order to be well formed an XML document must, among other things, have a closing tag for every opening tag and close them in the right order, so that the tags nest properly. (For a full description of what constitutes a well formed XML document see http://www.ucc.ie/xml/#FAQ-WF ) The vast majority of HTML documents out there are not well formed, but if they were then they would all also be XML documents.
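
A quick way to get a feel for well-formedness is to hand some markup to an XML parser and see whether it complains. The sketch below uses Python's xml.etree.ElementTree, which raises ParseError on input that is not well formed. The helper is_well_formed and the sample strings are invented for the illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if an XML parser accepts text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The HTML table from the start of this writeup closes every tag, so it parses as XML.
TABLE_FRAGMENT = """
<TABLE>
      <TR><TD>484</TD><TD>St Kilda Road</TD></TR>
      <TR><TD>Melbourne</TD></TR>
      <TR><TD>VIC</TD><TD>3000</TD></TR>
</TABLE>
"""

print(is_well_formed(TABLE_FRAGMENT))        # True
print(is_well_formed("<P>one<BR>two</P>"))   # False: BR is never closed
print(is_well_formed("<B><I>text</B></I>"))  # False: tags closed in the wrong order
print(is_well_formed("<P>one<BR/>two</P>"))  # True: BR written as an empty element
```

The second and third strings are the kind of thing browsers will usually display without complaint, which is exactly the forgiveness XML does away with.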

2) You can specify that XML must also be valid.

If you do so then you must provide a Document Type Definition (DTD) for the XML to be validated against. A DTD specifies rules that the elements and attributes in the XML document must follow to be considered valid. For example you could specify in a DTD that an ADDRESS as used above must contain exactly one NUMBER, STREET, CITY, STATE and PCODE, in that order. A DTD can't look inside the text of an element, though. A rule such as "the contents of a NUMBER tag must consist of one or more numerals followed optionally by a letter" needs a richer schema language such as XML Schema. Under that rule a document containing
<NUMBER>27a</NUMBER>
is valid, but one containing
<NUMBER>ABC</NUMBER>
would not be.
Basically a DTD allows you to formally specify a type of XML document and hence the structure and meaning to be conferred on the data.
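
To make the NUMBER rule concrete, here is a hand-written check in Python. It is only a stand-in for the sort of test a validator performs for you, not a DTD and not a real validating parser. NUMBER_RULE and number_is_valid are names invented for this sketch.

```python
import re
import xml.etree.ElementTree as ET

# One or more numerals, followed optionally by a single letter.
NUMBER_RULE = re.compile(r"[0-9]+[A-Za-z]?")


def number_is_valid(xml_text):
    """Return True if xml_text is a NUMBER element whose text obeys NUMBER_RULE."""
    element = ET.fromstring(xml_text)
    if element.tag != "NUMBER":
        return False
    # element.text is None for an empty element, so fall back to "".
    return NUMBER_RULE.fullmatch(element.text or "") is not None


print(number_is_valid("<NUMBER>27a</NUMBER>"))  # True
print(number_is_valid("<NUMBER>ABC</NUMBER>"))  # False
```

Note that both documents are well formed; only the first also satisfies the extra rule. That is the difference between the two points above.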
