XSLT is a language for describing how one XML document is transformed into another. An XSLT stylesheet is made up of templates that match against the incoming tree and produce the result tree. An XSLT stylesheet is itself an XML document too.

XSLT is often used to produce presentational data from a pure content document, for example to create HTML (or XHTML) from a document consisting of made-up tags specific to an application.

XSLT is a standard maintained by the World Wide Web Consortium (W3C).

XSLT was formerly just a part of XSL.

XSLT is a strictly functional language, meaning that all variables are immutable (they can't be changed at runtime). This has caused me quite a few headaches, but once you use a language like Scheme then the pieces start to fall into place.

Also, XSLT can be used to emit any documents - not just XML. You can produce HTML, text files, CSV databases, and so on.

The actual transformation of the XML data is performed by an XSLT engine. The only one I've got any experience with is the Apache Group's Xalan for Java and C++.

XSLT is also a rather arcane technology.

I am trying to make up my mind whether it is a really cool idea or a really sucky one. Though I think I understand the underlying motivations, every time I try to actually use the thing I cannot help feeling that having a transformation API with appropriate language bindings (rather than a document specification that is also a programming language that requires its own processor...) would be way better.

Yesterday, as I was trying to put msxml's XSLT processor (the Microsoft implementation of the standard) back into working order, a coworker asked me:

"Do you think we'll be able to get this stuff to work before XML becomes outdated?"

A statement which, at that point, exactly reflected my feelings.

A separate point of concern lies with the CS lingo that has been generously sprinkled on what is basically a trivial tree navigation/selection: the parent axis, the attribute axis... this kind of verbiage puts the entire XSLT shebang at serious death by jargon risk. It looks fatigued already...

Basically, XSLT allows you to transform some input XML document to some other XML document. The transformation is from an input tree to an output tree. This is not about building an arbitrary output string; the XSLT processor requires that the output be well-formed XML. There are various hacky things you can do with XSLT to make it output arbitrary text, but in general the output should be XML.

UPDATE, 2004-09-28: Unknown Pedant informs me that XSLT allows arbitrary output, not just XML, no hackery required. This node is nearly 3 years old at this point, so it may be out of date. I no longer work with XML or XSLT very much, and I'm too lazy to verify this myself. However, as it stood 3 years ago, I felt getting plain text out of an XSLT engine involved some hackery. This may or may not have been true, either now or then. Check your XSLT engine documentation and w3.org for definitive information.
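For what it's worth, here is a minimal sketch of what non-XML output looks like when it relies on the xsl:output instruction from the XSLT 1.0 specification. The input element names (record, name, price) are made up for illustration, and I haven't checked how every engine behaves, so treat it as a starting point rather than the definitive way to do it.

```xml
<?xml version="1.0"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

<!-- method="text" asks the processor to write out only the text of the
     result tree, with no tags around it -->
<xsl:output method="text"/>

<!-- drop whitespace-only text nodes so the indentation of the input
     does not leak into the output -->
<xsl:strip-space elements="*"/>

<!-- "record", "name" and "price" are hypothetical element names -->
<xsl:template match="record">
  <xsl:value-of select="name"/>
  <xsl:text>,</xsl:text>
  <xsl:value-of select="price"/>
  <!-- character reference for a line feed, ending the CSV row -->
  <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

Each record element in the input should come out as one comma-separated line.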

XSLT is particular about how an output tree is built. If a tag that is going in the output tree is opened, then it must be closed before the end of the template. If you want to build arbitrary tags (make output tags from the text in an input element), then you have to use xsl:element, rather than xsl:value-of. The same goes for attributes; if you want to add an attribute to an element on the output tree, use xsl:attribute, rather than trying to concatenate strings together to put the attribute inside the tag (that won't work). Examples of all of this will appear in the examples section.
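As a rough sketch of the two instructions just mentioned, the template below builds an output element whose name comes from the input, and then hangs an attribute on it. The input markup (a field element with name and type attributes) and the output attribute name kind are assumptions made up for this example. The value used as the element name has to be a legal XML name, or the processor will complain.

```xml
<!-- "field" and its "name" and "type" attributes are hypothetical input markup -->
<xsl:template match="field">
  <!-- the output element is named after the text in the name attribute -->
  <xsl:element name="{@name}">
    <!-- xsl:attribute has to come before any child content of the element -->
    <xsl:attribute name="kind">
      <xsl:value-of select="@type"/>
    </xsl:attribute>
    <!-- the text content of the input element becomes the content here -->
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>
```

The template is meant to be dropped into a stylesheet like the complete ones shown elsewhere in this writeup.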

XPath's relationship to XSLT

XPath expressions are used in XSLT to select nodes for processing, to specify conditions used in processing, and to generate text that is inserted in the output tree. XPath expressions appear in XSLT as the value of certain attributes (the match attribute of the xsl:template element, or the select attribute of many other elements) and in attribute value templates.

In XPath, a reference is often made to the context node. This context node is specified by the XSLT processor; it is the node currently being processed.
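A small sketch may make the context node less abstract. The element and attribute names here (book, title, isbn) are invented for the example; the point is only that relative XPath expressions inside a template are evaluated starting from the node the template matched.

```xml
<!-- "book", "title" and "isbn" are hypothetical names -->
<xsl:template match="book">
  <!-- inside this template the context node is the matched book element,
       so "title" means "the title children of this book" -->
  <xsl:value-of select="title"/>
  <xsl:text> (</xsl:text>
  <!-- "@isbn" is the isbn attribute of the same book element -->
  <xsl:value-of select="@isbn"/>
  <xsl:text>)</xsl:text>
</xsl:template>
```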

Vocabulary and syntax

The main problem that people have in understanding XSLT is all the weird vocabulary that you run across when reading about it. Most of this vocabulary actually comes from XPath, which is the language used in XSLT to find parts of the source XML document. See XPath vocabulary and syntax for most of the relevant information.

XSLT uses normal XPath and XML syntax. An XSLT stylesheet is an XML document, so all of the normal XML rules apply. All attribute values must be in quotes, all opened tags must be closed, element and attribute names are case sensitive, and so on.

One peculiarity in XSLT that I haven't seen elsewhere with XML is curly braces. It deserves some explanation because it's not obvious what they are used for if you just happen across them in a stylesheet somewhere.

Curly braces ({}) mark an attribute value template; the XPath expression that they enclose is evaluated and its result is used in the value of an attribute. Curly braces only have this meaning inside attribute values: those of output (literal result) elements, plus a few attributes of XSLT instructions such as the name attribute of xsl:element. According to the XSLT 1.0 specification, literal text may be mixed with the braces, so something like <a href="document?input={@value}"> is legal. I couldn't get that form to work with Xalan, though; if your processor also balks at it, use xsl:attribute and xsl:value-of to get the desired effect.

Here's an example of how to use curly braces:

<xsl:template match="input">
  <output-element output-attr="{xp-expr}"/>
</xsl:template>

<xsl:template match="input">
  <xsl:text>Arbitrary text: </xsl:text>{xp-expr}
</xsl:template>

<xsl:template match="input">
  <xsl:text>Arbitrary text: </xsl:text> <xsl:value-of select="xp-expr"/>
</xsl:template>

The first template in this example shows the use of an attribute value template. The curly braces inside the value of output-attr tell XSLT to interpret the enclosed text as an XPath expression. The result of evaluating the XPath expression becomes the value of the attribute.

The second template shows a mistaken use of an attribute value template. Since the curly braces aren't in an attribute value, they won't be interpreted; the braces and the expression are copied to the output as literal text. The proper way to do this sort of thing is by using the xsl:value-of element, as shown in the third template (note the lack of curly braces).
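Here is a hedged sketch of an attribute value that needs literal text around the expression, written two ways. The input markup (a link element with a value attribute) is made up for the example. As far as I can tell from the XSLT 1.0 specification, both forms should produce the same href on a conforming processor; if yours rejects the first, the second is the fallback.

```xml
<!-- "link" and its "value" attribute are hypothetical input markup -->
<xsl:template match="link">
  <!-- literal text and an expression mixed in one attribute value template -->
  <a href="document?input={@value}">first form</a>

  <!-- the same attribute built with xsl:attribute instead -->
  <a>
    <xsl:attribute name="href">
      <xsl:text>document?input=</xsl:text>
      <xsl:value-of select="@value"/>
    </xsl:attribute>
    <xsl:text>second form</xsl:text>
  </a>
</xsl:template>
```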

Whitespace

XSLT can handle whitespace in an input document in a couple of ways. The XSLT instructions for doing this are xsl:strip-space and xsl:preserve-space. Both of these instructions have an elements attribute that lists the names of the elements to operate on.

In the case of xsl:strip-space, any text node that contains only whitespace and is a child of one of the named elements is removed from the input tree after it has been loaded but before it has been processed. xsl:preserve-space tells the processor to leave such text nodes in the input tree; since that is the default, it is mostly useful for exempting particular elements from a broader xsl:strip-space.

An example of using xsl:strip-space:

<?xml version="1.0"?>
<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    version="1.0">

<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

<xsl:strip-space elements="*"/>

<xsl:template match="/">
  <!-- Do something here... -->
</xsl:template>

</xsl:stylesheet>

In this example, the * matches all elements, so all text nodes containing only whitespace are removed from the input tree before it is processed.
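The two instructions can also be combined. The sketch below strips whitespace-only text nodes everywhere except inside a couple of elements where the spacing matters; the element names pre and code are assumptions picked for illustration. Both instructions are top-level, so they go directly inside xsl:stylesheet.

```xml
<!-- strip whitespace-only text nodes under every element... -->
<xsl:strip-space elements="*"/>

<!-- ...except under these elements ("pre" and "code" are hypothetical names);
     a specific name takes precedence over the * wildcard -->
<xsl:preserve-space elements="pre code"/>
```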

Templates

XSLT templates are basically skeletons of what is supposed to go into the output document, with the gaps filled in by information from the source document. Templates usually match some element in the source document, so that when that element is found, it is processed by the template and the results are placed in the output tree.

Templates can also be thought of as being almost like functions. Templates can be called using the xsl:call-template instruction. Templates can also be passed arbitrary parameters if they declare those parameters using xsl:param and the calling template uses the xsl:with-param instruction within the xsl:call-template instruction.

Templates are more commonly "applied", using the xsl:apply-templates instruction. If a template matches an element, none of that element's descendants are processed unless the template that matched that element contains the xsl:apply-templates instruction.
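A short sketch of templates being applied rather than called, assuming a made-up input vocabulary of chapter, title and para elements. The chapter template decides where its children's output goes by where it puts xsl:apply-templates; take that instruction out and the title and para templates would never fire for that chapter.

```xml
<!-- "chapter", "title" and "para" are hypothetical element names -->
<xsl:template match="chapter">
  <div>
    <!-- without this instruction the children of chapter would be skipped -->
    <xsl:apply-templates/>
  </div>
</xsl:template>

<xsl:template match="title">
  <h2><xsl:value-of select="."/></h2>
</xsl:template>

<xsl:template match="para">
  <!-- apply-templates again, so any markup inside the paragraph
       gets its own chance to be matched -->
  <p><xsl:apply-templates/></p>
</xsl:template>
```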

Variables

XSLT is not an imperative programming language. It's more of a declarative language, like SQL (not like T-SQL). This means that the programmer describes the result required; the implementation takes care of the details of producing that result.

XSLT does have variables like a normal language, but they aren't really variables, not the way an imperative programmer likes to think of variables anyway. XSLT variables can only be set once each time they are processed. If a variable is defined inside a template, it may take on a different value each time the template is processed. An example will help to illustrate this.

Take this document:

<root>
  <countme at1="hi" at2="j" at3="hi"/>
  <countme at1="hi" at2="j" at3="hi" at4="j"/>
  <countme at1="hi" at2="j" at3="hi" at4="j" at5="hi" at6="j"/>
  <countme at1="hi" at2="j" at3="hi" at4="j"/>
</root>

This stylesheet:

<?xml version="1.0"?>
<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:fo="http://www.w3.org/1999/XSL/Format" 
    version="1.0">

<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

<xsl:template match="countme">

  <xsl:variable name="pos" select="position()"/>
  <xsl:variable name="numattr" select="count(attribute::node())"/>
  <xsl:text>
Number of attributes in countme #</xsl:text><xsl:value-of select="$pos"/>
  <xsl:text>: </xsl:text><xsl:value-of select="$numattr"/><xsl:text>
</xsl:text>

</xsl:template>

</xsl:stylesheet>

This output is produced:


  
Number of attributes in countme #2: 3

  
Number of attributes in countme #4: 4

  
Number of attributes in countme #6: 6

  
Number of attributes in countme #8: 4


The output starts with countme #2 because of the whitespace in the input document. Between the end of the <root> tag and the start of the first <countme> tag, there is some whitespace. The stylesheet processor represents this whitespace as text nodes along the child axis, so the child axis of the <root> element contains both text and element nodes. The position function returns the position of the context node within the list of nodes currently being processed, and since no template here matches <root>, that list is every child of <root>, text nodes included. The whitespace between <root> and the first <countme> is a text node in position number one, so the first <countme> node is actually in position number two. It is possible to request that the XSLT processor remove (strip) text nodes containing only whitespace from the input tree before processing by using the xsl:strip-space instruction.
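Two possible ways around the odd numbering, sketched against the countme document above. The first strips the whitespace-only text nodes under root, so the four countme elements should be numbered 1 through 4. The second leaves the whitespace alone and counts only the countme siblings that come before the current one. I'm assuming either would be dropped into the stylesheet shown above, not used together.

```xml
<!-- Option 1: a top-level instruction, placed as a child of xsl:stylesheet,
     that removes the whitespace-only text nodes under root -->
<xsl:strip-space elements="root"/>

<!-- Option 2: inside the countme template, replace the pos variable with a
     count of the preceding countme siblings, plus one for the current node -->
<xsl:variable name="pos" select="count(preceding-sibling::countme) + 1"/>
```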

Anyway, you can see that the output is different for each countme element. The values stored in the variables are clearly changing; they can do this because once the first countme template is finished, the variables disappear (they go out of scope), and they can be set anew the next time the template is processed. However, if I tried to use a second xsl:variable element to set the numattr variable to a different value inside the same template, then the stylesheet processor would give me an error, because a variable can only be set once inside a scope.

Or at least, that's the way it's supposed to work, but I've discovered that Xalan, the XSLT processor I've been using, actually allows changing the value of a variable. Testing shows that MSXML3, unlike Xalan, enforces the immutability that the specification defines for XSLT variables.
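Since a variable can't be reassigned, the usual workaround when the value depends on a condition is to put the decision inside the variable's definition. The sketch below assumes the countme document from above; the variable name size and the labels large and small are made up for the example.

```xml
<xsl:template match="countme">
  <xsl:variable name="numattr" select="count(@*)"/>

  <!-- the variable is bound exactly once; the choice between the two
       values happens inside the binding instead of by reassignment -->
  <xsl:variable name="size">
    <xsl:choose>
      <xsl:when test="$numattr &gt; 4">large</xsl:when>
      <xsl:otherwise>small</xsl:otherwise>
    </xsl:choose>
  </xsl:variable>

  <xsl:value-of select="$size"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>
```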

Parameters

XSLT's xsl:template directive can be thought of as a function, like those in an imperative programming language. XSLT templates can be called directly from other templates using xsl:call-template directives. This is a very powerful tool in XSLT; in particular it can be used to perform recursion, which can be used to create loops that execute a given number of times (XSLT doesn't support this kind of loop any other way).

Here is an example that illustrates parameters and the use of recursion to create a loop.

The XML document:

<root>
  <countme at1="hi" at2="j" at3="hi"/>
  <countme at1="hi" at2="j" at3="hi" at4="j"/>
  <countme at1="hi" at2="j" at3="hi" at4="j" at5="hi" at6="j"/>
  <countme at1="hi" at2="j" at3="hi" at4="j"/>
</root>

The stylesheet:

<?xml version="1.0"?>
<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:fo="http://www.w3.org/1999/XSL/Format"     
    version="1.0">

<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

<xsl:template name="test">
  <xsl:param name="input" select="'this is a string, not an expression.'"/>

  <xsl:text>"Test" called, the input parameter is: </xsl:text>
  <xsl:value-of select="$input"/><xsl:text>
</xsl:text>

  <xsl:if test="$input = true()">
    <xsl:call-template name="test">
      <xsl:with-param name="input" select="$input - 1"/>
    </xsl:call-template>
  </xsl:if>

</xsl:template>

<xsl:template match="countme">

  <xsl:variable name="pos" select="position()"/>
  <xsl:variable name="numattr" select="count(attribute::node())"/>

<!--  <xsl:variable name="pos" select="$pos div 2"/> -->

  <xsl:text>
Number of attributes in countme #</xsl:text><xsl:value-of select="$pos"/>
  <xsl:text>: </xsl:text><xsl:value-of select="$numattr"/><xsl:text>
</xsl:text>

  <xsl:call-template name="test">
    <xsl:with-param name="input" select="'2'"/>
  </xsl:call-template>

</xsl:template>

</xsl:stylesheet>

The output:


  
Number of attributes in countme #2: 3
"Test" called, the input parameter is: 2
"Test" called, the input parameter is: 1
"Test" called, the input parameter is: 0

  
Number of attributes in countme #4: 4
"Test" called, the input parameter is: 2
"Test" called, the input parameter is: 1
"Test" called, the input parameter is: 0

  
Number of attributes in countme #6: 6
"Test" called, the input parameter is: 2
"Test" called, the input parameter is: 1
"Test" called, the input parameter is: 0

  
Number of attributes in countme #8: 4
"Test" called, the input parameter is: 2
"Test" called, the input parameter is: 1
"Test" called, the input parameter is: 0


In this case, the test template is called first by the countme template, and then by itself until the input parameter becomes false (zero). An input of zero is the base case of the recursion. If a recursive function (template for XSLT) doesn't have a base case, it will likely loop forever, or at least until the OS intervenes and shuts it down for using too much stack space.

Loops

With XSLT, loops are used to iterate over a set of nodes. You will need to use loops in situations when you want to change where output appears. For example, if you have an XHTML document with H1 tags throughout it, and you just want to add a table of contents to the top of the document, you can do this by using an xsl:template to match the body tag, and then using an xsl:for-each to select each of the H1 tags in the document and output links to those tags.

Unlike xsl:template matches, which only use a subset of XPath to match patterns, you can use all of XPath to select elements when you use xsl:for-each, which is the XSLT looping construct. For example, you cannot use the ancestor axis as a step in the match pattern of an xsl:template (only inside a predicate), but ancestor works fine when you use it with xsl:for-each.
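The table of contents idea could be sketched roughly like this. It assumes lowercase body and h1 elements that are in no namespace (real XHTML puts them in a namespace, which would need a prefix in the patterns), and it uses the standard generate-id() function to invent anchor names. The last template is the usual identity rule, added here so that the rest of the document is copied through.

```xml
<xsl:template match="body">
  <body>
    <ul>
      <!-- one list item per h1 anywhere below body -->
      <xsl:for-each select=".//h1">
        <li>
          <a href="#{generate-id()}"><xsl:value-of select="."/></a>
        </li>
      </xsl:for-each>
    </ul>
    <!-- then process the original content of body as usual -->
    <xsl:apply-templates/>
  </body>
</xsl:template>

<!-- give each heading the id that its table of contents link points at;
     generate-id() returns the same string for the same node within one run -->
<xsl:template match="h1">
  <h1 id="{generate-id()}"><xsl:value-of select="."/></h1>
</xsl:template>

<!-- identity rule: copy everything else through unchanged -->
<xsl:template match="@*|node()">
  <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
</xsl:template>
```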

Conditions

XSLT provides a couple of mechanisms for handling conditional execution. One of these is xsl:if, and the other is xsl:choose. The xsl:if directive has no else clause, so if you need to test for more than one possible value, xsl:choose is almost certainly a better choice.

The xsl:if element has one attribute, test, whose value must evaluate to a boolean result. If it is true, the fragment enclosed by the xsl:if block is executed.

The xsl:choose element contains one or more xsl:when elements, each of which has a test attribute evaluating to a boolean value, and an optional xsl:otherwise element, which is used if none of the xsl:when clauses are matched.
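A quick sketch of both constructs side by side. The input markup (an item element with name and stock attributes) and the output wording are invented for the example.

```xml
<!-- "item" and its "name" and "stock" attributes are hypothetical input markup -->
<xsl:template match="item">
  <xsl:value-of select="@name"/>

  <!-- xsl:if: a single test with no else branch -->
  <xsl:if test="@stock = 0">
    <xsl:text> (sold out)</xsl:text>
  </xsl:if>

  <!-- xsl:choose: the first xsl:when whose test is true wins,
       and xsl:otherwise is the fallback -->
  <xsl:choose>
    <xsl:when test="@stock &gt; 100"> plenty</xsl:when>
    <xsl:when test="@stock &gt; 0"> some</xsl:when>
    <xsl:otherwise> none</xsl:otherwise>
  </xsl:choose>

  <xsl:text>&#10;</xsl:text>
</xsl:template>
```

Note that the greater-than sign is written as &gt; here; a bare less-than sign would not be allowed inside an attribute value at all, since the stylesheet is itself XML.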


There are many handy XSLT resources available on the web. A couple of good starting points are http://xml.com and http://www.w3c.org/Style/XSL/.

The full XSLT specification is available at http://www.w3.org/TR/xslt.
