CSE134A LECTURE NOTES

November 26, 2001
 
 

ANNOUNCEMENTS

This week is the last week of classes.  Next week we'll have office hours as usual, but no review session.

See Discus for clarifications about the current project.  Also see the online lecture notes and section notes for useful links.

For good short XML tutorials see here, including this tutorial on XSL.
 
 

ABOUT XSL

These notes are based on Chapter 8 of XML in a Nutshell from O'Reilly. The acronym XSL is short for Extensible Stylesheet Language, which is a language for writing scripts that transform XML documents.

XSL has two parts: XSLT, which stands for XSL Transformations, and XSL-FO, which stands for XSL Formatting Objects.  We'll only consider XSLT, which has many implementations, unlike XSL-FO.

To use XSL, you need an XSL engine.  Internet Explorer 6 includes one for XSLT version 1.0, and IE 5 and 5.5 partially implement an old draft of XSLT.  There are several XSL engines in Java and other languages that you can download and install in your ieng9 account.  PHP version 4.0.6 can include an XSL engine known as Sablotron, but the version installed by ACS on ieng9 does not have an XSL engine, unfortunately.

A FIRST XSL EXAMPLE

To say what stylesheet to use, in your XML document write
<?xml version="1.0"?>
<?xml-stylesheet type = "text/xml" href = "http://ieng9.ucsd.edu/style.xsl"?>
...
The stylesheet, that is the XSLT script, looks like
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
            xmlns:xsl = "http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="person">A person.</xsl:template>
</xsl:stylesheet>
The result of applying this stylesheet to a document with two <person> elements is
<?xml version="1.0" encoding="utf-8"?>

A person.

A person.

The white space and new lines surrounding the text were carried over from the original XML document.  Note that the output is not a well-formed XML document, despite the automatically generated XML declaration, because it has no root element.
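
One way to check the claim about well-formedness is to hand the output to an XML parser.  The following is a minimal Python sketch using the standard xml.etree.ElementTree module; the helper name is_well_formed is made up for illustration and is not part of any XSLT engine.

```python
import xml.etree.ElementTree as ET

# The XML declaration followed by the text that the template emits twice.
output = b'<?xml version="1.0" encoding="utf-8"?>\n\nA person.\n\nA person.\n'

def is_well_formed(data: bytes) -> bool:
    """Return True if data parses as a well-formed XML document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# Text with no enclosing root element is rejected by the parser.
print(is_well_formed(output))
# The same text wrapped in a (hypothetical) root element is accepted.
print(is_well_formed(b"<people>A person. A person.</people>"))
```

The parser accepts the text only once it sits inside a single root element, which is why a stylesheet normally needs a template that generates one.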
 
 

XSL TEMPLATES

The XSLT engine processes the input document in pre-order, from top to bottom.  Pre-order means that the children of an element are processed after the element itself.

For each element that is processed in the pre-order, if a matching template exists, it is applied.
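
The pre-order traversal can be sketched in a few lines of Python.  This only illustrates the visiting order, not how an XSLT engine is implemented; the input document and its root element name people are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document; the root element name is an assumption.
doc = ET.fromstring(
    "<people>"
    "<person><name>Alan Turing</name></person>"
    "<person><name>Richard Feynman</name></person>"
    "</people>"
)

def preorder(element):
    """Yield the element first, then each child's subtree, left to right."""
    yield element
    for child in element:
        yield from preorder(child)

order = [e.tag for e in preorder(doc)]
print(order)
```

Each person is visited before its own name child, and the first person's subtree is finished before the second person is reached.  ElementTree's own doc.iter() visits elements in the same order.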

The element <xsl:template match="person">A person.</xsl:template> is an example of a template.  It says to output the given text whenever an element with the name specified by  match="person" is encountered.

Another example is

<xsl:template match="person">
    <p> Name: <xsl:value-of select = "name"/> </p>
</xsl:template>
Note that stylesheets must be well-formed XML documents themselves, so the <p> tag must be followed by a closing </p> tag.

The value of an element, which is what xsl:value-of outputs, is its text content after removing all tags inside the element.
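
To make this concrete, here is a small Python sketch that computes the same kind of value for an element with nested tags.  The function name string_value and the first_name and last_name child elements are assumptions for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical person element whose name contains nested markup.
person = ET.fromstring(
    "<person><name><first_name>Alan</first_name> "
    "<last_name>Turing</last_name></name></person>"
)

def string_value(element):
    """Concatenate all text inside the element, ignoring the tags themselves."""
    return "".join(element.itertext())

name = person.find("name")   # roughly what select="name" picks out
value = string_value(name)
print(value)
```

The nested tags disappear and only the text, including the space between the two child elements, remains.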
 
 

CHANGING WHEN TEMPLATES ARE APPLIED

This example says to process the name child element as soon as the person element is encountered:
<xsl:template match="person">
    <xsl:apply-templates select="name"/>
</xsl:template>
You need separate templates to say how to process name elements.  Note that this template for person implicitly says not to process in any way any other child elements.
 
 

GENERATING HTML

This stylesheet generates the HTML enclosing a page when it encounters the root element of the XML document:
<?xml version="1.0" ?>
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
    <xsl:template match="/">
       <html>
          <body>
             <xsl:for-each select="//person">
                  <p>
                  <xsl:value-of select="name" />
                  <xsl:apply-templates select="@born" />
                  </p>
             </xsl:for-each>
          </body>
       </html>
    </xsl:template>
 </xsl:stylesheet>
Note that the namespace given, http://www.w3.org/TR/WD-xsl, is for the old version of XSLT implemented by IE 5 and 5.5.  XSLT 1.0 stylesheets use the namespace http://www.w3.org/1999/XSL/Transform instead.

When the root element of the output is <html>, the XSLT engine automatically switches to HTML output: it omits the XML declaration and uses HTML syntax for empty elements like <br> instead of XML syntax like <br/>.
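
The difference between the two output syntaxes can be seen with Python's ElementTree serializer, which also has separate xml and html methods.  This is only an analogy, since ElementTree is not an XSLT engine, and the page content is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical result tree containing an empty <br> element.
page = ET.fromstring(
    "<html><body><p>Alan Turing<br/>born 1912</p></body></html>"
)

# Serialize the same tree with XML syntax and with HTML syntax.
as_xml = ET.tostring(page, encoding="unicode", method="xml")
as_html = ET.tostring(page, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

The XML form writes the empty element as a self-closing tag, while the HTML form writes a bare <br> with no closing tag.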
 
 

XPATH

XPath is a whole language for specifying elements and attributes inside an XML document.  It is used inside select and match attributes of XSLT elements.  For example select="@born" means the attribute named born of the element currently being processed.  "/" means the root node of the document, which is the parent of the root element, and person/name means a name element that is a child of a person element that is a child of the current element.
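
Python's ElementTree supports a limited subset of XPath, which is enough to try out paths like these.  The sketch below assumes a hypothetical people document; note that ElementTree cannot select an attribute with @born in a path, so the attribute is read from the element directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names follow these notes.
people = ET.fromstring(
    "<people>"
    '<person born="1912"><name>Alan Turing</name></person>'
    '<person born="1918"><name>Richard Feynman</name></person>'
    "</people>"
)

# person/name: name children of person children of the current element.
names = [n.text for n in people.findall("person/name")]

# @born: ElementTree has no attribute step in paths, so read the
# attribute from each person element instead.
born = [p.get("born") for p in people.findall("person")]

# [@born='1918'] is an attribute predicate that ElementTree does support.
feynman = people.find("person[@born='1918']/name").text

print(names, born, feynman)
```

Here the current element is people, so person/name reaches both name elements.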
 
 

DEFAULT TEMPLATES

Every XSLT script has some built-in template rules.  The most important one copies the value of each text node and attribute node:
<xsl:template match="text()|@*">
     <xsl:value-of select="." />
</xsl:template>
However, by default attribute nodes are not reached, so by default only the text inside elements is output.

The default template for elements and for the root node is

<xsl:template match="*|/">
     <xsl:apply-templates/>
</xsl:template>
A more specific template always overrides a more general template.  So if you provide a template for certain elements, their children are not necessarily processed.
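
The interaction between the built-in rules and your own templates can be modeled with a small Python sketch.  It is a simplified illustration, not how a real XSLT engine works: the function names and the profession element are assumptions, templates are looked up by element name only, and attributes are ignored.

```python
import xml.etree.ElementTree as ET

def apply_templates(element, templates, out):
    """Use the element's own template if one exists, else the default rule."""
    rule = templates.get(element.tag)
    if rule is not None:
        rule(element, templates, out)      # explicit template overrides default
    else:
        default_rule(element, templates, out)

def default_rule(element, templates, out):
    """Model of the built-in rules: copy text and recurse into child elements."""
    if element.text:
        out.append(element.text)
    for child in element:
        apply_templates(child, templates, out)
        if child.tail:
            out.append(child.tail)

def person_template(element, templates, out):
    """Like <xsl:apply-templates select="name"/>: only name children are processed."""
    for name in element.findall("name"):
        apply_templates(name, templates, out)

# Hypothetical input document.
doc = ET.fromstring(
    "<people><person><name>Alan Turing</name>"
    "<profession>mathematician</profession></person></people>"
)

with_defaults = []
apply_templates(doc, {}, with_defaults)

with_person = []
apply_templates(doc, {"person": person_template}, with_person)

print(with_defaults)
print(with_person)
```

With no templates at all, the default rules output all the text in the document.  Once a person template is supplied, the profession element is never reached, because that template only asks for name children.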
 
 



Copyright (c) by Charles Elkan, 2001.