Basics of XSLT
R. Alexander Milowski
School of Information Management and Systems
milowski at sims.berkeley.edu
#1
Motivation
What's the most common thing you want to do with XML once you've got it?
If your document has this wealth of information, you might want to... ?
What happens when things change?
#2
Motivation - Producing HTML or XHTML
HTML is one of the built-in output methods specified by the XSLT recommendation.
Producing HTML is really just an XML-to-XML transformation.
The serializer of the XSLT processor takes care of the HTML specifics.
#3
Motivation - Extracting XML
XSLT allows you to extract information from your documents.
You can also transform this information into other kinds of XML.
It is a kind of "transformational semantic".
#4
Motivation - XML to XML
You can translate your XML to other XML vocabularies.
This is useful for "upgrading" or "downgrading" your XML between versions.
You can do more than just transliteration of your XML.
#5
Motivation - non-XML Output
XSLT's architecture allows non-XML output.
"text" (no markup) is built in.
You can also define your own output methods, but you have to write code.
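As a minimal sketch of the built-in "text" output, the stylesheet below switches the serializer to text with xsl:output. The 'person' and 'name' element names are assumptions made up for illustration; they do not come from any particular vocabulary.

```xml
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Select the built-in "text" output method: the result is serialized without markup -->
  <xsl:output method="text"/>

  <!-- 'person' and 'name' are hypothetical element names -->
  <xsl:template match="person">
    <xsl:value-of select="name"/>
    <!-- &#10; is a line feed, so each name ends its own line -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:transform>
```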
#6
Motivation - Extensible
XSLT's architecture is extensible.
You can add new processing semantics to XPath and XSLT.
The syntax is the same but you can define your own semantics (within reason).
...but you have to write code (e.g. Java, C++, etc.).
#7
History of XSLT
XSLT is probably one of the most successful W3C recommendations.
It was published in November 1999.
The W3C lists the number of processors as "XSLT: too many to list here."
Both IE and Netscape have some kind of XSLT support in the browser.
#8
XSLT is not XSL
Just to confuse you...
XSLT: XSL Transformations
XSL: eXtensible Stylesheet Language
XSL is for formatting XML documents for print or browser display.
XSLT is for transforming XML documents for whatever purpose.
#9
The XSLT Model
XSLT transforms infosets to infosets using rules:
Figure 1. Figure
Rules are packaged in a stylesheet and consist of patterns and actions.
#10
Getting Started
A transformation is specified by a "stylesheet" or "transform" document:
<xsl:stylesheet version='1.0' xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <!-- your rules here --> </xsl:stylesheet>
or
<xsl:transform version='1.0' xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <!-- your rules here --> </xsl:transform>
The choice of 'stylesheet' or 'transform' has no effect on the outcome according to the XSLT recommendation.
Applications may interpret 'stylesheet' differently from 'transform' in terms of:
whether they run the transformation
what they do with the results.
#11
The Top Level
The "Top Level" refers to the children of the document element.
Any element can occur at the top level, but it must be in a namespace:
<xsl:transform version='1.0'
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               xmlns:my="http://cde.berkeley.edu/my/other/stuff">
  <!-- your rules here -->
  <my:other-stuff type="random-crap"/>
</xsl:transform>
But this is illegal:
<xsl:stylesheet version='1.0' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- your rules here -->
  <mr-no-namespace-name/>
</xsl:stylesheet>
Typically, you'll use elements from the XSLT namespace.
#12
A Simple Example
This stylesheet generates an XHTML document with the "text" of the document:
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'>
  <xsl:template match="/">
    <html xmlns='http://www.w3.org/1999/xhtml'>
      <head><title>Your Document's Value</title></head>
      <body>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
</xsl:transform>
Here's the stylesheet: text-only.xsl
Here's the result: simple.xml (input) (output), mouse-annotation.xml: (input) (output) , namespace.xml (input) (output)
#13
Templates, Actions, and Patterns
A transformation consists of a set of templates that specify the rules.
Templates contain a set of "actions".
Actions create content in the result and/or traverse the document.
Templates are associated with nodes via patterns, which use a subset of XPath.
#14
The Template Model
Templates are "pictures" of the result.
They are associated with the input document by matching patterns (i.e. the 'match' attribute).
Elements in the XSLT namespace are replaced by their actions.
Literal elements are copied to the output.
Example:
<xsl:template match="para"> <p><xsl:apply-templates/></p> </xsl:template>
#15
xsl:apply-templates
xsl:apply-templates matches templates to selected nodes.
By default, the children of the current node are selected:
<xsl:apply-templates/>
You can specify a different XPath by the 'select' attribute:
<xsl:apply-templates select="section/title"/>
Here we select every 'title' child element of each 'section' child of the current node.
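A small sketch of how the 'select' attribute narrows processing. The 'chapter' element and the generated list markup are assumptions chosen for illustration; only 'section/title' comes from the slide.

```xml
<!-- Hypothetical input: a <chapter> whose <section> children each have a <title> -->
<xsl:template match="chapter">
  <ul>
    <!-- Only the section/title nodes are processed; other children are skipped -->
    <xsl:apply-templates select="section/title"/>
  </ul>
</xsl:template>

<xsl:template match="section/title">
  <!-- No 'select' here, so the children of the title are processed -->
  <li><xsl:apply-templates/></li>
</xsl:template>
```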
#16
Patterns
The 'match' attribute of a template requires a subset of XPath.
You can only match along the child or attribute axes.
Predicates are not restricted.
You can have multiple patterns separated by '|' to match several patterns.
Example:
contents/p
slide[@show='true']
@href
a/@href
graphic[ancestor::section]
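The '|' separator can be sketched as follows. The 'note' and 'warning' names are hypothetical; the point is only that one template can serve several patterns.

```xml
<!-- One rule for three alternative patterns; 'note' and 'warning' are hypothetical names -->
<xsl:template match="note | warning | contents/p">
  <p><xsl:apply-templates/></p>
</xsl:template>
```

Each alternative is treated as if it were its own rule when default priorities are computed, so 'contents/p' and 'note' can rank differently against other templates.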
#17
Template Results
Figure 2. The results of applying templates to an instance.
#18
Modes
Modes scope templates by name.
Modes let you have templates with the same match pattern with different actions.
The mode has a name (a QName value).
Modes aren't sticky, so you have to specify the mode on every xsl:apply-templates that should use it.
#19
Associating a Template with Modes
You just add the 'mode' attribute:
Example:
<xsl:template match="doc/title" mode="head"> <title><xsl:apply-templates/></title> </xsl:template>
The name value is a QName.
#20
Using a Mode
Modes can only be used on xsl:apply-templates.
Example:
<xsl:template match="doc">
  <html>
    <head>
      <!-- only the title is processed in "head" mode -->
      <xsl:apply-templates select="title" mode="head"/>
    </head>
    <body>
      <xsl:apply-templates/>
    </body>
  </html>
</xsl:template>
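Because modes are not sticky, a template that runs in a mode has to name the mode again when it continues downward. The sketch below assumes a hypothetical inline element 'emph' inside the title.

```xml
<xsl:template match="doc/title" mode="head">
  <!-- Repeat the mode; without it the children would be processed by default-mode templates -->
  <title><xsl:apply-templates mode="head"/></title>
</xsl:template>

<!-- 'emph' is a hypothetical element: in "head" mode keep its text but drop its markup -->
<xsl:template match="emph" mode="head">
  <xsl:apply-templates mode="head"/>
</xsl:template>
```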
#21
Built-in Template Rules
The rules help you traverse the document when you don't have templates that match.
The built-in template rules:
For parents (elements and documents):
<xsl:template match="*|/"> <xsl:apply-templates/> </xsl:template>
For text and attributes:
<xsl:template match="text()|@*"> <xsl:value-of select="."/> </xsl:template>
For processing instructions and comments
<xsl:template match="processing-instruction()|comment()"/>
Used alone, these rules give you just the "text" along the 'child' axis.
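If the leftover text from the built-in rules is unwanted, a common approach is to override the text rule with an empty template. This is a sketch; the 'title' element is a hypothetical name.

```xml
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Overrides the built-in text rule: unmatched text is no longer copied -->
  <xsl:template match="text()"/>

  <!-- 'title' is hypothetical; its text is output explicitly with xsl:value-of -->
  <xsl:template match="title">
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>
</xsl:transform>
```

The built-in element rule still walks the tree, so 'title' elements at any depth are found while all other text is dropped.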
#22
Multiple Templates - Priorities
If multiple templates match, the "most specific" is taken.
The "most specific" is calculated in terms of priorities.
Single names (e.g. "para") have priority 0.
Namespace wildcards (e.g. h:*) have priority -0.25
Other node tests (e.g. *, @*, comment(), node(), text()) have priority -0.5
Otherwise, the priority is 0.5
#23
Multiple Templates - Priority Examples
For example:
para → 0
h:* → -0.25
* → -0.5
node() → -0.5
contents/para → 0.5
contents/* → 0.5
You can adjust the priority to get what you want with a 'priority' attribute.
<xsl:template match="h:*" priority="1"> ... </xsl:template>
If two templates match and they have the same priority, it is an error--but the processor can recover and choose the last template in document order.
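A sketch of default priorities at work, using the 'para' and 'contents/para' patterns from the list above. The output markup (the 'class' attribute and the div fallback) is an assumption for illustration.

```xml
<!-- Default priority 0: any para -->
<xsl:template match="para">
  <p><xsl:apply-templates/></p>
</xsl:template>

<!-- Default priority 0.5: wins for a para that is a child of contents -->
<xsl:template match="contents/para">
  <p class="toc"><xsl:apply-templates/></p>
</xsl:template>

<!-- Lower default priority than both name tests: used only when nothing more specific matches -->
<xsl:template match="*">
  <div><xsl:apply-templates/></div>
</xsl:template>
```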
#24
Outputting Text
Any text inside a template that contains a non-whitespace character is automatically copied to the output.
The whitespace in such a text node is copied along with it:
<xsl:template match="foo"> some text </xsl:template>
Text nodes that contain only whitespace (e.g. the line breaks between elements) are "stripped" and not copied:
<xsl:template match="foo"> <p>some text</p> </xsl:template>
generates the following without a leading or trailing line break:
<p>some text</p>
#25
Preserving Whitespace
The element 'xsl:text' preserves text and whitespace.
For example, to preserve the whitespace in the previous example:
<xsl:template match="foo"><xsl:text>&#10;</xsl:text><p>some text</p><xsl:text>&#10;</xsl:text></xsl:template>
This is often used to add whitespace between non-literal elements.
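For instance, a space between two computed values has to be written with xsl:text, because a bare space between the two instructions would be stripped. The 'person', 'first', and 'last' names are hypothetical.

```xml
<xsl:template match="person">
  <xsl:value-of select="first"/>
  <!-- Without xsl:text this whitespace-only text node would be stripped -->
  <xsl:text> </xsl:text>
  <xsl:value-of select="last"/>
</xsl:template>
```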
#26
Literal Elements
A literal element is a non-XSL element.
It generates a copy of itself in the output.
The children may be generated by subsequent templates.
For example:
<xsl:template match="foo">
  <html>
    <head>
      <title>My Document</title>
      <style type='text/css'>...</style>
    </head>
    <body><xsl:apply-templates/></body>
  </html>
</xsl:template>
#27
Attribute Value Templates
XPath expressions can be used to "insert" content into attribute values.
Attribute value templates are delimited by curly braces: {...}
Double curly braces are used if you want a curly brace in the attribute value.
The expression result becomes the attribute value.
#28
Attribute Value Templates - Example
For example
<img src="{@base-uri}/{@src}"/>
for the content:
<image-data base-uri="http://mydomain.com" src="picture.jpg"/>
would generate:
<img src="http://mydomain.com/picture.jpg"/>
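The doubled curly braces can be sketched the same way. The 'item' element and its 'id' attribute are assumptions for illustration.

```xml
<!-- Hypothetical input: <item id="a1">...</item> -->
<xsl:template match="item">
  <!-- {{ and }} produce literal braces; {@id} is evaluated as an expression -->
  <span title="{{id}} = {@id}"><xsl:apply-templates/></span>
</xsl:template>
```

For the input above, the generated 'title' attribute value would be "{id} = a1".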
#29
What Happened to my Comments?
Comments and processing instructions in the stylesheet are ignored.
They aren't copied to the output.
For example:
<xsl:template match="foo"> <!-- The next element is significant --> <spam type='fried'/> </xsl:template>
generates:
<spam type='fried'/>
#30
Understanding Actions
A literal element or text is really an action to create a copy.
The xsl:apply-templates is an action that specifies where the processor should go next.
There are many other kinds of actions specified by elements in the XSLT namespace:
apply-templates, call-template, apply-imports, for-each, value-of, copy-of, number, choose, if, text, copy, variable, message, fallback, processing-instruction, comment, element, attribute.
XSLT is extensible so you can create your own actions and extend the processor.
#31
Creating Elements "Manually"
Elements can also be created by xsl:element.
This is used when the element name or namespace is computed from an expression.
An example:
<xsl:element name="top"> <a/><b/><c/> </xsl:element>
constructs:
<top> <a/><b/><c/> </top>
The children of 'xsl:element' are the children of the newly created element.
You can use expressions in the name:
<xsl:element name="{@name}"/>
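A sketch of a computed element name in context. The 'field' element is hypothetical, and the example assumes every 'name' attribute holds a valid XML element name.

```xml
<!-- Hypothetical input: <field name="price">10</field> -->
<xsl:template match="field">
  <!-- The element name is taken from the 'name' attribute of the current node -->
  <xsl:element name="{@name}">
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>
```

For the input above this would construct <price>10</price>.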
#32
Creating Attributes "Manually"
Attributes can also be created by xsl:attribute.
They must be created before children are added to the element.
You can use them on literal elements:
<section> <xsl:attribute name="id">sect1</xsl:attribute> </section>
Or on xsl:element constructions
<xsl:element name="section"> <xsl:attribute name="id">sect1</xsl:attribute> </xsl:element>
The content of xsl:attribute must generate only text nodes, which become the value of the attribute.
You can use expressions in the name:
<xsl:attribute name="{child/@ref}"/>
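One case where xsl:attribute is needed rather than an attribute value template is an attribute that should appear only sometimes. The sketch below uses xsl:if for the condition; the 'sect' element and its 'label' attribute are hypothetical.

```xml
<xsl:template match="sect">
  <section>
    <!-- Add 'id' only when the source element has a 'label' attribute -->
    <xsl:if test="@label">
      <xsl:attribute name="id"><xsl:value-of select="@label"/></xsl:attribute>
    </xsl:if>
    <!-- Children are added after all attributes have been created -->
    <xsl:apply-templates/>
  </section>
</xsl:template>
```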
#33
Creating Comments & Processing Instructions
Comments are created by:
<xsl:comment> your comment text here </xsl:comment>
Processing Instructions are created by
<xsl:processing-instruction name="target"> your PI text here </xsl:processing-instruction>
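As a sketch, both instructions can be used at the top of the result. The 'xml-stylesheet' target is a real convention, but the 'style.css' file name and the comment text are made up for illustration.

```xml
<xsl:template match="/">
  <!-- Emits <?xml-stylesheet type="text/css" href="style.css"?> -->
  <xsl:processing-instruction name="xml-stylesheet">type="text/css" href="style.css"</xsl:processing-instruction>
  <!-- Emits an XML comment in the result document -->
  <xsl:comment> generated by a stylesheet </xsl:comment>
  <xsl:apply-templates/>
</xsl:template>
```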
#34
Values
You can get values of elements or attributes via xsl:value-of:
<xsl:value-of select="person/name"/> <xsl:value-of select="@href"/>
The select attribute can contain any XPath expression.
The value is the result of collecting the text descendants of the selected node (its string-value).
This is really the same as applying the string() function to the resulting node set, so in XSLT 1.0 only the first node in document order is used.
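A small sketch of what happens in XSLT 1.0 when the expression selects more than one node. The 'people' and 'name' elements are hypothetical.

```xml
<!-- Hypothetical input:
     <people><name>Ann</name><name>Bob</name></people> -->
<xsl:template match="people">
  <!-- string() of a node set uses its first node in document order, so this yields "Ann" -->
  <xsl:value-of select="name"/>
</xsl:template>
```

To output every name, you would process the nodes one at a time, for example with xsl:apply-templates select="name".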
#35
Copying Nodes
Sometimes you might want to copy a node to the output.
xsl:copy will copy the current node (the one the template matched) to the output.
It only applies to the current node and not its children or attributes.
Example:
<xsl:template match="credit-card"> <xsl:copy>XXXX-XXXX-XXXX-XXXX</xsl:copy> </xsl:template>
applied to:
<credit-card type='visa'>1234-1234-1234-1234</credit-card>
would create:
<credit-card>XXXX-XXXX-XXXX-XXXX</credit-card>
#36
Copying The Attributes
Since xsl:copy copies the current element and not the attributes, you need to tell it to copy the attributes.
This will copy any arbitrary attribute:
<xsl:template match="@*"> <xsl:copy/> </xsl:template>
So, we can fix the last example to get the attribute copied with the following templates:
<xsl:template match="credit-card">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:text>XXXX-XXXX-XXXX-XXXX</xsl:text>
  </xsl:copy>
</xsl:template>
<xsl:template match="@*">
  <xsl:copy/>
</xsl:template>
#37
The Identity Transform
This all can be generalized into a compact identity transformation that copies all the nodes to output.
You can specify the identity transformation with xsl:copy:
<xsl:template match="@*|node()"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template>
Keep in mind that more specific match patterns (such as a name test) have higher priority by default, so they override this rule.
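A sketch of the identity transformation as a complete stylesheet, combined with one overriding rule. The 'draft-note' element name is an assumption for illustration.

```xml
<xsl:transform version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity rule: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- 'draft-note' is hypothetical; the empty template removes it and its content -->
  <xsl:template match="draft-note"/>
</xsl:transform>
```

The name test 'draft-note' outranks the identity rule by default, so no 'priority' attribute is needed.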
#38
Making Small Changes with xsl:copy
You can change substructure with identity and a few templates:
This changes the 'href' attribute to 'uri-ref' and changes the 'a' element to 'link'.
<xsl:template match="@href">
  <xsl:attribute name="uri-ref"><xsl:value-of select='.'/></xsl:attribute>
</xsl:template>
<xsl:template match="a">
  <link>
    <xsl:apply-templates select="@*|node()"/>
  </link>
</xsl:template>
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>
You can try this out yourself: small-changes.xsl (input) (output)
#39
xsl:copy-of
You can also copy whole nodes, with all of their structure, to the output.
xsl:copy-of has a syntax like xsl:value-of:
<xsl:copy-of select="p"/>
All attributes, children, etc. are copied to the output.
xsl:copy and xsl:copy-of have very different uses.
xsl:copy is often used when you might want to convert a few elements and copy the rest.
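By contrast, xsl:copy-of fits when a subtree should pass through untouched. In this sketch the 'customer', 'client', and 'address' names are hypothetical.

```xml
<!-- Rename the container but keep its attributes and the address subtree as-is -->
<xsl:template match="customer">
  <client>
    <!-- Copies all attributes of customer onto the new element -->
    <xsl:copy-of select="@*"/>
    <!-- Deep copy: address, its attributes, and all of its descendants -->
    <xsl:copy-of select="address"/>
  </client>
</xsl:template>
```

No templates are applied inside the copied 'address' subtree, so nothing within it can be changed on the way through.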