JAXP and XSLT in Web Applications
R. Alexander Milowski
milowski at sims.berkeley.edu
#1
XSLT Extras
xsl:output & serialization control
Associating stylesheets with documents
Extensions
Multiple output documents
#2
Serialization Control
The xsl:output element lets you control many things about serialization:
method - xml, html, text, or a QName
version - The XML version
encoding - The encoding to use while serializing.
omit-xml-declaration - "yes" or "no"; "yes" omits the <?xml ...?> declaration.
doctype-system/doctype-public - Controls the <!DOCTYPE declaration
indent - Controls pretty-printing of the XML.
media-type - The MIME Type
cdata-section-elements - Elements whose contents should be wrapped with <![CDATA[ ]]>.
Example:
<xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8"/>
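The same settings can also be overridden from Java through JAXP output properties. A minimal sketch, assuming the JDK's built-in XSLT processor; the class name OutputControlSketch and the identity stylesheet are made up for illustration:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OutputControlSketch {
    // Identity stylesheet (illustrative) carrying the xsl:output element shown above.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes' encoding='UTF-8'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms xml; when forceDeclaration is true, the stylesheet's
    // omit-xml-declaration setting is overridden from Java.
    static String serialize(String xml, boolean forceDeclaration) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            if (forceDeclaration) {
                t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            }
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize("<doc><p>hi</p></doc>", false));
        System.out.println(serialize("<doc><p>hi</p></doc>", true));
    }
}
```

Properties set with setOutputProperty take precedence over the values in the stylesheet's xsl:output element.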
#3
Serialization Control - Prefixes
You can't control prefixes and namespace declarations very well.
There are a few things you can do:
Declare the namespaces you want in your result document on the root element as well as the stylesheet element.
Use xsl:element with a prefix to copy elements from the input to the output:
<xsl:template match="m:math">
    <xsl:element name="m:{local-name(.)}" namespace="{namespace-uri(.)}">
        <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
</xsl:template>
Use the exclude-result-prefixes attribute to exclude input and stylesheet prefix bindings.
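To see what exclude-result-prefixes does, here is a small sketch that runs the same stylesheet with and without the attribute. The class name, the out element, and the sample input are assumptions made for the example:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ExcludePrefixesSketch {
    static final String MATHML = "http://www.w3.org/1998/Math/MathML";

    // Builds a stylesheet that binds the m prefix and optionally excludes it
    // from the result document.
    static String stylesheet(boolean exclude) {
        return "<xsl:stylesheet version='1.0'"
             + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
             + " xmlns:m='" + MATHML + "'"
             + (exclude ? " exclude-result-prefixes='m'" : "")
             + ">"
             + "<xsl:output omit-xml-declaration='yes'/>"
             + "<xsl:template match='/'>"
             + "<out><xsl:value-of select='count(//m:math)'/></out>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    static String run(boolean exclude) {
        String input = "<doc><m:math xmlns:m='" + MATHML + "'/></doc>";
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet(exclude))));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(input)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Without the attribute, the literal result element usually carries
        // the stylesheet's xmlns:m declaration into the output.
        System.out.println(run(false));
        // With the attribute, the unused declaration is left out.
        System.out.println(run(true));
    }
}
```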
#4
Serialization Control - Whitespace
xml:space can be used--just be careful with xsl:choose.
Adding a few xsl:text elements with a newline in them will go far.
The indent='yes' attribute may add whitespace where you don't want it and may really increase the size of the output.
Keep in mind that there may be a schema that says what content is significant.
#5
Associating Stylesheets with Documents
Just add an xml-stylesheet PI:
<?xml-stylesheet type="text/xsl" href="mytransform.xsl"?>
<doc>
    <title>Boring Document</title>
    <body>
        <p>I need style!</p>
    </body>
</doc>
Adding this PI doesn't force all applications to use your transform. On the other hand, many applications, browsers in particular, will automatically do so for XML documents.
#6
Extensions
XSLT is extensible via functions and actions (elements).
You designate the namespaces that are extensions via the 'extension-element-prefixes' attribute:
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    extension-element-prefixes="d c"
    xmlns:d="http://exslt.org/dates-and-times"
    xmlns:c="http://exslt.org/common">
    <xsl:template match="/">
        <c:document href="output-{d:year()}-{d:day-in-year()}.xml">
            <date><xsl:value-of select="d:date()"/></date>
        </c:document>
    </xsl:template>
</xsl:transform>
See www.exslt.org for a list of "standard" extensions.
#7
Multiple Output Documents
XSLT 1.0 does not support multiple output documents.
You can use an extension to do this. The children of 'c:document' become the contents of the output file. c:document and its descendants will not appear in the "primary result":
<xsl:transform version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    extension-element-prefixes="c"
    xmlns:c="http://exslt.org/common">
    <xsl:template match="slide">
        <c:document href="slide-{count(preceding-sibling::slide)+1}.xhtml">
            <html>...</html>
        </c:document>
    </xsl:template>
</xsl:transform>
#8
What is JAXP?
Java API for XML Processing.
A standard part of Java 1.4:
javax.xml.parsers - bootstrapping and using XML processors.
javax.xml.transform - APIs for transforming documents (e.g. XSLT)
javax.xml.transform.dom - DOM APIs related to transformations.
javax.xml.transform.sax - SAX APIs related to transformations.
javax.xml.transform.stream - I/O stream APIs related to transformations.
API Documentation: Java Standard Edition 1.4
#9
Reference Implementation
The JDK contains a reference implementation.
You can swap out the parser & XSLT implementation by setting system properties.
You can also replace them by putting jar files into the JDK endorsed directory.
Endorsed jars will replace them for all Java applications on your system.
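A quick way to check which implementations the factories actually resolve to is to print their class names. This is only a sketch: the class name ImplementationSketch is made up, and the Saxon class name in the comment is an assumption that depends on the Saxon version you install.

```java
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

class ImplementationSketch {
    // Name of the XSLT factory class that JAXP resolved.
    static String transformerFactoryClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Name of the SAX parser factory class that JAXP resolved.
    static String parserFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // These system properties are consulted first; null means "use the default".
        // Example override on the command line (class name is an assumption):
        //   java -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl ...
        System.out.println(System.getProperty("javax.xml.transform.TransformerFactory"));
        System.out.println(System.getProperty("javax.xml.parsers.SAXParserFactory"));
        System.out.println(transformerFactoryClass());
        System.out.println(parserFactoryClass());
    }
}
```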
#10
Upgrading to SAXON
For example, you can replace Xalan with SAXON for your XSLT processor:
Locate the 'lib/ext' directory of your JDK. On the Mac, this is under '/System/Library/Frameworks/JavaVM.framework/'.
Copy the saxon.jar file to that directory.
Restart Netbeans (or other Java application).
This replaces the XSLT processor for all Java applications that use that JDK or VM.
#11
Using JAXP to Apply XSLT
The JAXP interface uses a factory design pattern. That is, you use factories to create objects. This hides the implementation classes.
The procedure is:
Create a TransformerFactory instance.
Load your stylesheet into a Transformer instance.
Transform your source to your output using the Transformer instance.
// Step 1: Establish the factory/environment/etc.
TransformerFactory tfactory = TransformerFactory.newInstance();
// Step 2: Load and compile the stylesheet
Transformer xform = tfactory.newTransformer(new StreamSource("mystyle.xsl"));
// Step 3: Apply the transform to the input document
xform.transform(new StreamSource("in.xml"),new StreamResult("out.xhtml"));
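The same three steps also work without any files, which is handy for trying things out. A self-contained sketch; the class name, stylesheet, and sample document are invented for the example:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ApplyXsltSketch {
    // Illustrative stylesheet: turns <person><name>X</name></person> into a greeting.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='/person'>"
      + "<greeting>Hello, <xsl:value-of select='name'/>!</greeting>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            // Step 1: create the factory
            TransformerFactory tfactory = TransformerFactory.newInstance();
            // Step 2: load and compile the stylesheet
            Transformer xform = tfactory.newTransformer(new StreamSource(new StringReader(XSL)));
            // Step 3: apply the transform; note the target is a StreamResult
            StringWriter out = new StringWriter();
            xform.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<person><name>World</name></person>"));
    }
}
```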
#12
Chaining Transforms
You can use DOM to chain transforms in memory.
Example:
// Set up the transformations
TransformerFactory tfactory = TransformerFactory.newInstance();
Transformer step1xform = tfactory.newTransformer(new StreamSource("step1.xsl"));
Transformer step2xform = tfactory.newTransformer(new StreamSource("step2.xsl"));
// Create a DOMResult to hold the intermediate XML document
DOMResult between = new DOMResult();
// Transform the input
step1xform.transform(new StreamSource("in.xml"),between);
// Transform the output of step 1
step2xform.transform(new DOMSource(between.getNode()),new StreamResult("out.xml"));
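Here is a runnable variant of the chaining idea that keeps everything in memory. The two tiny stylesheets and the class name are placeholders invented for the sketch:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ChainSketch {
    // Wraps a single template body in a complete stylesheet.
    static String sheet(String template) {
        return "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
             + "<xsl:output omit-xml-declaration='yes'/>" + template + "</xsl:stylesheet>";
    }

    // Step 1 rewrites <in> to <mid>; step 2 rewrites <mid> to <out>.
    static final String STEP1 =
        sheet("<xsl:template match='/in'><mid><xsl:value-of select='.'/></mid></xsl:template>");
    static final String STEP2 =
        sheet("<xsl:template match='/mid'><out><xsl:value-of select='.'/></out></xsl:template>");

    static String run(String xml) {
        try {
            TransformerFactory tfactory = TransformerFactory.newInstance();
            Transformer step1 = tfactory.newTransformer(new StreamSource(new StringReader(STEP1)));
            Transformer step2 = tfactory.newTransformer(new StreamSource(new StringReader(STEP2)));
            // The DOMResult holds the intermediate document in memory.
            DOMResult between = new DOMResult();
            step1.transform(new StreamSource(new StringReader(xml)), between);
            StringWriter out = new StringWriter();
            step2.transform(new DOMSource(between.getNode()), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("<in>x</in>"));
    }
}
```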
#13
XSLT in JSP - Step #1
We need to import the right things and set up the page output:
<%@page contentType="text/xml" import="java.io.*,java.util.*,javax.xml.parsers.*,javax.xml.transform.*,javax.xml.transform.stream.*" pageEncoding="UTF-8"%>
#14
XSLT in JSP - Step #2
We can set up a method that we can use to load the XSLT:
<%!
File transformFile;

public void jspInit() {
    ServletConfig config = getServletConfig();
    transformFile = new File(config.getServletContext().getRealPath("service.xsl"));
}

public Transformer loadTransformer() {
    ServletConfig config = getServletConfig();
    try {
        TransformerFactory tfactory = TransformerFactory.newInstance();
        return tfactory.newTransformer(new StreamSource(transformFile));
    } catch (Exception ex) {
        config.getServletContext().log("Cannot load stylesheet.",ex);
    }
    return null;
}
%>
#15
XSLT in JSP - Step #3
Now we just need to use it.
Handling a posting of an XML document:
<% Transformer xslt = loadTransformer(); xslt.transform(new StreamSource(request.getInputStream()),new StreamResult(out)); %>
Transforming the document specified by a query string:
<% Transformer xslt = loadTransformer(); xslt.transform(new StreamSource(request.getQueryString()),new StreamResult(out)); %>
#16
Transforms & Threading
The Transformer object is not re-entrant.
To reuse the instance you need to manage who is using it at what time.
...and that makes things a bit more complicated.
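One alternative worth knowing about: JAXP's Templates object is a compiled stylesheet that is safe to share between threads, and each thread can ask it for its own Transformer. A sketch under that approach; the class name, the squaring stylesheet, and the thread pool size are all invented for illustration:

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplatesSketch {
    // Illustrative stylesheet: squares the number in <n>.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output omit-xml-declaration='yes'/>"
      + "<xsl:template match='/n'><sq><xsl:value-of select='. * .'/></sq></xsl:template>"
      + "</xsl:stylesheet>";

    // Compiled once; Templates may be shared across threads.
    static final Templates TEMPLATES = compile();

    static Templates compile() {
        try {
            return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(XSL)));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    static String square(int n) {
        try {
            // Each call gets its own Transformer, so no locking is needed.
            Transformer t = TEMPLATES.newTransformer();
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader("<n>" + n + "</n>")),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs square(1..count) on a small thread pool and returns results in order.
    static List<String> squaresInParallel(int count) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 1; i <= count; i++) {
                final int n = i;
                futures.add(pool.submit(() -> square(n)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(squaresInParallel(5));
    }
}
```

Creating a Transformer from a Templates object is much cheaper than recompiling the stylesheet, so this trades a little per-request allocation for not having to track who holds which instance.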
#17
XSLT in JSP - Caching the Instance
A simple solution:
<%@page contentType="text/xml" import="java.io.*,java.util.*,javax.xml.parsers.*,javax.xml.transform.*,javax.xml.transform.stream.*" pageEncoding="UTF-8"%><%!
long timestamp;
List cache;
File transformFile;

public void jspInit() {
    ServletConfig config = getServletConfig();
    transformFile = new File(config.getServletContext().getRealPath("service.xsl"));
    timestamp = transformFile.lastModified();
    cache = new ArrayList();
}

public Transformer getTransformer() {
    synchronized (cache) {
        // Throw away cached instances if the stylesheet changed on disk
        if (transformFile.lastModified()>timestamp) {
            timestamp = transformFile.lastModified();
            cache.clear();
        }
        if (cache.size()>0) {
            Transformer t = (Transformer)cache.remove(cache.size()-1);
            return t;
        } else {
            return loadTransformer();
        }
    }
}

public void releaseTransformer(Transformer t,long savedTimestamp) {
    // ArrayList is not thread-safe, so take the same lock as getTransformer()
    synchronized (cache) {
        if (savedTimestamp==timestamp) {
            cache.add(t);
        }
    }
}

public Transformer loadTransformer() {
    ServletConfig config = getServletConfig();
    try {
        TransformerFactory tfactory = TransformerFactory.newInstance();
        return tfactory.newTransformer(new StreamSource(transformFile));
    } catch (Exception ex) {
        config.getServletContext().log("Cannot load stylesheet.",ex);
    }
    return null;
}
%><%
Transformer xslt = getTransformer();
long savedTimestamp = timestamp;
xslt.transform(new StreamSource(request.getInputStream()),new StreamResult(out));
releaseTransformer(xslt,savedTimestamp);
%>
#18
XSLT as a Web Service
The previous example can be used to run XSLT as a web service.
Any posted XML document will be parsed and transformed by the XSLT stylesheet.
And the result will be returned to the sender.
#19
Example: Summing Costs
It is silly, but here's my XSLT that sums the cost children of each add element and replaces the add element with a total:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:template match="add">
        <total><xsl:value-of select="sum(cost)"/></total>
    </xsl:template>
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
Everything other than an add element and its cost children is copied to the output unchanged.
The add element can appear in any context.
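To try the stylesheet locally without deploying the JSP, something like the following should work. The class name and the sample order document are made up for the sketch; the stylesheet is the one from this slide:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SumCostsSketch {
    static final String XSL =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "<xsl:template match='add'>"
      + "<total><xsl:value-of select='sum(cost)'/></total>"
      + "</xsl:template>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            // Keep the output short for the demo.
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: the add element is nested inside an order.
        String input = "<order><item>pen</item>"
                     + "<add><cost>1</cost><cost>2.5</cost></add></order>";
        // The add element is replaced by <total>3.5</total>; item is copied as-is.
        System.out.println(transform(input));
    }
}
```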
#20
Parsing a Document
You need to do the following:
Get and configure a parser factory.
Create a parser.
Set options on the parser.
Call 'parse' on the parser.
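Put together, the four steps look roughly like this. It is a sketch with an invented class name that parses from a string and simply counts elements:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ParseSketch {
    static int countElements(String xml) {
        try {
            // 1. Get and configure a parser factory.
            SAXParserFactory pfactory = SAXParserFactory.newInstance();
            pfactory.setNamespaceAware(true);
            // 2. Create a parser.
            SAXParser parser = pfactory.newSAXParser();
            XMLReader reader = parser.getXMLReader();
            // 3. Set options on the parser: here, one handler for content and errors.
            final int[] count = {0};
            DefaultHandler handler = new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            };
            reader.setContentHandler(handler);
            reader.setErrorHandler(handler);
            // 4. Call parse.
            reader.parse(new InputSource(new StringReader(xml)));
            return count[0];
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><c><d/></c></a>"));
    }
}
```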
#21
Parser Factories
Factories can be obtained by the following code:
SAXParserFactory pfactory = SAXParserFactory.newInstance();
You'll want to turn on namespaces:
pfactory.setNamespaceAware(true);
And maybe validation:
pfactory.setValidating(true);
#22
Creating Parsers with Schema Validation
Just call 'newSAXParser':
SAXParser parser = pfactory.newSAXParser();
To turn on schema validation, you need to set a property:
parser.setProperty( "http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema" );
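The parser also needs to know where the schema is. One way, sketched here, is the companion JAXP schemaSource property. The class name and the one-element schema are invented for the example, and the factory is assumed to be validating and namespace-aware:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SchemaValidationSketch {
    // Illustrative schema: a single cost element holding a decimal.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='cost' type='xs:decimal'/>"
      + "</xs:schema>";

    // Parses xml against XSD and returns the number of validation errors reported.
    static int countErrors(String xml) {
        try {
            SAXParserFactory pfactory = SAXParserFactory.newInstance();
            pfactory.setNamespaceAware(true);
            pfactory.setValidating(true);
            SAXParser parser = pfactory.newSAXParser();
            // The schema language must be set before the schema source.
            parser.setProperty(
                "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                "http://www.w3.org/2001/XMLSchema");
            parser.setProperty(
                "http://java.sun.com/xml/jaxp/properties/schemaSource",
                new InputSource(new StringReader(XSD)));
            final int[] errors = {0};
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException ex) {
                    errors[0]++;   // count instead of aborting the parse
                }
            });
            return errors[0];
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countErrors("<cost>12.50</cost>"));
        System.out.println(countErrors("<cost>abc</cost>"));
    }
}
```

Validation errors are only visible if you install an error handler; the default handler ignores them.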
#23
Readers and Parsers
XMLReader is the "new" interface for parsing.
Just get that from your parser:
XMLReader xmlReader = parser.getXMLReader();
Here you might want to set an error handler:
xmlReader.setErrorHandler(new DefaultHandler() {
    public void error(SAXParseException ex) throws SAXException {
        System.err.println(ex.getMessage());
    }
    public void fatalError(SAXParseException ex) throws SAXException {
        System.err.println(ex.getMessage());
    }
    public void warning(SAXParseException ex) throws SAXException {
        System.err.println(ex.getMessage());
    }
});
#24
Adding Catalogs
Xerces has an XML Catalog implementation in the class:
org.apache.xerces.util.XMLCatalogResolver
You just create the class and set the list of catalog files:
String catalogURI = "file:"+(new File(catalogFile)).getAbsolutePath();
String [] catalogs = { catalogURI };
// Create catalog resolver and set a catalog list.
XMLCatalogResolver resolver = new XMLCatalogResolver();
resolver.setPreferPublic(true);
resolver.setCatalogList(catalogs);
Then set the appropriate property:
// Set the resolver on the parser.
xmlReader.setProperty(
    "http://apache.org/xml/properties/internal/entity-resolver",
    resolver
);
#25
Parsing a Document
Just call parse:
xmlReader.parse("foo.xml");
You can also use open streams:
InputStream is = ...;
InputSource s1 = new InputSource(is);
xmlReader.parse(s1);
Reader r = ...;
InputSource s2 = new InputSource(r);
xmlReader.parse(s2);
InputSource lets you specify a base URI, encoding, etc.
InputSource s = ...;
s.setSystemId("http://cde.berkeley.edu/bob.xml");
s.setEncoding("UTF-8");
#26
Handling Content
The ContentHandler interface lets you receive the XML content.
For example:
class ListElements extends DefaultHandler {
    Map elements = new HashMap();
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) {
        // SAX reports "no namespace" as an empty string, so check for that as well as null
        if (namespaceURI==null || namespaceURI.length()==0) {
            elements.put(localName,Boolean.TRUE);
        } else {
            elements.put('{'+namespaceURI+'}'+localName,Boolean.TRUE);
        }
    }
}
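For a version you can run as-is, here is a sketch of the same idea wired to a namespace-aware parser. The class name and the sample document are invented, and a sorted set stands in for the map so the output order is predictable:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ElementNamesSketch {
    // Returns the distinct element names in xml, written as {namespace}local.
    static Set<String> elementNames(String xml) {
        final Set<String> names = new TreeSet<>();
        try {
            SAXParserFactory pfactory = SAXParserFactory.newInstance();
            pfactory.setNamespaceAware(true);
            XMLReader reader = pfactory.newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String namespaceURI, String localName,
                                         String qName, Attributes atts) {
                    // Elements in no namespace arrive with an empty namespace URI.
                    if (namespaceURI == null || namespaceURI.isEmpty()) {
                        names.add(localName);
                    } else {
                        names.add("{" + namespaceURI + "}" + localName);
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
            return names;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a xmlns:p='urn:x'><p:b/><c/><c/></a>"));
    }
}
```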
#27
Serialization
SAX doesn't support serialization.
DOM doesn't support serialization.
Serialization with namespaces is hard.
Complain!
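In the meantime, a common workaround is to borrow the XSLT serializer: a Transformer created without a stylesheet performs an identity transform, so it can write a DOM tree out as text. A sketch of that workaround; the class name and the tiny document are made up for the example:

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class SerializeSketch {
    // Serializes any DOM node using the identity transformer.
    static String serialize(Node node) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a small document in memory: <doc><p>a < b</p></doc>
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("doc");
            Element p = doc.createElement("p");
            p.appendChild(doc.createTextNode("a < b"));
            root.appendChild(p);
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The serializer takes care of escaping the "<" in the text node.
        System.out.println(serialize(sample()));
    }
}
```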
#28
Handling Unicode
You must use a java.io.Reader derived class to handle Unicode.
Unicode characters are stored according to encodings:
UTF-8, UTF-16, etc.
US-ASCII
ISO646-JP (Japanese)
Example:
InputStream is = ...;
Reader r = new InputStreamReader(is,"UTF-8");
// or look up the charset
Charset cs = Charset.forName("UTF-8");
Reader altr = new InputStreamReader(is,cs);
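Why the encoding matters: the same bytes decode to different characters under different charsets. A small sketch with an invented class name, using the two UTF-8 bytes for "é" (U+00E9):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

class DecodeSketch {
    // Decodes bytes into a String by reading them through an InputStreamReader.
    static String decode(byte[] bytes, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(bytes), cs)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = { (byte) 0xC3, (byte) 0xA9 };   // "é" encoded as UTF-8
        // Decoded with the right charset: one character.
        System.out.println(decode(utf8, "UTF-8").length());
        // Decoded with the wrong charset: two unrelated characters.
        System.out.println(decode(utf8, "ISO-8859-1").length());
    }
}
```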
#29
Posting XML over HTTP - Step #1
Open the connection:
URL url = new URL("http://localhost:8084/jaxp-lectrure/xslt-service.jsp");
// Open a connection
URLConnection uconnection = url.openConnection();
// Set the properties
HttpURLConnection connection = (HttpURLConnection)uconnection;
connection.setDoOutput(true);
connection.setRequestMethod("POST");
connection.setRequestProperty("Content-Type",contentType);
connection.setRequestProperty("Content-Encoding","UTF-8");
// Set up the output for serialization
OutputStream os = connection.getOutputStream();
OutputStreamWriter o = new OutputStreamWriter(os,"UTF-8");
Now you write your XML to the writer.
#30
Posting XML over HTTP - Step #2
Check the response:
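// Check the status first: getInputStream() throws an IOException for error responses
if (connection.getResponseCode()==HttpURLConnection.HTTP_OK) {
    String contentType = connection.getContentType();
    // The header may carry parameters, e.g. "text/xml;charset=UTF-8"
    if (contentType!=null && contentType.startsWith("text/xml")) {
        InputStream is = connection.getInputStream();
        // Now you can parse your XML response!
    }
}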
#31
Posting XML over HTTP - Step #3
Parse the response:
InputStream is = ...;
String encoding = connection.getContentEncoding();
Reader r = encoding==null ? new InputStreamReader(is) : new InputStreamReader(is,encoding);
xmlReader.parse(new InputSource(r));
connection.disconnect();