JAXP - Java API for XML Processing
R. Alexander Milowski
milowski@sims.berkeley.edu
School of Information Management and Systems
#1
What is JAXP?
Java API for XML Processing.
A standard part of Java 1.4:
javax.xml.parsers - bootstrapping and using XML processors.
javax.xml.transform - APIs for transforming documents (e.g. XSLT)
javax.xml.transform.dom - DOM APIs related to transformations.
javax.xml.transform.sax - SAX APIs related to transformations.
javax.xml.transform.stream - I/O stream APIs related to transformations.
API Documentation: Java Standard Edition 1.4
#2
Reference Implementation
The JDK contains a reference implementation.
You can swap out the parser and XSLT implementation via system properties.
You can also replace them by putting jar files into the JDK endorsed directory.
Endorsed jars replace them for every Java application run with that JDK.
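To see which implementations are currently in effect, you can ask the factories for their concrete classes. This is a minimal sketch: the class name ShowJaxpImplementations is made up for illustration, and the property names in the comments are the standard JAXP lookup keys.

```java
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

class ShowJaxpImplementations {
    // Concrete SAX parser factory class selected by the JAXP lookup.
    static String saxFactoryClass() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    // Concrete XSLT factory class selected by the JAXP lookup.
    static String transformerFactoryClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // To swap implementations, set these system properties on the command line:
        //   -Djavax.xml.parsers.SAXParserFactory=<factory class name>
        //   -Djavax.xml.transform.TransformerFactory=<factory class name>
        System.out.println("SAX:  " + saxFactoryClass());
        System.out.println("XSLT: " + transformerFactoryClass());
    }
}
```

Running it before and after setting a property shows whether the swap took effect.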
#3
Parsing a Document
You need to do the following:
Get and configure a parser factory.
Create a parser.
Set options on the parser.
Call 'parse' on the parser.
#4
Parser Factories
The JAXP interface uses a factory design pattern.
Factories can be obtained by the following code:
SAXParserFactory pfactory = SAXParserFactory.newInstance();
You'll want to turn on namespaces:
pfactory.setNamespaceAware(true);
And maybe validation:
pfactory.setValidating(true);
#5
Creating a Parser w/ Schema Validation
Just call 'newSAXParser':
SAXParser parser = pfactory.newSAXParser();
To turn on schema validation, set a property on the parser (the factory must have validation turned on):
parser.setProperty( "http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema" );
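The property above selects the schema language but not the schema itself. A sketch of one way to supply it, assuming the JAXP 1.2 schemaSource property and a made-up one-element schema held in memory (the class name and schema are illustrative only):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SchemaValidationSketch {
    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";

    // Hypothetical schema: a single <count> element holding an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='count' type='xs:int'/>"
      + "</xs:schema>";

    // Returns the validation error messages reported for the document.
    static List<String> validate(String xml) throws Exception {
        SAXParserFactory pfactory = SAXParserFactory.newInstance();
        pfactory.setNamespaceAware(true);
        pfactory.setValidating(true);
        SAXParser parser = pfactory.newSAXParser();
        // The language must be set before the source.
        parser.setProperty(SCHEMA_LANGUAGE, "http://www.w3.org/2001/XMLSchema");
        parser.setProperty(SCHEMA_SOURCE,
            new ByteArrayInputStream(XSD.getBytes(StandardCharsets.UTF_8)));

        final List<String> errors = new ArrayList<String>();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                @Override
                public void error(SAXParseException ex) {
                    errors.add(ex.getMessage());
                }
            });
        return errors;
    }
}
```

A document such as `<count>3</count>` should come back with no errors, while `<count>three</count>` should report at least one.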
#6
XMLReader & Parsers
XMLReader is the "new" (SAX2) interface for parsing; it replaces the SAX1 Parser interface.
Just get that from your parser:
XMLReader xmlReader = parser.getXMLReader();
Here you might want to set an error handler:
xmlReader.setErrorHandler(new DefaultHandler() {
    public void error(SAXParseException ex) throws SAXException {
        System.err.println(ex.getMessage());
    }
    public void fatalError(SAXParseException ex) throws SAXException {
        System.err.println(ex.getMessage());
    }
    public void warning(SAXParseException ex) throws SAXException {
        System.err.println(ex.getMessage());
    }
});
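A variation on the handler above collects the problems instead of printing them, which is handy when the caller decides what to do with them. This is a sketch only; the class and method names are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class CollectingErrors {
    // Parses the text and returns one line per reported problem.
    static List<String> problems(String xml) throws Exception {
        final List<String> found = new ArrayList<String>();
        SAXParserFactory pfactory = SAXParserFactory.newInstance();
        pfactory.setNamespaceAware(true);
        XMLReader xmlReader = pfactory.newSAXParser().getXMLReader();
        xmlReader.setErrorHandler(new DefaultHandler() {
            @Override
            public void warning(SAXParseException ex) {
                found.add(describe("warning", ex));
            }
            @Override
            public void error(SAXParseException ex) {
                found.add(describe("error", ex));
            }
            @Override
            public void fatalError(SAXParseException ex) {
                found.add(describe("fatal", ex));
            }
        });
        try {
            xmlReader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException stop) {
            // The parser may still abort after a fatal error;
            // the problem has already been recorded by the handler.
        }
        return found;
    }

    static String describe(String level, SAXParseException ex) {
        return level + " at line " + ex.getLineNumber() + ": " + ex.getMessage();
    }
}
```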
#7
Adding Catalogs
Xerces has an XML Catalog implementation in the class:
org.apache.xerces.util.XMLCatalogResolver
You just create the class and set the list of catalog files:
String catalogURI = "file:"+(new File(catalogFile)).getAbsolutePath();
String [] catalogs = { catalogURI };
// Create catalog resolver and set a catalog list.
XMLCatalogResolver resolver = new XMLCatalogResolver();
resolver.setPreferPublic(true);
resolver.setCatalogList(catalogs);
Then set the appropriate property:
// Set the resolver on the parser.
xmlReader.setProperty(
    "http://apache.org/xml/properties/internal/entity-resolver",
    resolver);
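If the Xerces catalog resolver is not on your classpath, a plain SAX EntityResolver can redirect a handful of entities to local copies. This is not the XML Catalog mechanism, only a simpler stand-in; the DTD URL, the DTD text, and the class name below are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class LocalDtdResolver {
    // Hypothetical remote DTD location and the local text that replaces it.
    static final String REMOTE_DTD = "http://example.invalid/note.dtd";
    static final String LOCAL_DTD = "<!ELEMENT note (#PCDATA)>";

    // Parses the document and returns the system IDs the parser asked for.
    static List<String> parseAndListLookups(String xml) throws Exception {
        final List<String> lookups = new ArrayList<String>();
        XMLReader xmlReader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xmlReader.setEntityResolver(new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId) {
                lookups.add(systemId);
                if (REMOTE_DTD.equals(systemId)) {
                    // Serve the local copy instead of fetching over the network.
                    InputSource local = new InputSource(new StringReader(LOCAL_DTD));
                    local.setSystemId(systemId);
                    return local;
                }
                return null; // fall back to the parser's default behaviour
            }
        });
        xmlReader.parse(new InputSource(new StringReader(xml)));
        return lookups;
    }
}
```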
#8
Parsing a Document
Just call parse:
xmlReader.parse("foo.xml");
You can also use open streams:
InputStream is = ...;
InputSource s1 = new InputSource(is);
xmlReader.parse(s1);
Reader r = ...;
InputSource s2 = new InputSource(r);
xmlReader.parse(s2);
InputSource lets you specify a base URI, encoding, etc.
InputSource s = ...;
s.setSystemId("http://cde.berkeley.edu/bob.xml");
s.setEncoding("UTF-8");
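The encoding setting matters when the bytes carry no XML declaration. A small sketch, assuming the parser honours the encoding given on a byte-stream InputSource (the JDK parser is expected to); the class name is made up:

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class InputSourceEncoding {
    // Parses raw bytes using the given encoding and returns all character data.
    static String textOf(byte[] document, String encoding) throws Exception {
        final StringBuilder text = new StringBuilder();
        XMLReader xmlReader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xmlReader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        InputSource source = new InputSource(new ByteArrayInputStream(document));
        source.setEncoding(encoding); // tells the parser how to decode the bytes
        xmlReader.parse(source);
        return text.toString();
    }
}
```

For instance, ISO-8859-1 bytes containing an accented letter can be parsed by passing "ISO-8859-1" as the encoding; without it the parser would assume UTF-8.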
#9
Handling Content
The ContentHandler interface lets you receive the XML content.
For example:
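class ListElements extends DefaultHandler {
    // Names of all elements seen; namespaced names are stored as {uri}local.
    Map elements = new HashMap();
    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) {
        // SAX reports "no namespace" as an empty string, so test for that as well as null.
        if (namespaceURI==null || namespaceURI.length()==0) {
            elements.put(localName,Boolean.TRUE);
        } else {
            elements.put('{'+namespaceURI+'}'+localName,Boolean.TRUE);
        }
    }
}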
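To try a handler like this end to end, it has to be attached to a namespace-aware reader. A self-contained sketch along the same lines, parsing from a string; the class and method names are invented for illustration:

```java
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class ElementNames {
    // Returns the distinct element names in the document, sorted,
    // with namespaced names written as {uri}local.
    static Set<String> collect(String xml) throws Exception {
        final Set<String> names = new TreeSet<String>();
        SAXParserFactory pfactory = SAXParserFactory.newInstance();
        pfactory.setNamespaceAware(true);
        XMLReader xmlReader = pfactory.newSAXParser().getXMLReader();
        xmlReader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(uri.isEmpty() ? localName : "{" + uri + "}" + localName);
            }
        });
        xmlReader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect("<a xmlns:p='urn:x'><p:b/><c/></a>"));
    }
}
```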
#10
Transforming with XSLT
The procedure is:
Create a TransformerFactory instance.
Load your stylesheet into a Transformer instance.
Transform your source to your output using the Transformer instance.
#11
XSLT in Three Steps!
Create the TransformerFactory:
TransformerFactory tfactory = TransformerFactory.newInstance();
Load the transformation:
Transformer xform = tfactory.newTransformer(new StreamSource("convert.xsl"));
Transform the document:
xform.transform(new StreamSource("in.xml"),new StreamResult("out.xml"));
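The same three steps also work entirely in memory, which is convenient for experiments. In this sketch the stylesheet, the input document, and the class name are made up for illustration:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class InMemoryTransform {
    // Hypothetical stylesheet: emits a plain-text greeting.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>Hello, "
      + "<xsl:value-of select='greeting/@name'/></xsl:template>"
      + "</xsl:stylesheet>";

    static String apply(String stylesheet, String xml) throws Exception {
        // Step 1: create the factory.
        TransformerFactory tfactory = TransformerFactory.newInstance();
        // Step 2: load the stylesheet.
        Transformer xform =
            tfactory.newTransformer(new StreamSource(new StringReader(stylesheet)));
        // Step 3: transform the document.
        StringWriter out = new StringWriter();
        xform.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(apply(STYLESHEET, "<greeting name='World'/>"));
    }
}
```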
#12
Chaining Transforms
You can use DOM to chain transforms in memory.
Example:
// Set up the transformations
TransformerFactory tfactory = TransformerFactory.newInstance();
Transformer step1xform = tfactory.newTransformer(new StreamSource("step1.xsl"));
Transformer step2xform = tfactory.newTransformer(new StreamSource("step2.xsl"));
// Create a DOMResult to hold the intermediate XML document
DOMResult between = new DOMResult();
// Transform the input
step1xform.transform(new StreamSource("in.xml"),between);
// Transform the output of step 1
step2xform.transform(new DOMSource(between.getNode()),new StreamResult("out.xml"));
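A runnable variation of the chain that needs no files. The two stylesheets and the class name are invented for illustration: step 1 renames the root element and step 2 turns the result into text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ChainedTransforms {
    static final String XSL_OPEN =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";

    // Hypothetical step 1: turn <a>text</a> into <b>text</b>.
    static final String STEP1 = XSL_OPEN
      + "<xsl:template match='/a'><b><xsl:value-of select='.'/></b></xsl:template>"
      + "</xsl:stylesheet>";

    // Hypothetical step 2: print the content of <b> in square brackets.
    static final String STEP2 = XSL_OPEN
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/b'>[<xsl:value-of select='.'/>]</xsl:template>"
      + "</xsl:stylesheet>";

    static String run(String xml) throws Exception {
        TransformerFactory tfactory = TransformerFactory.newInstance();
        Transformer step1 = tfactory.newTransformer(new StreamSource(new StringReader(STEP1)));
        Transformer step2 = tfactory.newTransformer(new StreamSource(new StringReader(STEP2)));

        // The DOMResult holds the intermediate document in memory.
        DOMResult between = new DOMResult();
        step1.transform(new StreamSource(new StringReader(xml)), between);

        StringWriter out = new StringWriter();
        step2.transform(new DOMSource(between.getNode()), new StreamResult(out));
        return out.toString();
    }
}
```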
#13
Serialization
SAX doesn't support serialization.
DOM doesn't support serialization.
Serialization with namespaces is hard.
Complain!
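In the meantime, one workaround that stays inside JAXP is the identity transform: a Transformer created without a stylesheet copies its source to its result, so a DOMSource can be written out through a StreamResult. A sketch only, with made-up class and method names:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class IdentitySerializer {
    // Writes the DOM tree out as text using a stylesheet-less Transformer.
    static String serialize(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Helper for the demo: builds a namespace-aware DOM from a string.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dfactory = DocumentBuilderFactory.newInstance();
        dfactory.setNamespaceAware(true);
        return dfactory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(parse("<p:a xmlns:p='urn:x'>hi</p:a>")));
    }
}
```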
#14
Handling Unicode
You must use a java.io.Reader-derived class to handle Unicode.
Unicode characters are stored according to encodings:
UTF-8, UTF-16, etc.
US-ASCII
ISO646-JP (Japanese)
Example:
InputStream is = ...;
Reader r = new InputStreamReader(is,"UTF-8");
// or look up the charset
Charset cs = Charset.forName("UTF-8");
Reader altr = new InputStreamReader(is,cs);
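Why the encoding name matters: the same bytes decode to different characters under different encodings. A small sketch with a made-up helper class that reads a byte array through an InputStreamReader:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

class DecodeBytes {
    // Decodes the bytes with the named charset and returns the characters.
    static String decode(byte[] bytes, String charsetName) throws Exception {
        InputStream is = new ByteArrayInputStream(bytes);
        Reader r = new InputStreamReader(is, Charset.forName(charsetName));
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[256];
        for (int n = r.read(buffer); n != -1; n = r.read(buffer)) {
            text.append(buffer, 0, n);
        }
        return text.toString();
    }
}
```

An accented letter stored as two UTF-8 bytes comes back as one character when decoded as UTF-8, but as two unrelated characters when decoded as ISO-8859-1.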
#15
Interacting with HTTP - Step 1
Open the connection:
URL url = new URL("http://localhost:8080/webservice/event.service");
// Opens a connection
URLConnection uconnection = url.openConnection();
// Set the properties
HttpURLConnection connection = (HttpURLConnection)uconnection;
connection.setDoOutput(true);
connection.setRequestMethod("POST");
connection.setRequestProperty("Content-Type",contentType);
connection.setRequestProperty("Content-Encoding","UTF-8");
// Set up the output for serialization
OutputStream os = connection.getOutputStream();
OutputStreamWriter o = new OutputStreamWriter(os,"UTF-8");
Now you write your XML to the writer.
#16
Interacting with HTTP - Step 2
Check the response:
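// Check the status first: getInputStream() throws for error responses.
if (connection.getResponseCode()==HttpURLConnection.HTTP_OK) {
    String contentType = connection.getContentType();
    // The header may be missing or carry parameters, e.g. "text/xml; charset=UTF-8".
    if (contentType!=null && contentType.startsWith("text/xml")) {
        InputStream is = connection.getInputStream();
        // Now you can parse your XML response!
    }
}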
#17
Interacting with HTTP - Step 3
Parse the response:
InputStream is = ...;
String encoding = connection.getContentEncoding();
Reader r = encoding==null ? new InputStreamReader(is) : new InputStreamReader(is,encoding);
xmlReader.parse(new InputSource(r));
connection.disconnect();
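The code above takes the encoding from the Content-Encoding header. Many servers instead declare the character encoding as the charset parameter of Content-Type (Content-Encoding is defined for codings such as gzip), so it can be worth checking there too. A sketch of a hypothetical helper for picking that header apart; the class and method names are not part of any library:

```java
import java.util.Locale;

class ContentTypes {
    // Media type without parameters, lower-cased,
    // e.g. "Text/XML; charset=utf-8" gives "text/xml".
    static String mediaType(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semi = contentType.indexOf(';');
        String type = semi < 0 ? contentType : contentType.substring(0, semi);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Value of the charset parameter, or null when the header has none.
    static String charset(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }
}
```

The result of `ContentTypes.charset(connection.getContentType())` could then be passed to the InputStreamReader when it is not null.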
#18
JAXP Examples
Parsing and transformation code examples are available in: jaxp-examples.zip
XParse.java parses and validates a document using XML Schema and catalogs.
Transform.java applies one or more transforms to a document.