Getting Started with XML Schema
R. Alexander Milowski
milowski@sims.berkeley.edu
School of Information Management and Systems
#1
Tools
Oxygen 3.1 provides validation (red check mark button).
My JAXP wrapper around Xerces 2.6.2 is available at http://www.milowski.com/jaxp
XSV online: http://www.w3.org/2001/03/webdata/xsv
#2
JAXP/Xerces
It's a command-line tool.
-v turns on validation.
"-c catalog-file" specifies a catalog for schema namespaces.
Example:
xerces -v foo.xml
#3
Checking Schemas
Schemas are validated in two ways:
As an XML document, against the schema for XML Schema documents.
As an XML Schema, against the additional rules of the XML Schema specification.
Most tools will do both when you validate your schema.
Your schema must pass both checks before it can be used to validate an instance.
#4
Authoring Schemas
Three choices:
Oxygen - can edit a schema as XML and check it as a schema.
Netbeans - can edit a schema as XML and check it as a schema.
http://www.netbeans.org - (free)
XML Mind - Authors and checks schemas at a "higher" level.
http://www.xmlmind.com - the standard edition (free)
#5
Validating Instances
An instance has a namespace (or no namespace) associated with each of its elements.
The validator needs to find element declarations in one of two ways:
Your instance explicitly points to a schema for a specific namespace via xsi:schemaLocation.
There is a look-up table (e.g. a catalog) external to the document which tells the validator where to find the schemas.
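Before a validator can look anything up, it has to know which namespaces the instance's elements use. The following is a minimal Python sketch of that first step, using only the standard library. The sample document and its child elements (`title`, `note`) are made up for illustration, and the function name `element_namespaces` is hypothetical.

```python
import xml.etree.ElementTree as ET


def element_namespaces(xml_text):
    """Return the set of namespace names used by elements.

    The empty string stands for "no namespace".
    """
    root = ET.fromstring(xml_text)
    found = set()
    for elem in root.iter():
        tag = elem.tag
        # ElementTree writes qualified names as "{namespace}local-name".
        if tag.startswith("{"):
            found.add(tag[1:].split("}", 1)[0])
        else:
            found.add("")
    return found


# Hypothetical instance: one namespaced root, one child with no namespace.
SAMPLE = """<e:event xmlns:e="http://www.milowski.com/schemas/example-form/event/200402">
  <e:title>Example</e:title>
  <note>no namespace here</note>
</e:event>"""

for ns in sorted(element_namespaces(SAMPLE)):
    print(ns or "(no namespace)")
```

Each namespace name in the result is something the validator must resolve to a schema, either from hints in the instance or from a catalog.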
#6
Catalogs
A catalog maps namespaces to schema documents.
It's an XML document in the namespace: urn:oasis:names:tc:entity:xmlns:xml:catalog
The specification is available on the OASIS website.
There are two things to be concerned with:
Mapping URI values that start with "urn:publicid:..."
Everything else.
Example:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer='public'>
  <uri name="http://cde.berkeley.edu/~milowski/schemas/example-form/event/200402"
       uri="event.xsd"/>
  <public publicId="IDN cde.berkeley.edu//milowski//schemas//example-form//event//200402"
          uri="event.xsd"/>
</catalog>
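To make the mapping concrete, here is a rough Python sketch that reads a catalog like the one above into two look-up tables, one for `uri` entries and one for `public` entries. It is not a full catalog resolver: it ignores `prefer`, `xml:base`, and delegation. The base URL `http://example.org/schemas/catalog.xml` and the name `load_catalog` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

CATALOG_XML = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
  <uri name="http://cde.berkeley.edu/~milowski/schemas/example-form/event/200402"
       uri="event.xsd"/>
  <public publicId="IDN cde.berkeley.edu//milowski//schemas//example-form//event//200402"
          uri="event.xsd"/>
</catalog>"""


def load_catalog(xml_text, base):
    """Return (uri_table, public_table) built from a catalog document.

    Relative 'uri' attribute values are resolved against 'base',
    the location of the catalog file itself.
    """
    root = ET.fromstring(xml_text)
    uri_table = {}
    public_table = {}
    for entry in root.iter("{%s}uri" % CATALOG_NS):
        uri_table[entry.get("name")] = urljoin(base, entry.get("uri"))
    for entry in root.iter("{%s}public" % CATALOG_NS):
        public_table[entry.get("publicId")] = urljoin(base, entry.get("uri"))
    return uri_table, public_table


# Hypothetical catalog location, used only to resolve "event.xsd".
uris, publics = load_catalog(CATALOG_XML, "http://example.org/schemas/catalog.xml")
print(uris)
print(publics)
```

Both entries end up pointing at the same schema document, which is the point of the example: the namespace can be named either way and still be found.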
#7
Public Identifiers and URNs
Any URN that starts with "urn:publicid:..." is a public identifier.
It's a formal way of naming a resource (e.g. a schema).
The specification at the OASIS site will tell you more.
The URN gets mapped to a public identifier string:
URN Value: urn:publicid:IDN+cde.berkeley.edu:milowski:schemas:example-form:event:200402
Public Identifier: IDN cde.berkeley.edu//milowski//schemas//example-form//event//200402
So you just use the 'public' element in the catalog.
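The URN-to-public-identifier mapping shown on this slide can be sketched in a few lines of Python. The substitution table below follows the "unwrapping" rules as I understand them from RFC 3151 and the OASIS catalog specification ("+" becomes a space, ":" becomes "//", ";" becomes "::", plus a handful of percent escapes). Treat it as an illustration rather than a reference implementation; the function name is hypothetical.

```python
import re

PREFIX = "urn:publicid:"

# Unwrapping table: URN text on the left, public identifier text on the right.
UNWRAP = {
    "+": " ", ":": "//", ";": "::",
    "%2B": "+", "%3A": ":", "%2F": "/", "%3B": ";",
    "%27": "'", "%3F": "?", "%23": "#", "%25": "%",
}

# Match a percent escape or one of the three special characters.
TOKEN = re.compile(r"%[0-9A-Fa-f]{2}|[+:;]")


def unwrap_publicid_urn(urn):
    """Convert a 'urn:publicid:' URN into its public identifier string."""
    if not urn.lower().startswith(PREFIX):
        raise ValueError("not a publicid URN: %r" % urn)
    body = urn[len(PREFIX):]

    def replace(match):
        token = match.group(0)
        # Escapes outside the table are left unchanged.
        return UNWRAP.get(token.upper(), token)

    # A single pass avoids re-processing characters produced by a substitution.
    return TOKEN.sub(replace, body)


urn = "urn:publicid:IDN+cde.berkeley.edu:milowski:schemas:example-form:event:200402"
print(unwrap_publicid_urn(urn))
# prints: IDN cde.berkeley.edu//milowski//schemas//example-form//event//200402
```

The resulting string is what goes in the `publicId` attribute of the catalog's `public` element.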
#8
Example via xsi:schemaLocation
There's a namespace for schema "stuff" in instances:
http://www.w3.org/2001/XMLSchema-instance
The prefix 'xsi' is typically used.
The attribute 'schemaLocation' holds whitespace-separated pairs: a namespace name followed by the URL where its schema can be found.
Example:
<e:event xmlns:e="urn:publicid:IDN+cde.berkeley.edu:milowski:schemas:example-form:event:200402"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:publicid:IDN+cde.berkeley.edu:milowski:schemas:example-form:event:200402 event.xsd">
  ...
</e:event>
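As a rough illustration of how a tool might read these hints, the Python sketch below collects every xsi:schemaLocation attribute in an instance and splits its value into (namespace, location) pairs. The function name `schema_locations` is hypothetical, and keeping the first hint seen for a namespace is a simplification chosen for the example, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

INSTANCE = """<e:event
    xmlns:e="urn:publicid:IDN+cde.berkeley.edu:milowski:schemas:example-form:event:200402"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:publicid:IDN+cde.berkeley.edu:milowski:schemas:example-form:event:200402 event.xsd"/>"""


def schema_locations(xml_text):
    """Return {namespace name: schema location} from xsi:schemaLocation hints."""
    root = ET.fromstring(xml_text)
    hints = {}
    for elem in root.iter():
        value = elem.get("{%s}schemaLocation" % XSI_NS)
        if not value:
            continue
        tokens = value.split()
        if len(tokens) % 2 != 0:
            raise ValueError("schemaLocation must contain pairs: %r" % value)
        # Even positions are namespace names, odd positions are locations.
        for namespace, location in zip(tokens[0::2], tokens[1::2]):
            hints.setdefault(namespace, location)  # keep the first hint seen
    return hints


print(schema_locations(INSTANCE))
```

The location (`event.xsd` here) is a relative URL, so a real tool would resolve it against the location of the instance document.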
#9
Example via Catalog
With a catalog, the instance does not need to carry any schema location information.
Example:
<e:event xmlns:e="urn:publicid:IDN+cde.berkeley.edu:milowski:schemas:example-form:event:200402">
  ...
</e:event>
and the catalog has the mapping information:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer='public'>
  <public publicId="IDN cde.berkeley.edu//milowski//schemas//example-form//event//200402"
          uri="event.xsd"/>
</catalog>
#10
Example via Catalog w/o Public Identifiers
As before, the instance carries no schema location information.
Example:
<e:event xmlns:e="http://www.milowski.com/schemas/example-form/event/200402">
  ...
</e:event>
and the catalog has the mapping information:
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer='public'>
  <uri name="http://www.milowski.com/schemas/example-form/event/200402"
       uri="event.xsd"/>
</catalog>
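Putting the instance and the catalog from this slide together, the look-up a validator performs might be sketched as follows in Python. This handles only `uri` entries and only the root element's namespace. The function names are hypothetical, and the instance's "..." content is simply left empty.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

INSTANCE = '<e:event xmlns:e="http://www.milowski.com/schemas/example-form/event/200402"/>'

CATALOG = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" prefer="public">
  <uri name="http://www.milowski.com/schemas/example-form/event/200402"
       uri="event.xsd"/>
</catalog>"""


def root_namespace(xml_text):
    """Return the namespace name of the root element ('' if it has none)."""
    tag = ET.fromstring(xml_text).tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def find_schema(instance_text, catalog_text):
    """Look up the instance's root namespace among the catalog's uri entries."""
    namespace = root_namespace(instance_text)
    catalog = ET.fromstring(catalog_text)
    for entry in catalog.iter("{%s}uri" % CATALOG_NS):
        if entry.get("name") == namespace:
            return entry.get("uri")
    return None  # no mapping: the validator cannot locate a schema


print(find_schema(INSTANCE, CATALOG))  # prints: event.xsd
```

Nothing in the instance mentions `event.xsd`; the mapping lives entirely in the catalog, so it can be changed without touching the documents being validated.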
#11
Other Catalog Facilities
Catalogs can also map URI values to URI values.
If you receive a document with a location that you can't process, you can map it.
<uri name="bad-value" uri="good-value"/>
You can also delegate to other catalogs to create search paths and overrides.
Catalogs are a necessary part of XML-based interchange.