XML Documents, Namespace, and Unicode

R. Alexander Milowski

milowski at sims.berkeley.edu

#1

Overview

#2

History of XML

[1] http://www.oasis-open.org/cover/yuriMemColl.html

[2] http://www.w3.org/TR/WD-xml-961114.html

[3] http://www.w3.org/TR/1998/REC-xml-19980210

[4] http://www.w3.org/TR/2004/REC-xml-20040204/

[5] http://www.w3.org/TR/xml11/

#3

The W3C Process

  1. Consortium members express interest in a particular topic, either through a published note or as the outcome of a workshop.

  2. Either a Working Group is formed or the topic is assigned to an existing group.

  3. A Requirements Document is drafted and approved first by the working group and then by the consortium.

  4. All documents go through these stages: Initial Draft, Working Draft, Last Call Draft, Candidate Recommendation, Proposed Recommendation, and Recommendation.

#4

What is XML?

#5

What is a Document?

[1] Merriam-Webster Online: http://www.m-w.com/

#6

What is an XML Document?

#7

The Real Answer

Documents are instances of units of information*.

* With XML you get to define "instance", "unit", and "information".

#8

What XML Provides

#9

An Example


  <slide>
    <title>What XML Provides</title>
    <contents>
      <ul>
        <li><p>Internationalization via Unicode</p></li>
        <li><p>Validation of instances.</p></li>
        <li><p>Localization of names via namespaces 
              (e.g. My &#39;tomato&#39; isn&#39;t your &#39;tomato&#39;).</p>
        </li>
        <li><p>A &#34;human readable&#34; format.</p></li>
        <li><p>Hierarchical structure.</p></li>
        <li><p>A &#34;motif&#34; for extensibility.</p></li>
      </ul>
    </contents>
  </slide>
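
As a rough illustration of how a program might consume this markup, the sketch below reads a copy of the slide with Python's standard `xml.etree.ElementTree` module and collects the title and bullet text. The function name `slide_outline` is invented for this example. Note that the parser resolves the numeric character references (`&#39;`, `&#34;`) into ordinary quote characters.

```python
import xml.etree.ElementTree as ET

# A copy of the slide markup shown above.
SLIDE = """<slide>
  <title>What XML Provides</title>
  <contents>
    <ul>
      <li><p>Internationalization via Unicode</p></li>
      <li><p>Validation of instances.</p></li>
      <li><p>Localization of names via namespaces
            (e.g. My &#39;tomato&#39; isn&#39;t your &#39;tomato&#39;).</p>
      </li>
      <li><p>A &#34;human readable&#34; format.</p></li>
      <li><p>Hierarchical structure.</p></li>
      <li><p>A &#34;motif&#34; for extensibility.</p></li>
    </ul>
  </contents>
</slide>"""


def slide_outline(xml_text):
    """Return (title, bullets) for a slide document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title")
    # Collapse runs of whitespace so text wrapped across lines reads as one line.
    bullets = [" ".join(p.text.split()) for p in root.iter("p")]
    return title, bullets


title, bullets = slide_outline(SLIDE)
print(title)
for bullet in bullets:
    print(" -", bullet)
```

The hierarchy of the document (slide, contents, list, item, paragraph) is what lets the code find every `p` element without knowing how deeply it is nested.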

      

#10

A More Complicated Example

<c:pseudocode name="Adj" 
   xmlns:c="urn:publicid:IDN+mathdoc.org:schema:pseudocode:2004:1.0:us"
>
   <c:args><c:arg>v</c:arg><c:arg>j</c:arg></c:args>
   <c:for>
      <c:varassign> 
         <c:var>i</c:var>
         <c:constant>1</c:constant>
      </c:varassign>
      <c:to><c:constant>j</c:constant></c:to>
      <c:do>
         <c:varassign>
            <c:var>r</c:var>
            <c:func name="find-simplex">
               <c:value><c:var>v</c:var></c:value>
               <c:value><c:var>i</c:var></c:value>
            </c:func>
         </c:varassign>
      </c:do>
   </c:for>
   <c:return><c:value><c:var>r</c:var></c:value></c:return>
</c:pseudocode>

Adj(v, j):
   for i ← 1 to j
      do r ← find-simplex(v, i)
   return r

#11

Reading EBNF

[22]    prolog      ::=    XMLDecl? Misc* (doctypedecl Misc*)?
[23]    XMLDecl     ::=    '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24]    VersionInfo ::=    S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[25]    Eq          ::=    S? '=' S?
[26]    VersionNum  ::=    ([a-zA-Z0-9_.:] | '-')+
[27]    Misc        ::=    Comment | PI | S
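
One way to read these productions is to transcribe them into regular expressions. The sketch below does this in Python for productions [23] through [26]. It is a simplification under stated assumptions: `S` is taken from production [3] of the XML 1.0 specification (one or more space, tab, carriage return, or line feed characters), and the optional `EncodingDecl` and `SDDecl` parts are left out because their productions are not shown here. The helper name `parse_version` is made up for this example.

```python
import re

# [3] S: white space (assumed from the XML 1.0 specification; not shown above).
S = r"[ \t\r\n]+"
# [25] Eq ::= S? '=' S?
EQ = "(?:" + S + ")?=(?:" + S + ")?"
# [26] VersionNum ::= ([a-zA-Z0-9_.:] | '-')+
VERSION_NUM = r"[a-zA-Z0-9_.:\-]+"
# [24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
VERSION_INFO = (
    S + "version" + EQ
    + "(?:'(" + VERSION_NUM + ")'|\"(" + VERSION_NUM + ")\")"
)
# [23] XMLDecl, simplified: EncodingDecl? and SDDecl? are omitted in this sketch.
XML_DECL = r"<\?xml" + VERSION_INFO + "(?:" + S + r")?\?>"


def parse_version(decl):
    """Return the version string if decl matches the simplified XMLDecl, else None."""
    match = re.fullmatch(XML_DECL, decl)
    if match is None:
        return None
    # Exactly one of the two quoted alternatives captured the version number.
    return match.group(1) or match.group(2)


print(parse_version('<?xml version="1.0"?>'))
print(parse_version("<?xml version = '1.1' ?>"))
print(parse_version('<?xmlversion="1.0"?>'))
```

The `?` in the grammar becomes an optional group, `+` becomes one-or-more, and `|` becomes alternation, so each regular expression lines up with the production it was copied from.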

#12

Motivation for Namespaces

#13

Names in XML

#14

Local Names and Identifiers

#15

Syntax of Names

#16

QNames and Prefixes

#17

Basic Syntax - Documents & Elements

#18

The XML Declaration

#19

Basic Syntax - Attributes

#20

Those Pesky Namespaces and Prefixes

#21

Namespace Example

<doc xmlns:m="http://www.w3.org/1998/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<title>My Document</title>
<body xmlns="http://www.w3.org/1999/xhtml">
<p xlink:href="alternate.xml">I am a paragraph containing some mathematics: 
<m:math><m:mi>x</m:mi></m:math>
</p>
</body>
</doc>
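
To see how a namespace-aware parser interprets these prefixes, the following sketch loads a copy of the example with Python's standard `xml.etree.ElementTree` module, which reports each name as `{namespace-URI}local-name`. The variable names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/MathML"
XLINK = "http://www.w3.org/1999/xlink"

# A copy of the example document.
DOC = """<doc xmlns:m="http://www.w3.org/1998/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<title>My Document</title>
<body xmlns="http://www.w3.org/1999/xhtml">
<p xlink:href="alternate.xml">I am a paragraph containing some mathematics:
<m:math><m:mi>x</m:mi></m:math>
</p>
</body>
</doc>"""

root = ET.fromstring(DOC)

# Element names in document order, expanded to {URI}local form.
names = [el.tag for el in root.iter()]
for name in names:
    print(name)

# The default namespace declared on body applies to body and p, but not to
# doc or title, which sit outside its scope and have no namespace.
paragraph = root.find("{%s}body/{%s}p" % (XHTML, XHTML))
print(paragraph.attrib)

# Prefixes are only local shorthand: a lookup may bind any prefix it likes
# to the same URI and still find the element.
mi = root.find(".//math:mi", {"math": MATHML})
print(mi.text)
```

The lookup at the end uses the prefix `math` rather than `m`, which works because only the namespace URI, never the prefix, is part of an element's name.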

#22

Well-Formed Documents

#23

Well-Formed Elements

#24

Well-Formed Empty Elements

#25

Well-Formed Attributes

#26

Well-Formed Text

#27

Comments

#28

Processing Instructions

#29

CDATA Sections

#30

Characters and Unicode

#31

Unicode in XML

#32

Whitespace Handling