Graphs, XML, and RDF
R. Alexander Milowski
School of Information Management and Systems
milowski at sims.berkeley.edu
#1
Graphs
Figure 1. Figure
#2
Edges
Edges can have labels.
An edge label can describe a relationship:
Figure 2. Figure
#3
Formalisms
A graph G is formally defined as a pair G = (V, E) where:
V is a set of vertices (or nodes).
E is a set of edges between nodes, each usually specified as a pair of vertices.
If an edge is directed, the pair is ordered: it has a start vertex and an end vertex (drawn as an arrow).
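A minimal sketch of the definition above in Python. The vertex names, the "works-in" label, and the helper is_well_formed are made up for illustration; they are not part of the slides.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    start: str       # start vertex of a directed edge
    end: str         # end vertex of a directed edge
    label: str = ""  # optional label describing the relationship


# V is the set of vertices, E the set of directed, labeled edges.
# Both are hypothetical sample data.
V = {"alex", "south-hall"}
E = {Edge("alex", "south-hall", "works-in")}


def is_well_formed(vertices, edges):
    """Check that every edge starts and ends at a vertex in the vertex set."""
    return all(e.start in vertices and e.end in vertices for e in edges)


G = (V, E)
print(is_well_formed(*G))
```

An undirected edge could be stored as a frozenset of two vertices instead of an ordered pair.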
#4
Example XML
I'll use this for the examples:
<person>
  <name><first>Alex</first><last>Milowski</last></name>
  <office>103 South Hall</office>
  <email>milowski@sims.berkeley.edu</email>
</person>
#5
An XML Graph
Figure 3. Figure
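One possible reading of the example as a graph, sketched in Python: each element becomes a vertex and each parent/child containment becomes a directed edge. This is an assumption about the mapping, not necessarily what the figure draws. Vertices are named by their element path, which only works here because no element has two children with the same name.

```python
import xml.etree.ElementTree as ET

XML = """<person>
  <name><first>Alex</first><last>Milowski</last></name>
  <office>103 South Hall</office>
  <email>milowski@sims.berkeley.edu</email>
</person>"""


def tree_to_graph(root):
    """Return (vertices, edges): one vertex per element and one
    directed parent-to-child edge per containment relationship."""
    vertices, edges = [], []

    def visit(elem, path):
        vertices.append(path)
        for child in elem:
            child_path = f"{path}/{child.tag}"
            edges.append((path, child_path))
            visit(child, child_path)

    visit(root, root.tag)
    return vertices, edges


vertices, edges = tree_to_graph(ET.fromstring(XML))
for start, end in edges:
    print(start, "->", end)
```

A tree with n elements always yields n - 1 containment edges, so this graph has no cycles.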
#6
An Alternate XML Graph
Figure 4. Figure
#7
Graphs and Modeling
Many XML models have conceptual models that are graphs.
You need to decide how these graphs relate to the XML.
Sometimes you just need the XML as an exchange syntax for the graph.
Either way, you are modeling a "serialization" of the graph.
#8
Graph Serialization
Cycles cause problems in serialization.
Example:
Figure 5. Figure
Does A contain B, or does B contain A?
#9
Cycles - Option 1
You can make one directed edge the "canonical relationship".
Example:
Figure 6. Figure
<A id="a">
  <B link-to="a"/>
</A>
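A sketch of how a consumer might recover both edges from this serialization: containment gives the canonical edge, and the link-to attribute gives the edge back. The function name and the "contains" label are illustrative assumptions, and a dangling link-to would raise a KeyError here.

```python
import xml.etree.ElementTree as ET

XML = '<A id="a"> <B link-to="a"/> </A>'


def edges_with_links(root):
    """Collect containment edges plus the edges implied by link-to attributes."""
    by_id = {e.get("id"): e for e in root.iter() if e.get("id")}
    edges = []
    # Containment: the "canonical relationship" carried by the tree itself.
    for parent in root.iter():
        for child in parent:
            edges.append((parent.tag, child.tag, "contains"))
    # References: the remaining edges, resolved through the id index.
    for elem in root.iter():
        target = elem.get("link-to")
        if target is not None:
            edges.append((elem.tag, by_id[target].tag, "link-to"))
    return edges


edges = edges_with_links(ET.fromstring(XML))
print(edges)
```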
#10
Cycles - Option 2
Or make all edges "first class".
Example:
Figure 7. Figure
<model>
  <A id="a" link-to="b"/>
  <B id="b" link-to="a"/>
</model>
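With every edge first class, the tree no longer says anything about the cycle, so a consumer has to find it. A rough sketch, assuming each element carries at most one link-to (as in the example); link_edges and has_cycle are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

XML = '<model> <A id="a" link-to="b"/> <B id="b" link-to="a"/> </model>'


def link_edges(root):
    """Map each element id to the id it links to (at most one link each)."""
    return {e.get("id"): e.get("link-to")
            for e in root.iter() if e.get("link-to")}


def has_cycle(links):
    """Follow link-to chains; revisiting an id on the same walk is a cycle."""
    for start in links:
        seen, node = set(), start
        while node in links:
            if node in seen:
                return True
            seen.add(node)
            node = links[node]
    return False


links = link_edges(ET.fromstring(XML))
print(links, has_cycle(links))
```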
#11
XML for Arbitrary Graphs
Sometimes you just want to model a graph in XML.
The most basic kind:
<graph>
  <vertex id="a"/>
  <vertex id="b"/>
  <edge from="a" to="b"/>
  <edge from="b" to="a"/>
</graph>
You could also "type" your edges/vertices with element names:
<graph>
  <akind id="a"/>
  <bkind id="b"/>
  <parent-to-child from="a" to="b"/>
  <child-to-parent from="b" to="a"/>
</graph>
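A small sketch of loading the basic vertex/edge form into an adjacency list. The load_graph function and its error message are assumptions for illustration. The typed variant would need a schema or naming rule to tell vertex elements from edge elements.

```python
import xml.etree.ElementTree as ET

XML = """<graph>
  <vertex id="a"/>
  <vertex id="b"/>
  <edge from="a" to="b"/>
  <edge from="b" to="a"/>
</graph>"""


def load_graph(xml_text):
    """Build {vertex id: [ids of vertices it points to]} from the XML."""
    root = ET.fromstring(xml_text)
    adjacency = {v.get("id"): [] for v in root.findall("vertex")}
    for edge in root.findall("edge"):
        start, end = edge.get("from"), edge.get("to")
        # Reject edges whose endpoints were never declared as vertices.
        if start not in adjacency or end not in adjacency:
            raise ValueError(f"edge {start}->{end} references an unknown vertex")
        adjacency[start].append(end)
    return adjacency


adjacency = load_graph(XML)
print(adjacency)
```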
#12
University Example
Figure 8. University Relationships
#13
University Example - A Graph Instance
Figure 9. Two Students & Their Classes
#14
University XML - Try 1
<university>
  <name>UC Berkeley</name>
  <department>
    <name>SIMS</name>
    <class id="290-4">
      <student><name>Jane Smith</name></student>
      <student><name>John Doe</name></student>
    </class>
    <class id="290-8">
      <student><name>Jane Smith</name></student>
    </class>
  </department>
</university>
...but 'student' gets duplicated.
#15
University XML - Try 2
Maybe 'student' should be a child of university...
<university>
  <name>UC Berkeley</name>
  <student id="s1"><name>Jane Smith</name></student>
  <student id="s2"><name>John Doe</name></student>
  <department>
    <name>SIMS</name>
    <class id="290-4">
      <student ref="s1"/>
      <student ref="s2"/>
    </class>
    <class id="290-8">
      <student ref="s2"/>
    </class>
  </department>
</university>
But departments can cross-list courses and students can be from other universities...
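The price of removing the duplication is that a consumer must resolve each ref itself. A sketch of that step for the Try 2 document; class_rosters is a hypothetical helper, and it assumes every ref matches a student id defined directly under university.

```python
import xml.etree.ElementTree as ET

XML = """<university>
  <name>UC Berkeley</name>
  <student id="s1"><name>Jane Smith</name></student>
  <student id="s2"><name>John Doe</name></student>
  <department>
    <name>SIMS</name>
    <class id="290-4">
      <student ref="s1"/>
      <student ref="s2"/>
    </class>
    <class id="290-8">
      <student ref="s2"/>
    </class>
  </department>
</university>"""


def class_rosters(xml_text):
    """Return {class id: [student names]} by resolving each student ref."""
    root = ET.fromstring(xml_text)
    # Student definitions are the direct children of <university>.
    names = {s.get("id"): s.findtext("name") for s in root.findall("student")}
    return {
        c.get("id"): [names[s.get("ref")] for s in c.findall("student")]
        for c in root.iter("class")
    }


rosters = class_rosters(XML)
print(rosters)
```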
#16
University XML - "Fully Normalized"
This might be the ultimate in flexibility:
<model>
  <university id="berkeley">
    <name>UC Berkeley</name>
    <department ref="sims"/>
  </university>
  <student id="s1"><name>Jane Smith</name></student>
  <student id="s2"><name>John Doe</name></student>
  <department id="sims">
    <name>SIMS</name>
    <class ref="290-4"/>
    <class ref="290-8"/>
  </department>
  <class id="290-4">
    <student ref="s1"/>
    <student ref="s2"/>
  </class>
  <class id="290-8">
    <student ref="s2"/>
  </class>
</model>
But it also might be a pain to process.
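To give a feel for the processing cost: answering "which students attend this university?" now takes three reference hops. The sketch below assumes the department inside university is a reference, written as ref="sims", and that every ref resolves to a top-level element; students_of is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

XML = """<model>
  <university id="berkeley">
    <name>UC Berkeley</name>
    <department ref="sims"/>
  </university>
  <student id="s1"><name>Jane Smith</name></student>
  <student id="s2"><name>John Doe</name></student>
  <department id="sims">
    <name>SIMS</name>
    <class ref="290-4"/>
    <class ref="290-8"/>
  </department>
  <class id="290-4">
    <student ref="s1"/>
    <student ref="s2"/>
  </class>
  <class id="290-8">
    <student ref="s2"/>
  </class>
</model>"""


def students_of(xml_text, university_id):
    """Walk university -> department -> class -> student, resolving refs."""
    model = ET.fromstring(xml_text)
    # Index the top-level definitions, which are the elements carrying an id.
    by_id = {e.get("id"): e for e in model if e.get("id")}
    names = set()
    for dept_ref in by_id[university_id].findall("department"):
        dept = by_id[dept_ref.get("ref")]
        for class_ref in dept.findall("class"):
            cls = by_id[class_ref.get("ref")]
            for student_ref in cls.findall("student"):
                names.add(by_id[student_ref.get("ref")].findtext("name"))
    return sorted(names)


print(students_of(XML, "berkeley"))
```

In the Try 1 layout the same question is a single descendant query, which is the trade-off the slide points at.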
#17
RDF & XML
RDF - Resource Description Framework - the "semantic web", blah, blah, blah...
The big difference:
An RDF instance is a graph.
An XML document is a tree.
RDF can be represented as XML.
XML vocabularies can include RDF constructs.
#18
An RDF Example
<c:person xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:c="http://cde.berkeley.edu/contact/#"
          rdf:about="http://cde.berkeley.edu/people#milowski">
  <c:name>Alex Milowski</c:name>
  <c:email rdf:resource="mailto:milowski@sims.berkeley.edu"/>
  <c:title>Instructor Lacky</c:title>
</c:person>
#19
The RDF Graph
Figure 10. Figure
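The graph behind the RDF example can be listed as (subject, predicate, object) triples. The sketch below is a hand-rolled reading of this one document shape using only the standard library. It is not a general RDF/XML parser, and the helper names tag_to_uri and triples are made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

XML = """<c:person xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:c="http://cde.berkeley.edu/contact/#"
          rdf:about="http://cde.berkeley.edu/people#milowski">
  <c:name>Alex Milowski</c:name>
  <c:email rdf:resource="mailto:milowski@sims.berkeley.edu"/>
  <c:title>Instructor Lacky</c:title>
</c:person>"""


def tag_to_uri(tag):
    """ElementTree writes qualified names as '{namespace}local'."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def triples(xml_text):
    """Read one typed node element with simple property children."""
    node = ET.fromstring(xml_text)
    subject = node.get(f"{{{RDF_NS}}}about")
    # The element name of the node gives its rdf:type.
    result = [(subject, RDF_NS + "type", tag_to_uri(node.tag))]
    for prop in node:
        resource = prop.get(f"{{{RDF_NS}}}resource")
        # A property either points at another resource or holds a literal.
        obj = resource if resource is not None else (prop.text or "")
        result.append((subject, tag_to_uri(prop.tag), obj))
    return result


for triple in triples(XML):
    print(triple)
```

Each triple is one labeled, directed edge: the subject and object are vertices and the predicate is the edge label.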