Plan for Today's Class

Assembling a Document Model
- Traversing the Component Model
- Assembly using Core + Contexts

Where We Are in the Methodology

We have identified primitive and aggregate components
We've used heuristic or formal means (or both)
The methods we used and the results reflect the mixture of transactional and non-transactional documents in our context
- Number of components
- Size of components
- Precision of rules for datatypes, associations, cardinality

The Component Model is a Set of Relations

Multiple Paths Through the Component Network

Representations of Normalized Model - Primary Key Path

Representations of Normalized Models - UML Class Diagram

Why We Need Hierarchical Models

A relational model simultaneously describes all of the associations among the components; put another way, it doesn't highlight any particular association
But when we exchange information, we do so to satisfy the requirements in some context
If there are multiple ways to interpret the content, we will not achieve interoperability
Hierarchies (tree structures) provide unambiguous structures
So we impose a contextual interpretation when we create a hierarchy on a relational model

Simple Example - Book / Author / Edition / Publisher

Hierarchical Interpretation of the "Book" Model [1]

Hierarchical Interpretation of the "Book" Model [2]

Hierarchical Interpretation of the "Book" Model [3]

Document Model Assembly

Document model assembly is the process of creating a model of a document type – hierarchical and nested – by drawing on the "pool" or library of content and structural components
Assembly involves designing (or selecting a pattern for) the top-level structure as an entry point, then navigating through the relationships in the conceptual model and collecting the components in the order that best satisfies your requirements
Assembly order can differ whenever there is a bi-directional relationship between components – whenever two components are functionally independent, the assembly order picks one direction of the relationship, which enforces an interpretation on the assembled document
The direction in which the relationship is followed determines which of the structural roles is being used
We end up with a specific context-sensitive view of the model
This is the logical basis of the document schema – all we have left to do is to encode it as an XML schema
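The traversal described above can be sketched in Python. This is only an illustration under stated assumptions: the `MODEL` dictionary, its field names, and the `assemble` helper are hypothetical and loosely based on the Book / Author / Publisher example, not taken from the course materials. Starting from a different entry point yields a different nesting of the same components.

```python
# Hypothetical component model: each component lists its primitive fields and
# the associations (component -> related components) that could be followed.
MODEL = {
    "Book":      {"fields": ["Title", "ISBN"], "assoc": ["Author", "Publisher"]},
    "Author":    {"fields": ["Name"],          "assoc": ["Book"]},
    "Publisher": {"fields": ["Name", "City"],  "assoc": ["Book"]},
}

def assemble(entry, path=()):
    """Follow associations from `entry`, nesting each component reached.

    A component already on the current path is not re-entered, so a
    bi-directional association is followed in one direction only.
    """
    path = path + (entry,)
    node = {"component": entry, "fields": list(MODEL[entry]["fields"]), "children": []}
    for target in MODEL[entry]["assoc"]:
        if target not in path:
            node["children"].append(assemble(target, path))
    return node

# Two entry points, two context-specific views of the same model.
book_view = assemble("Book")      # Book contains Author and Publisher
author_view = assemble("Author")  # Author contains Book, which contains Publisher
```

In this sketch the entry point alone decides the direction in which the Book / Author association is followed; a fuller version would also let the context choose which associations to follow at all.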

Document Model Assembly and the Document Type Spectrum

The basic problem of assembly is the same for all types of documents, but the solution is different at different points on the spectrum
Non-transactional / narrative / publication type documents usually have fewer content-based rules, but their assembly is often shaped by structural or presentation rules

Document Model Assembly - Transactional Document Types

Since transactional documents and data-intensive contexts tend to have more rules, their component models are more complex and there are more alternative document assembly models
These alternative assembly models may differ in which information from the instance they present (they may be queries or views of the instance rather than a one-to-one rendering) and in the order or structure with which they present it
If the sequence is important, it should be a component in the model and assembled in our logical documents (e.g., SequenceNumber in our Lecture Notes example)
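As a small illustration of the point about sequence, the following Python sketch orders components by an explicit SequenceNumber rather than by the order in which they happen to be stored. The sample records and the `in_sequence` helper are assumptions made up for this example; only the name SequenceNumber comes from the Lecture Notes example.

```python
# Hypothetical lecture-note components; SequenceNumber is an explicit component,
# so the intended order survives any reshuffling of the stored records.
notes = [
    {"SequenceNumber": 2, "Title": "Rules of Assembly"},
    {"SequenceNumber": 1, "Title": "Plan for Today's Class"},
    {"SequenceNumber": 3, "Title": "Assembly Order and Containership"},
]

def in_sequence(components):
    """Order components by their explicit SequenceNumber."""
    return sorted(components, key=lambda c: c["SequenceNumber"])

titles = [n["Title"] for n in in_sequence(notes)]
```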

The Rules of Assembly

The rules represented in the component model must be followed during any document model assembly:

Mandatory associations must be followed
Mandatory components must be included
Optional associations are followed if they meet the requirements for the context
Optional components are included if they meet the requirements for the context
Even if one role is the usual or canonical interpretation, it may not be a requirement for the context of this specific document assembly
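The rules above can be read as a simple filter over the associations of a component. The sketch below is a minimal Python illustration; the `Association` class, the order-like component names, and the `select_associations` helper are all hypothetical and chosen only to make the rules concrete.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Association:
    target: str
    mandatory: bool

# Hypothetical component model for an order-like document.
ASSOCIATIONS = {
    "Order": [
        Association("Buyer", mandatory=True),
        Association("LineItem", mandatory=True),
        Association("DeliveryNote", mandatory=False),
        Association("Promotion", mandatory=False),
    ],
}

def select_associations(component, context_requirements):
    """Return the targets to follow from `component` in this context.

    Mandatory associations are always followed; optional ones are followed
    only when the context asks for them.
    """
    return [
        a.target
        for a in ASSOCIATIONS.get(component, [])
        if a.mandatory or a.target in context_requirements
    ]

warehouse_view = select_associations("Order", {"DeliveryNote"})
minimal_view = select_associations("Order", set())
```

Here the context is modeled as a plain set of required targets, which is an assumption of the sketch; real context requirements are usually richer than a list of names.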

Assembly Order and Containership

The structural depth of the document model is determined by how many associations in the component model are followed
The order in which associations are followed determines the nesting or container structure in the model
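To make these two statements concrete, here is a hedged Python sketch that uses made-up event records in the spirit of the event calendar models discussed later in this lecture. The records, field names, and the helpers `nest_by` and `depth` are assumptions for illustration. The order of keys decides what contains what, and the number of keys followed decides how deep the result is.

```python
# Hypothetical event records; the field names are illustrative only.
EVENTS = [
    {"date": "2006-04-19", "location": "Room 202", "event": "Lecture"},
    {"date": "2006-04-19", "location": "Room 110", "event": "Lab"},
    {"date": "2006-04-24", "location": "Room 202", "event": "Lecture"},
]

def nest_by(records, order):
    """Nest records by following the keys in `order`, outermost first."""
    if len(order) == 1:
        return [r[order[0]] for r in records]
    head, rest = order[0], order[1:]
    groups = {}
    for r in records:
        groups.setdefault(r[head], []).append(r)
    return {key: nest_by(group, rest) for key, group in groups.items()}

def depth(tree):
    """Count container levels above the leaf values."""
    if isinstance(tree, dict):
        return 1 + max(depth(child) for child in tree.values())
    return 0

time_based = nest_by(EVENTS, ["date", "location", "event"])      # dates contain locations
location_based = nest_by(EVENTS, ["location", "date", "event"])  # locations contain dates
shallow = nest_by(EVENTS, ["date", "event"])                     # one association fewer
```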

Document Model Assembly - Non-transactional Document Types

Requirements for structural or presentation integrity may be more important than content constraints
There are conventional assembly patterns for many types of documents (perhaps these can be viewed as default requirements)
(Maler and El Andaloussi call this the "shape" of the document type)
Some document types seem naturally "flat" – just two-level-deep "list of things" documents
Sometimes documents can be arbitrarily deep, with chapter, section, subsection, etc. divisions, but from a component type perspective this is a simple recursive structure with few or no content distinctions
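The "simple recursive structure" can be sketched as a single component type that contains instances of itself. The Python below is illustrative only; the `Division` class and its field names are hypothetical, not a schema from the course.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Division:
    """One recursive component type standing in for chapter, section, subsection, ..."""
    title: str
    paragraphs: List[str] = field(default_factory=list)
    children: List["Division"] = field(default_factory=list)

def max_depth(division):
    """Depth of nesting: 1 for a division with no sub-divisions."""
    return 1 + max((max_depth(c) for c in division.children), default=0)

doc = Division("Chapter", children=[
    Division("Section", children=[Division("Subsection", paragraphs=["text"])]),
    Division("Section"),
])
```

However deep an instance goes, the model itself has only one component type, which is why depth alone adds few or no content distinctions.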

A Common Document Assembly Pattern

Structural Integrity Requirements

Structural integrity – requirement to preserve some aspects of structure:

Identical page boundaries for the electronic and printed versions of documents
Chronological order for a narrative biography or history
"Putting it together" instructions (don't want to say "assembly" here) for a bicycle or piece of furniture need to follow the order in which they are most easily or safely put together.

Presentation Integrity Requirements

Presentation fidelity – preserve aspects of original presentation:

This is an extreme requirement, but in some circumstances the law mandates reproducing a document artifact exactly as it appeared in its original printed format. For example, with International Letters of Credit and Bills of Lading you can readily imagine a bank or customs inspector carefully comparing computer-generated and original printed documents.
This is a requirement to assemble the document model in "document order" – that is, to organize the elements so that their valid order matches the order in which they should appear in a document instance

Event Calendars: Assembly

Time-based Calendar Model - Assembly Path

Time-based Calendar Model

Location-based Calendar Model - Assembly Path

Location-based Calendar Model

Event-based Calendar Model - Assembly Path

Event-based Calendar Model

How Many Document Models to Assemble? [1]

In many domains, because of the rich network of associations, you can assemble a large number of different document models from the same component model
Determining how many document types to assemble is a design problem in its own right
During a document analysis phase you will create an inventory of existing document types, but this isn't necessarily the set of logical document types you'll end up with after you design
There may be several types of documents that you want to treat as equivalent by assembling a single more general document model, or you could assemble several separate models

How Many Document Models to Assemble? [2]

This decision has consequences for the implementation model (the DTD or schema):
- The software tools you can use for creating and manipulating documents
- The authoring or document creation process
- The amount of training likely to be required
- The flexibility of your system or applications
- The amount of validation that is possible
- The amount of integration or transformation required on a one-time or recurring basis

Motivating "Core and Contexts" Modeling and Assembly

The "component model traversal" is a rigorous approach for assembling document models
It is especially appropriate when you've been able to fully or mostly normalize the component model (because there are many constraints or rules about the components and their relationships)
But the normalized model of a complex domain may be very granular with many small groups of components
So we've developed a complementary conception of document model assembly that can improve the manageability and reuse of the model components – core and contexts
The basic idea is that we're identifying components and assembling them from "the document down" rather than from "the components up"

The Customization / Contextualization Challenge

Even a little bit of analysis suggests that there are some components that are useful in lots of different situations or applications or documents ["person," "address," "line item," "event," ...]
As a result, many people and organizations have created models for these standard components or documents
But often each of these components takes on some additional information or structure in each of the different customization contexts
So any set of component types large enough to satisfy all of these contexts will contain many components that aren't needed in most of those contexts
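One way to picture this problem, and the core-and-contexts response to it, is the Python sketch below. Everything in it is an assumption made for illustration: the "Address" core, the three contexts, their field names, and the helper functions are hypothetical. It contrasts a single model that covers every context with a small core that is extended per context.

```python
# Hypothetical core component and context-specific additions for "Address".
CORE_ADDRESS = ["Street", "City", "PostalCode", "Country"]

CONTEXT_ADDITIONS = {
    "shipping": ["DeliveryInstructions", "LoadingDock"],
    "billing": ["TaxJurisdiction"],
    "event": ["RoomNumber", "MapURL"],
}

def union_model():
    """One-size-fits-all model: the core plus every context's additions."""
    fields = list(CORE_ADDRESS)
    for extra in CONTEXT_ADDITIONS.values():
        fields.extend(extra)
    return fields

def contextualize(context):
    """Core and contexts: the core plus only this context's additions."""
    return CORE_ADDRESS + CONTEXT_ADDITIONS[context]

def unused_in(context):
    """Fields the union model carries that this context never needs."""
    needed = set(contextualize(context))
    return [f for f in union_model() if f not in needed]
```

For example, `unused_in("billing")` lists the four shipping and event fields that a billing document would carry for no reason under the union model, while `contextualize("billing")` returns only the core plus `TaxJurisdiction`.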

A Standard Set of Components in a Domain

Subtractive Refinement of the Domain Model

Interoperability Challenges with Subtractive Refinement [1]

Interoperability Challenges with Subtractive Refinement [2]

Start with a Core Component

Additive Customization of the Core Component

Additive Refinement – Benefits and Limitations

Core and Context Components

Assembly with Core and Contexts

Reuse with Core and Contexts

Modeling with Contexts in the Course "ecosystem" [1]

Modeling with Contexts in the Course "ecosystem" [2]

Modeling with Contexts for Event Calendars

Limitations of "Core and Contexts" Modeling

It seems like a good idea to create the smallest possible core components and leave room for them to be customized or contextualized by additional components
But the set of contexts that emerge is strongly shaped by the document types you are expecting to assemble, and some people object in principle to modeling shaped by implementation considerations
Furthermore, the criteria or heuristics used to decide what "goes together" are informal and don't yield consistent results
But it isn't a question of "either/or" between the traversal and the core and contexts approaches. Think of them as influences or philosophies for document assembly that you need to balance. Thinking of modeling and document assembly in different ways can result in a deeper understanding of why and how you arrived at a particular document model

23. Document Model Assembly [1]

DE + IA (IS 243) - 19 April 2006

For 24 April