13. Models of Business Information
DE + IA (INFO 243) - 28 February 2007
Bob Glushko
Plan for Today's Class
- "Document Automation" or STP Pattern
- XML Vocabularies
- XMLification and EDI {and,or,vs} XML
- Vertical vs Horizontal Vocabularies
- Hub Languages
- OASIS, Cover Pages, and other resources
"Document Automation" or STP Pattern
- Many business processes can be described as "moving information around"
- At each step information might be added to the input document or a new document might be created that contains most of the input document's content
- However, even though the end-to-end process might span multiple departments (or companies), the business applications (run by separate departments) may not have been designed to share information with each other
- Clerical functions can usually be totally automated
- Processes carried out by knowledge workers can often be partially automated
Typical Characteristics of Document Automation Efforts
- Create documents with templates or via guided assembly (aka "wizards")
- Minimize manual intervention via rule-based routing, access control, exception handling
- Concurrent process re-engineering
- Documents are regenerated when source information changes
- End-to-end perspective to maximize content reuse
- Standard content components and processes
XMLification
- In the 10 years since XML was created, many standards efforts have arisen to migrate document and process standards from EDI syntaxes to XML
- EDI is still widely used to automate routine transactions between established trading partners, especially in direct procurement and in the demand chains of the consumer packaged goods (CPG) sector
- So EDI isn't dead, but XMLification is inevitable because EDI's implementation and recurring costs are vastly higher than those for XML-based information systems
Why You'd Encode A New Vocabulary in XML and Not EDI
(XML) Vocabularies
- The most accessible information pattern resources are XML vocabularies (or "schemas" or "tag sets")
- A vocabulary is a set of elements and attributes and the rules by which they combine
- Linguists shudder at this definition because it conflates the vocabulary and the grammar
- XML purists call these "applications" of XML but this conflicts with the more common usage of that word to mean "software application"
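As a rough illustration of "a set of elements and attributes and the rules by which they combine," the Python sketch below encodes a toy vocabulary as a lookup table and checks a document against it. The element names (Order, Buyer, Item), the attributes, and the violations helper are all hypothetical; a real vocabulary would normally be expressed in a schema language such as XSD rather than in application code.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy vocabulary:
# element name -> (allowed child elements, allowed attributes)
VOCABULARY = {
    "Order": ({"Buyer", "Item"}, {"id"}),
    "Buyer": (set(), set()),
    "Item": (set(), {"sku", "quantity"}),
}

def violations(element):
    """Return a list of rule violations for an element and its descendants."""
    if element.tag not in VOCABULARY:
        return [f"unknown element <{element.tag}>"]
    allowed_children, allowed_attrs = VOCABULARY[element.tag]
    problems = []
    for name in element.attrib:
        if name not in allowed_attrs:
            problems.append(f"<{element.tag}> may not carry attribute '{name}'")
    for child in element:
        # A known element in the wrong place breaks a combination rule.
        if child.tag in VOCABULARY and child.tag not in allowed_children:
            problems.append(f"<{child.tag}> may not appear inside <{element.tag}>")
        problems.extend(violations(child))
    return problems

good = ET.fromstring('<Order id="1"><Buyer>Acme</Buyer><Item sku="A1" quantity="2"/></Order>')
bad = ET.fromstring('<Order><Item sku="A1"><Buyer>Acme</Buyer></Item><Price/></Order>')
print(violations(good))
print(violations(bad))
```

The names alone (the "vocabulary" in the linguist's sense) are the dictionary keys; the combination rules (the "grammar") are the sets attached to each key, which is exactly the conflation the slide mentions.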
The Need for Controlled Vocabularies
- There is no one good access term or name for most objects or concepts. The idea of an "obvious," "self-evident," or "natural" term is a myth
- The words people use to describe things or concepts are "embodied" in their context and experiences... so they are often different or even "bad" with respect to the words used by others
- What would "good" words be like?
- Where would they come from?
- How do you get people to use them?
Why XML Vocabularies Exist
- Vocabularies must exist because:
- There is no predefined set of tags
- And since there is no predefined tag set, there can't be any predefined semantics
- Semantics are defined in the schema and its documentation about what the elements and attributes mean
- So using XML without a (documented) schema makes little sense
Why Vocabularies Are Desirable
- Vocabularies are desirable because people want to have a set of tags that is customized to their problem or industry so that they can take full advantage of XML
- A good vocabulary represents a significant investment in defining a domain, identifying its key semantic components, and specifying the constraints and rules governing the combination and reuse of those components
- A good vocabulary is a reference model for the domain that facilitates communication between enterprises operating within it
The Best Thing About XML
- Is the ease with which you can create a new vocabulary
- The key word here is "you"
The Worst Thing About XML
- Is the same as the best thing: the ease with which you can create a new vocabulary
- There are often multiple vocabularies for the same or related domains, and especially for the common information models that are used in more than one domain
- That two vocabularies use the same XML tag names doesn't prove anything; the same content will inevitably be described using different names, and different content will be given the same names
Vertical and Horizontal Vocabularies
- Vertical:
- Particular industry or vertical market
- Detailed product semantics
- Specialized process semantics
- Sometimes called "domain-specific" languages
- Horizontal:
- Concepts that are common to all (or a large number of) vocabularies
Creating Vertical XML Vocabularies – Worst Practices
- Turn proprietary APIs into XML vocabularies by wrapping "<" and ">" around the names of methods that set and return values
- Turn proprietary database schemas into XML vocabularies by wrapping "<" and ">" around the names that define the structure of a record or object
- Turn existing EDI messages into XML by automated conversion, using the delimiters for segments, composites, and elements as "handles" for content
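To make the last worst practice concrete, here is a minimal Python sketch of a delimiter-driven conversion. The sample message is only modeled loosely on EDIFACT syntax (apostrophe ends a segment, "+" separates data elements, ":" separates components); the function name, the E1/C1 tag scheme, and the sample values are assumptions for illustration, and the sketch ignores release characters and service segments.

```python
import xml.etree.ElementTree as ET

def naive_edi_to_xml(message):
    """Wrap tags around whatever the delimiters happen to isolate."""
    root = ET.Element("Message")
    for segment in message.split("'"):
        segment = segment.strip()
        if not segment:
            continue
        tag, *elements = segment.split("+")
        seg_el = ET.SubElement(root, tag)
        for i, data in enumerate(elements, start=1):
            el = ET.SubElement(seg_el, f"E{i}")  # positional name, no meaning
            components = data.split(":")
            if len(components) == 1:
                el.text = data
            else:
                for j, comp in enumerate(components, start=1):
                    ET.SubElement(el, f"C{j}").text = comp
    return root

# Illustrative input only, not taken from a real interchange.
sample = "NAD+BY+5412345000013::9'QTY+21:48'"
root = naive_edi_to_xml(sample)
print(ET.tostring(root, encoding="unicode"))
```

The output is well-formed XML, but a reader still has to know the EDI message design to learn that the second data element of the first segment identifies a party. The delimiters became tags; none of the semantics did.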
Creating Vertical XML Vocabularies – Better Practices
- Turn existing EDI messages into XML
- Analyze EDI messages to identify the "syntax-neutral" conceptual models they contain
- Encode these conceptual models in XML
- If no vocabulary exists:
- Identify current and potential uses of the vocabulary
- Analyze existing documents and information sources (with EDI, identify "syntax-neutral" models of message content)
- Design conceptual models that satisfy the requirements in a feasible way
- Encode the models in XML schemas
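One way to picture a "syntax-neutral" conceptual model is as a plain data structure that knows nothing about angle brackets or EDI delimiters, with each syntax handled by its own encoder. The sketch below assumes a hypothetical Party concept with a role and an identifier; the class, the function names, the XML element names, and the role codes are illustrative choices, not part of any standard discussed here.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

@dataclass
class Party:
    """Hypothetical conceptual component: who plays which role."""
    role: str        # e.g. "Buyer" or "Seller"
    identifier: str  # e.g. a company number

def party_to_xml(party):
    """One possible XML encoding of the concept."""
    el = ET.Element("Party", {"role": party.role})
    ET.SubElement(el, "Identifier").text = party.identifier
    return el

# Illustrative role codes for an EDI-like encoding (an assumption).
ROLE_CODES = {"Buyer": "BY", "Seller": "SU"}

def party_to_edi_like(party):
    """One possible delimiter-based encoding of the same concept."""
    return f"NAD+{ROLE_CODES[party.role]}+{party.identifier}'"

buyer = Party(role="Buyer", identifier="5412345000013")
print(ET.tostring(party_to_xml(buyer), encoding="unicode"))
print(party_to_edi_like(buyer))
```

Because both encoders start from the same Party object, the analysis effort goes into the model once, and the choice of syntax becomes a late, replaceable decision.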
XML and Metcalfe's Law
- The value of a language depends on how many people (or computers) understand it
- How do you encourage and enable others to understand your language?
- Standardization Approach 1: "Understand my language or I won't do business with you"
- Standardization Approach 2: "Excuse me, here's my language, would you like to do business with me?"
The Interoperability Problem
- The vocabulary problem implies an interoperability problem
- This means that two applications or services can't use each other's models or document instances "as is"
- Some interoperability problems can be detected and resolved by completely automated mechanisms
- Other problems can be detected and resolved with some human intervention
- Other problems can be detected but not resolved
- Some problems can go undetected
Why Interoperability Problems Are Inevitable with Vertical XML Vocabularies
- Each new XML vocabulary for a particular industry is a step forward for that community, but it proliferates definitions of the information models that are common to many industries
- Since the distinctive or specialized parts of each vocabulary are the industry-specific "vertical" parts, a lot of attention gets paid to them
- In contrast, relatively less effort is given to the "horizontal" parts that seem more familiar or understandable
- Nevertheless, any large company – even a highly verticalized one – engages in diverse business activities that require it to understand multiple vocabularies at different times
Vertical and Horizontal Vocabularies Must Work Together
When Models Don't Match
- Suppose you publish your web service interface description and tell the world "my ordering service requires a purchase order that conforms to this schema"
- This says "send me MY purchase order" not "send me YOUR purchase order"
- How likely is it that the purchase orders being used by other firms will be able to meet your interface requirement, either directly or after being transformed?
How Bad Can the Interoperability Problem Be?
The Target Model For The Interoperability Scenarios
The XSD Schema for the Expected Order [1]
The XSD Schema for the Expected Order [2]
Identical Model with Different Tag Names [1]
Identical Model with Different Tag Names [2]
Same Model, Attributes Instead of Elements
Assembly Mismatch - Separate Customer and Order Documents [1]
Assembly Mismatch - Separate Customer and Order Documents [2]
Conceptual Incompatibility
Lessons from the Interoperability Examples
- There are a large number of ways that two implementation models that are supposed to be equivalent can fail that test
- But no matter how different they look, with different syntaxes, tag names, or assembly models, if their conceptual model is the same, it is possible to transform one implementation model to another
- Validation is not sufficient to guarantee complete interoperability
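The second lesson can be sketched in a few lines of Python. The three order instances below are hypothetical stand-ins, not the schemas from the scenario slides: one uses the "expected" names, one uses different tag names, and one uses attributes instead of elements. Because they share a conceptual model, a small mapping (the RENAMES table and to_concept function, both invented for this sketch) reduces all three to the same result.

```python
import xml.etree.ElementTree as ET

# Hypothetical instances of one conceptual order model.
TARGET = "<Order><Customer>Acme</Customer><Item>Widget</Item><Quantity>3</Quantity></Order>"
RENAMED = "<PO><Buyer>Acme</Buyer><Product>Widget</Product><Qty>3</Qty></PO>"
ATTRIBUTES = '<Order Customer="Acme" Item="Widget" Quantity="3"/>'

# Assumed correspondence between the foreign names and the target names.
RENAMES = {"Buyer": "Customer", "Product": "Item", "Qty": "Quantity"}

def to_concept(xml_text):
    """Reduce an instance to name/value pairs in the target's terms."""
    root = ET.fromstring(xml_text)
    fields = dict(root.attrib)         # attribute-style content
    for child in root:
        fields[child.tag] = child.text  # element-style content
    return {RENAMES.get(name, name): value for name, value in fields.items()}

for instance in (TARGET, RENAMED, ATTRIBUTES):
    print(to_concept(instance))
```

The hard part is not this code but the RENAMES table: someone has to know that Buyer and Customer mean the same thing. If the two models disagreed conceptually, no table of this kind would exist.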
Attacking the Interoperability Problem
- Everyone has to learn to "speak" all the languages – clearly impractical
- Everyone has to learn just one language but it has to be the same one
- Multiple vocabularies exist, but there is at least one "interchange" or "hub" language designed to facilitate translations between "native" vocabularies
An Interchange or Hub Language
Hub Languages for e-Business
- Early 1990s - Ad hoc efforts in EDIFACT to "harmonize" core components across verticals
- 1997 - XML Common Business Library (xCBL) is the first horizontal XML vocabulary; it incorporated EDIFACT semantics and code lists
- 1999 - ebXML initiative of UN/CEFACT (the body behind EDIFACT) and OASIS to develop syntax-neutral "core components"
- 2001 - Universal Business Language effort begins, building on xCBL and ebXML Core Components
Universal Business Language
- DOCUMENT ARCHITECTURE: A generic XML interchange format for business documents that can be extended to meet the requirements of particular industries
- CORE COMPONENTS: A library of XML schemas for reusable data components such as "Address," "Item," and "Payment" -- the common data elements of everyday business documents
- STANDARD DOCUMENTS: A small set of XML schemas for common business documents such as "Order," "Despatch Advice," and "Invoice" that are constructed from the UBL library components and can be used in a generic order-to-invoice trading context
UBL 1.0 Document / Process Scope
How A Hub Language Increases the XML Advantage over EDI
How a Hub Language Shortens the Time to the XML Payoff
Mapping in and out of Hub Language
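- If all parties/applications/services rely on a hub language for their external interfaces, the number of mappings needed grows linearly with the number of vocabularies (one mapping in and one out per vocabulary) instead of quadratically (one per pair of vocabularies in each direction)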
- Mapping tools for transforming instances from an internal information model to another one are ubiquitous as standalone tools and as parts of application servers
- EXAMPLE: Altova MapForce
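A quick way to see the payoff is to count mappings, assuming every pair of vocabularies would otherwise need its own translator in each direction. The Python sketch below also shows a translation routed through a hub; the vocabulary names, field names, and TO_HUB table are hypothetical, and real mappings would operate on documents rather than flat dictionaries.

```python
def point_to_point_mappings(n):
    """Directed translators needed when every vocabulary maps to every other."""
    return n * (n - 1)

def hub_mappings(n):
    """Translators needed when each vocabulary maps only into and out of the hub."""
    return 2 * n

# Hypothetical native-to-hub field correspondences.
TO_HUB = {
    "vocabA": {"Buyer": "Customer", "Qty": "Quantity"},
    "vocabB": {"Purchaser": "Customer", "Amount": "Quantity"},
}

def translate(record, source, target):
    """Translate a flat record by going source -> hub -> target."""
    hub_record = {TO_HUB[source][name]: value for name, value in record.items()}
    from_hub = {hub_name: native for native, hub_name in TO_HUB[target].items()}
    return {from_hub[name]: value for name, value in hub_record.items()}

print(point_to_point_mappings(10), hub_mappings(10))
print(translate({"Buyer": "Acme", "Qty": "3"}, "vocabA", "vocabB"))
```

Note that vocabA and vocabB never reference each other: adding a third vocabulary means adding one entry to TO_HUB, not one translator for every vocabulary already in use.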
Microformats
- Microformats represent a very different approach to solving interoperability problems
Information Pattern Resources for Scavenger Hunt
For Monday March 5
- Chapter 5 of Document Engineering
- "E-Government Architecture in Ireland" Sean McGrath and Fergal Murray. XML 2004 Conference
- "The Digital Transformation: Technology and Beyond" Donald J. Bowersox, David J. Closs, and Ralph W. Drayer. Supply Chain Management Review (January/February 2005)
- Chapter 8 of Document Engineering