8. The Document Engineering Approach
DE + IA (INFO 243) - 12 February 2007
Bob Glushko
Plan for Today's Class
- Modeling Methodologies
- The Document Engineering Approach
Modeling Methodologies Can Contain or Define:
- Processes / Activities / Steps -- what to do and when to do it
- Artifacts -- the documents or other representations of the results of the processes / activities / steps; different parts or views of the overall model
- Meta-models -- models that specify the type of information to be recorded in the artifact
- Notations -- the presentational constructs used in the modeling artifacts
- Tools -- technology used to create the artifacts
Methodology and "Mythodology" [1]
- System development methodologies have been around for about 50 years
- The ad hoc ones come and go, but the more formal ones often achieve a high profile
- With almost religious overtones, some methodologies have their "prophets," "high priests," "disciples," "Bibles," and "heretics"
- Methodology wars are fought over process (quantity and format of documentation), notation (sizes and shapes for arrows and boxes), tools (merits of different software packages), and artifacts (what is an "object")
Methodology and "Mythodology" [2]
- Stay neutral in these wars by remembering:
- Modeling is a means NOT an end in itself
- Modeling can be guided by one or more methodologies
- A methodology doesn't have to be formal or prescriptive to be useful
- Many modeling techniques come with an explicit or implied methodology for their application, and that methodology may be limiting for some kinds of models, so make sure you don't restrict your designs to suit the modeling technique
The Document Engineering Approach
The D-O-C-U-M-E-N-T Checklist
- D -- data types and document types
- O -- organizational processes
- C -- context (types of products or services, industry, geography, regulatory considerations)
- U -- user types and special user requirements
- M -- models, patterns, or standards that apply
- E -- enterprises and ecosystems (e.g., trading communities, standards bodies)
- N -- the needs (business case) driving the enterprise(s)
- T -- technology constraints and opportunities
D-O-C-U-M-E-N-T in the Document Engineering Approach
The Domain Model
- The primary purpose of modeling is to better understand some existing system or environment and its entities ("the things that exist in it that contain or embody relevant information") and to describe this understanding so it can be communicated
- This collective or composite understanding is sometimes called the domain model
Modeling Perspectives on the Domain Model
- It would be ideal if the modeling methodology could directly support the goal of creating the domain model
- But most domains or systems to be analyzed/built are too big or complex to be understood "all at once" -- multiple modeling perspectives are implied by different parts of the model, and these might require more than one analyst with different skills
Converging Modeling Perspectives in Document Engineering
The Modeling Artifacts
- So the steps of the modeling methodology develop the domain model "in pieces" -- the modeling artifacts -- and the methodology is designed so that these pieces logically fit together to create the complete model
- Document Engineering defines a set of artifacts for separate but interconnected models of processes, documents, and information components
The Modeling Artifacts in Document Engineering [1]
The Modeling Artifacts in Document Engineering [2]
Meta-models
- A meta-model is an abstract model that specifies the type of information to be collected and recorded during the modeling activity
- This is another layer on top of the model itself -- think of it as a model for the model
- Rigorous or highly formal methodologies generally have a meta-model with prescriptive processes for "populating" it
- Meta-models enforce consistency in a set of models, enabling them to be compared or to interoperate
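As a minimal sketch of the "model for the model" idea, the Python below treats a meta-model as a declaration of the fields every process description must record, then checks two models against it. The names `PROCESS_METAMODEL` and `missing_fields`, the field names, and the sample processes are all hypothetical; none of them come from ebXML or from a Document Engineering artifact.

```python
# Hypothetical meta-model: the kinds of information each process
# description is required to record, with the expected Python type.
PROCESS_METAMODEL = {"name": str, "participants": list, "documents": list}


def missing_fields(model, metamodel):
    """Return sorted names of fields that are absent or of the wrong type."""
    return sorted(
        field
        for field, expected_type in metamodel.items()
        if not isinstance(model.get(field), expected_type)
    )


# Two models written by different analysts (illustrative data only).
order_process = {
    "name": "Order fulfillment",
    "participants": ["Buyer", "Seller"],
    "documents": ["Purchase Order", "Invoice"],
}
shipping_process = {"name": "Shipping", "participants": ["Seller", "Carrier"]}

for model in (order_process, shipping_process):
    problems = missing_fields(model, PROCESS_METAMODEL)
    print(model["name"], "->", problems or "conforms")
```

Because both models are checked against the same declaration, they can be compared field by field, which is the consistency benefit described above.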
The ebXML Metamodel for Business Processes
The ebXML Metamodel for Business Transactions
Notations
- A notation is needed to depict or represent the objects and processes in your model
- At its simplest, a notation is a set of graphical elements (usually boxes) and lines that connect two or more of them to indicate a relationship
- Notations can be as simple and informal as "box diagrams" on the back of a napkin
- Or as complex and formal as the "activity diagrams" defined in the Unified Modeling Language (UML)
- There is no single "lingua franca" model notation suitable for all users and purposes
UML as a Modeling Notation
- The Unified Modeling Language (UML) is a graphical language for visualizing, specifying, constructing and documenting the structure and behavior of (software) systems
- UML is the synthesis of a variety of object-oriented modeling concepts and notations and is now endorsed by the Object Management Group. It is a language in that it can be used to define different types of models
- UML has a number of standard notations or diagram types that all are based on the same underlying meta-model (and it can also be used to define additional notations)
- Put another way... one UML description of a model can be used to produce numerous artifacts that use different notations, each of which highlights a different aspect of the model
- We will use just enough UML in this course to help us analyze and design models for documents and business processes
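The point that one model description can yield several artifacts in different notations can be sketched without UML tooling. In the Python below, a plain dictionary stands in for the underlying model (it is not UML), and two hypothetical functions, `as_outline` and `as_dot`, render it as an indented outline and as Graphviz DOT text. The class and attribute names are made up for illustration.

```python
# Illustrative model: classes with attributes, plus named associations.
MODEL = {
    "classes": {"Order": ["number", "date"], "LineItem": ["quantity", "price"]},
    "associations": [("Order", "contains", "LineItem")],
}


def as_outline(model):
    """Render the model as an indented text outline."""
    lines = []
    for cls, attrs in model["classes"].items():
        lines.append(cls)
        lines.extend(f"  - {attr}" for attr in attrs)
    for source, label, target in model["associations"]:
        lines.append(f"{source} {label} {target}")
    return "\n".join(lines)


def as_dot(model):
    """Render the same model as Graphviz DOT text (boxes and labeled arrows)."""
    lines = ["digraph model {", "  node [shape=box];"]
    lines.extend(f'  "{cls}";' for cls in model["classes"])
    for source, label, target in model["associations"]:
        lines.append(f'  "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


print(as_outline(MODEL))
print(as_dot(MODEL))
```

The outline highlights attributes while the DOT rendering highlights relationships; both are derived from the same description, so they cannot drift apart.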
UML Class Diagram for Business Process Metamodel
XML as a Modeling Notation
- XML has rapidly emerged as a preferred format for creating models that are vendor- and notation-neutral
- XML's key benefits from this perspective are the ease with which XML instances can be transformed and the use of models expressed as XML schemas to guide code generation by serving as templates for program objects
- Using the concepts introduced in this lecture, XML is a meta-modeling language or maybe even a meta-meta-modeling language; each XML schema language is a meta-language for creating models that are expressed as DTDs or as instances of other schema language types
But Using XML != "Modeling with XML"
- Some people think that "modeling" with XML means "writing a schema given a set of instances" or "inferring a schema from a single instance" -- many software tools support this sort of thing
- But schemas developed without a stage of conceptual design are rarely very useful because they are too closely tied to the particular instances used, which may not be representative
- Sometimes schemas do go through a stage of conceptual design, but once the schemas are implemented the conceptual information isn't available to schema users
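A small sketch can show why a schema inferred from a single instance tends to be too closely tied to that instance. The Python below uses the standard library's `xml.etree.ElementTree`; the "schema" is deliberately naive (just the set of child element names seen), and the helper names and the sample `<event>` instances are assumptions made for illustration, not the behavior of any particular schema-inference tool.

```python
import xml.etree.ElementTree as ET


def infer_children(instance_xml):
    """Naive 'schema': the set of child element names seen in one instance."""
    return {child.tag for child in ET.fromstring(instance_xml)}


def unexpected_children(instance_xml, allowed):
    """Return child element names that the inferred 'schema' does not allow."""
    found = {child.tag for child in ET.fromstring(instance_xml)}
    return sorted(found - allowed)


# Illustrative instances; the element names are made up for this sketch.
sample = "<event><title>Lecture 8</title><date>2007-02-12</date></event>"
other = (
    "<event><title>Guest talk</title><date>2007-02-14</date>"
    "<location>Room 202</location></event>"
)

inferred = infer_children(sample)
print(unexpected_children(sample, inferred))  # the sample always fits itself
print(unexpected_children(other, inferred))   # a plausible instance is rejected
```

The second instance is perfectly reasonable, but the inferred rules reject its `location` element because the one sample happened not to contain it. A stage of conceptual design would have asked whether events can have locations before fixing the schema.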
Tools
- Tools can be as simple and informal as pencil and paper or as complex and formal as CAD and CASE software
- There is no single tool suitable for all users and purposes
- Tools can be general purpose, not tied to specific methodologies (word processors, spreadsheets, presentation packages)
- General purpose tools can use templates to support specific notations or processes
- Tools can be dedicated to particular notations or processes
Starting Through the "Snake"
Setting the Context
- Any Document Engineering project worth doing will involve some set of document types and information components that take part in some set of business processes
- Because "no document (or process) is an island" there will always be some point at which the documents and processes you care about will intersect or overlap with some that you don't care about
- We'll call the Context whatever characteristics of the situation define what is in or out of scope, that is, inside or outside the boundary in which our solution has to work
Context is a Point of View
Context and Selecting Patterns
- A business process pattern implies a set of documents and some regular choreographies of document exchanges
- A pattern can be thought of as a typical cluster or configuration of requirements
- Selecting an appropriate pattern will help expose the information requirements, rules and constraints for our subsequent document analysis and design
- Choosing a pattern suggests which document payloads we'll need to find or design and in which business processes we are likely to deploy them
- How we describe context influences what patterns we identify and how we apply them
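To make "a pattern implies a set of documents and some regular choreographies of document exchanges" concrete, the Python sketch below records a pattern as an ordered list of exchanges and derives the document payloads and roles from it. The roles, document names, and ordering shown for a drop-shipment-style pattern are assumptions for illustration only; they are not transcribed from the lecture's sequence diagram.

```python
from collections import namedtuple

# One step in a choreography: who sends which document type to whom.
Exchange = namedtuple("Exchange", "sender receiver document")

# Hypothetical drop-shipment-style pattern (roles and documents are assumed).
DROP_SHIPMENT = [
    Exchange("Customer", "Retailer", "Order"),
    Exchange("Retailer", "Supplier", "Purchase Order"),
    Exchange("Supplier", "Customer", "Shipping Notice"),
    Exchange("Supplier", "Retailer", "Invoice"),
]


def document_types(pattern):
    """Document payloads the pattern implies, in order of first use."""
    seen = []
    for step in pattern:
        if step.document not in seen:
            seen.append(step.document)
    return seen


def roles(pattern):
    """Participants the pattern implies, sorted by name."""
    return sorted({p for step in pattern for p in (step.sender, step.receiver)})


print(document_types(DROP_SHIPMENT))
print(roles(DROP_SHIPMENT))
```

Once a pattern is chosen, the derived list of document types is a starting inventory of payloads to find or design, and the role list marks where the scope boundary has to be drawn.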
The Drop Shipment Pattern (UML Sequence Diagram)
Document Analysis
- The first phase in developing conceptual models of documents is Document Analysis
- "Document" analysis is the standard term but is a bit of a misnomer -- much of what we will analyze isn't packaged as "documents" and looks more like "data" -- we are going to stretch the meaning of "document" to fit everything
- Document-level activities of Inventory and Sampling determine what documents and information sources we'll analyze in more detail
- The Harvesting and Consolidation activities work on the smaller content components identified or extracted in the preceding activities
- After consolidation to a set of semantically distinct candidate components, we begin the design phase
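The Harvesting and Consolidation activities can be sketched as a simple grouping step. In the Python below, harvested labels from two sources are merged into candidate components using a table of analyst decisions about which labels mean the same thing. The labels, the `SYNONYMS` table, and the `consolidate` function are hypothetical; in practice deciding that two labels are semantically equivalent is a judgment call, not a lookup.

```python
# Harvested (source, label) pairs; the sources and labels are illustrative.
harvested = [
    ("syllabus", "Lecture Date"),
    ("calendar", "date"),
    ("syllabus", "Topic"),
    ("calendar", "Event Title"),
    ("syllabus", "Instructor"),
]

# Hypothetical analyst decisions: which harvested labels mean the same thing.
SYNONYMS = {
    "lecture date": "Date",
    "date": "Date",
    "topic": "Title",
    "event title": "Title",
}


def consolidate(pairs, synonyms):
    """Group harvested labels under one candidate component name each."""
    components = {}
    for source, label in pairs:
        name = synonyms.get(label.lower(), label)
        components.setdefault(name, []).append((source, label))
    return components


candidates = consolidate(harvested, SYNONYMS)
for name, origins in candidates.items():
    print(name, "<-", origins)
```

Five harvested labels reduce to three candidate components, and each candidate keeps a record of where it was harvested from so the decision can be revisited.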
Example Harvest: Syllabus
Example Consolidation: Event Calendar
Document Component Models
- Document Component Models describe the complete set of semantic components in a domain
Document Assembly Models
- Document Assembly Models describe some selection or arrangement of components in a hierarchical model of a document type
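The distinction between a component model and an assembly model can be sketched in a few lines of Python: the component model is a flat set of semantic components, and each assembly model is a hierarchy whose leaves are selected from that set. The component names, the two assemblies, and the helper functions are assumptions chosen for illustration.

```python
# Hypothetical component model: the flat set of semantic components in a domain.
COMPONENTS = {"Title", "Date", "Instructor", "Location", "Reading"}

# Two assembly models: nested (name, children) tuples whose leaves
# are selected from COMPONENTS. Container names are not components.
SYLLABUS = (
    "Syllabus",
    [
        ("Instructor", []),
        ("Lecture", [("Date", []), ("Title", []), ("Reading", [])]),
    ],
)
CALENDAR = (
    "Calendar",
    [("Event", [("Date", []), ("Title", []), ("Location", [])])],
)


def leaves(node):
    """Collect the leaf names of an assembly tree, left to right."""
    name, children = node
    if not children:
        return [name]
    return [leaf for child in children for leaf in leaves(child)]


def uses_only(assembly, components):
    """True if every leaf of the assembly is drawn from the component model."""
    return set(leaves(assembly)) <= components


for assembly in (SYLLABUS, CALENDAR):
    print(assembly[0], leaves(assembly), uses_only(assembly, COMPONENTS))
```

Both document types reuse `Date` and `Title` from the same component model while arranging them differently, which is what makes the assemblies comparable.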
Another Assembly of Same Component Model
Readings for 14 February
- Chapter 3 of Document Engineering [Textbook, 86-100]
- Do Some Business Models Perform Better than Others?, Peter Weill, Thomas W. Malone, Victoria T. D’Urso, George Herman and Stephanie Woerner
- E-Gov: Federal Enterprise Architecture
- "FEA Consolidated Reference Model Document (v. 2.1)" Office of Management and Budget [pages 1-9, 25-26]