1. Course Overview
DE + IA (INFO 243) - 17 January 2007
Bob Glushko
Plan for Today's Lecture
- Who Are We, and Why Are We Here?
- DE and IA in the news
- Introduction to Key Concepts
- Syllabus Overview and Administrivia
Who are We and Why are We Here?
- Instructor: Bob Glushko
- Teaching Assistant: Anya Kartavenko
- The Rest of You?
The Course Description
- This course introduces the discipline of Document Engineering: specifying, designing, and deploying electronic documents and information repositories that enable document-centric applications. These applications include web services, virtual enterprises, information supply chains, single-source publishing, and syndication.
- Document Engineering has much in common with the field of Information Architecture, but extends its scope beyond web site and web application design.
Document Engineering in the News
- You don't yet know what Document Engineering is, but I do, so we'll start by seeing that it is in the news all the time
- What are the common issues and themes?
Intel, Wal-Mart and others Push Electronic Health Records
FDA Sets Document Standards for Drug-Label Submissions
- Wall Street Journal, 2 November 2005 (wsj.com)
- US Food and Drug Administration is requiring all new and changed labels and package inserts to be submitted in XML "Structured Product Labeling" format instead of in PDF
- All will be available at http://dailymed.nlm.nih.gov/dailymed/about.cfm
Salesforce.com Connects its Front End to Back Ends
- Salesforce has radically transformed the CRM business from being application-oriented to services-oriented
- Business process events in Salesforce.com will send document updates to ERP applications or integration hubs
SPEEDy Self-Service Comes to Airline Information Processes
Global Shippers Give Customers Real-Time Cargo Info
FedEx Kinko's Announces Web-based Printing, Tracking Services
Disrupting Business Models with Document Assembly
- August 2006 article by Darryl Mountain in the International Journal of Law and Information Technology argues that document assembly software has the potential to radically disrupt the business models of law firms
- The variable content is typically unstructured text that defines terms and conditions for document types like business contracts, wills, and divorces
- Document assembly software contains a rules editor, and the logic in the rules then drives an interactive or wizard-like questionnaire that collects the needed content. Companies like HotDocs and Rapidocs are typical examples.
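The rules-then-questionnaire idea can be sketched in a few lines. This is a minimal illustration only, not how HotDocs or Rapidocs work internally; the rule format, field names, and clause text below are all hypothetical.

```python
from string import Template

# Hypothetical rules: a question is asked only if its condition holds
# for the answers collected so far.
RULES = [
    {"field": "client_name", "prompt": "Client's full name?",
     "when": lambda answers: True},
    {"field": "has_children", "prompt": "Does the client have children? (yes/no)",
     "when": lambda answers: True},
    {"field": "guardian", "prompt": "Name of guardian for minor children?",
     "when": lambda answers: answers.get("has_children") == "yes"},
]

# Hypothetical boilerplate clauses with slots for the variable content.
CLAUSES = {
    "intro": Template("This is the last will of $client_name."),
    "guardian": Template("I appoint $guardian as guardian of my minor children."),
}

def run_questionnaire(rules, ask):
    """Walk the rules in order, asking only the questions whose condition is met."""
    answers = {}
    for rule in rules:
        if rule["when"](answers):
            answers[rule["field"]] = ask(rule["prompt"])
    return answers

def assemble(answers):
    """Select and fill the clauses that apply to the collected answers."""
    parts = [CLAUSES["intro"].substitute(answers)]
    if answers.get("has_children") == "yes":
        parts.append(CLAUSES["guardian"].substitute(answers))
    return "\n".join(parts)

# Canned answers stand in for an interactive user.
canned = {
    "Client's full name?": "Pat Example",
    "Does the client have children? (yes/no)": "yes",
    "Name of guardian for minor children?": "Sam Example",
}
document = assemble(run_questionnaire(RULES, lambda prompt: canned[prompt]))
```

Passing `input` as the `ask` function would turn the same rules into an interactive questionnaire.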
What Are the Common Themes in These News Items?
- Enormous amounts of existing (paper) documents and legacy processes would benefit from automation, process re-engineering, and transformation to SOA
- New business processes are created / coordinated / choreographed via the management and exchange of electronic documents
- Standards / patterns for documents and business processes are essential
- Information technology and business processes are co-evolving with many ways to create business value
- But projects can be challenging, and their success depends on many factors besides technology
So Document Engineering Isn't (Just) About XML
- XML is a useful technology for Document Engineering, but using XML doesn't make you a document engineer
- The best thing about XML is the ease with which you can create a new vocabulary for a particular type of document
- XML is just the syntax in which we encode document models... what really matters is how we modeled the documents
Creating Models is Easy, But Creating GOOD Models is Hard
- The worst thing about XML is the same as the best thing – the ease with which you can create a new vocabulary
- No way around the classical problems of classification and naming we know from philosophy, linguistics, cognitive psychology, and information science
- XML is NOT "self-describing"
- The same content will inevitably be described using different names, and different content will be given the same names
- There are often multiple vocabularies for the same or related domains and especially for the common information models that are used in more than one domain
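A small sketch of the naming problem, using two made-up vocabularies for the same purchase-order content. The element names and the mapping table are hypothetical. Nothing in the XML itself says that `Qty` and `Quantity` mean the same thing; that knowledge has to come from someone who understands both models.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies encoding the same content.
ORDER_A = "<Order><BuyerName>Acme</BuyerName><Qty>12</Qty></Order>"
ORDER_B = "<PO><Customer>Acme</Customer><Quantity>12</Quantity></PO>"

# Hand-written semantic mapping from each vocabulary to a common model.
# The tags alone cannot supply this; a person has to.
MAPPINGS = {
    "Order": {"BuyerName": "buyer", "Qty": "quantity"},
    "PO": {"Customer": "buyer", "Quantity": "quantity"},
}

def to_common_model(xml_text):
    """Translate a document into the common model using its vocabulary's mapping."""
    root = ET.fromstring(xml_text)
    mapping = MAPPINGS[root.tag]
    return {mapping[child.tag]: child.text for child in root if child.tag in mapping}

common_a = to_common_model(ORDER_A)
common_b = to_common_model(ORDER_B)
```

Both documents reduce to the same dictionary only because the mapping table was written by hand, which is the sense in which XML is not self-describing.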
The Document Exchange Pattern
- Businesses have long dealt with each other by exchanging documents
- Halfat's clay pot receipt for taxes is certainly one of the oldest documents that record a business transaction (355 BCE)
The Document Exchange Pattern (continued)
- Very natural thing to do
- The simplest case is "here's my catalog, do you want to buy anything?" with the exchanged document being "here's my order"
- We use concepts like "supply chains" and "distribution channels" as metaphors for the coordinated or choreographed flow of information and materials/products between businesses
- These are complex patterns composed from the document exchange pattern
Document Exchange is the Mother of All Patterns
- Document exchange is the "mother of all patterns" for business models, business processes, and business information
- Business model or organizational patterns: marketplace, auction, supply chain, build to order, drop shipment, vendor managed inventory, etc.
- Business process patterns: procurement, payment, shipment, reconciliation, etc.
- Business information patterns: catalog, purchase order, invoice, etc. and the components they contain for party, time, location, measurement, etc.
Web Services
- Web services is today's biggest buzzword
- The idea is simple – encapsulate or "wrap" some specific and discrete unit of functionality to hide its implementation and make it reusable by sending it an XML message, to which it replies with an XML message
- Many business patterns like supply chains or virtual enterprises are a natural fit for web services, and they make the idea of service composition easy to see
- But exchanging information does no good if the information can't be understood by the parties (or applications) doing the exchanging.
- The Web services "standards" not only don't solve this problem – they completely ignore it
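The "wrap a unit of functionality behind XML messages" idea can be sketched as follows. This is an illustration under stated assumptions, not any particular web services standard: the message names (`QuoteRequest`, `QuoteResponse`), the element names, and the price table are all hypothetical, and no transport (HTTP, SOAP) is shown.

```python
import xml.etree.ElementTree as ET

def quote_price(sku, quantity):
    """Private implementation hidden behind the service interface (hypothetical)."""
    unit_prices = {"WIDGET-1": 2.50}
    return unit_prices[sku] * quantity

def handle_request(request_xml):
    """Public interface: accept an XML request message, return an XML reply."""
    request = ET.fromstring(request_xml)
    sku = request.findtext("Sku")
    quantity = int(request.findtext("Quantity"))

    response = ET.Element("QuoteResponse")
    ET.SubElement(response, "Sku").text = sku
    ET.SubElement(response, "Total").text = f"{quote_price(sku, quantity):.2f}"
    return ET.tostring(response, encoding="unicode")

reply = handle_request(
    "<QuoteRequest><Sku>WIDGET-1</Sku><Quantity>4</Quantity></QuoteRequest>"
)
# reply == "<QuoteResponse><Sku>WIDGET-1</Sku><Total>10.00</Total></QuoteResponse>"
```

The caller never sees `quote_price`, but it does have to know in advance what `Sku`, `Quantity`, and `Total` mean (including units and currency). That shared understanding is exactly what the message exchange by itself does not provide.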
Modeling Documents {and,vs,or} Modeling Processes
- A document exchange -- or any web-based service -- consists of both the documents and the processes that produce and consume them
- By understanding the information in the documents, we learn what kinds of processes (or services) are possible
- By understanding the processes (or services), we learn what kinds of information are needed
A Process-Centric Depiction
A Document-Centric Depiction
Benefits of a Document-Centric Modeling Approach
- Documents are more tangible than processes, easier to analyze and communicate
- SOA emphasizes documents as the public interfaces to private processes
- David Cohn: 100,000 nouns enable us to understand the meanings of 10,000 verbs
Course Syllabus
- We've just touched on almost every topic in the course
- Models of Business Organization and Business Processes
- Models of Business Information; XML Vocabularies
- Models of Business Architecture; Web Services
- Analyzing and Modeling Business Processes, Documents and Information Components
- Model-Based Applications and User Interfaces; UI Design Patterns
- Management and Strategy Issues, Case Studies
Required Readings
- Glushko & McGrath, Document Engineering, MIT Press, 2005.
- All readings will be available online or as paper handouts; do we need a course reader? Some copyright clearance issues to work out
Course Deliverables
- 8 assignments throughout the semester, 6 or 7 of which will be graded. These assignments are designed to develop and reinforce practical skills in analysis, modeling, and implementation of document-centric and model-based applications
- Students taking the class for a letter grade will also be required to carry out a "mini-project" during the second half of the semester working in teams of 2 or 3
- Students taking the class S/U will not be required to do the mini-project but will instead serve as reviewers or consultants for other mini-projects
- There is no final exam or midterm
Course Grading
- Assignments are 60% of final grade
Team Project
- Individual assignments teach separate skills that you'll bring together in a 2-4 person team project
- Last half of the semester; pick project by early March, incremental reports up to final presentation and report at semester end
- 30% of final grade
- For SIMS 2008ers, could be incubator for MIMS project (or summer internship, or 2007-8 GSR appt.)
Past Document Engineering Projects [1]
- 2002-2003
- Course Approval System -- analysis and redesign of system by which new courses are born (primary clients: Academic Senate, IS&T)
- 2003-2004
- System Map -- interactive inventory and visualization of campus IT systems, precursor to campus-wide Data Dictionary (primary client: Central Computing Services [Shel Waggener])
- Digital Chemistry -- data modeling to enable content sharing across delivery platforms (primary client: Chemistry department [Mark Kubinec])
- Event Calendar Network -- replace hodge-podge of calendars that can't share events with repository and syndication/reuse network (primary client: public affairs [Jeff Kahn])
- Role-based Access Control -- single sign-on for campus systems, with field-level access control and dynamic assembly of form user interfaces (primary client: graduate division [Chris Hoffman])
- Center in a Box -- dynamic assembly of web site for Centers, all of which have (or ought to have) the same data model (mission, people, projects, publications, news, events, resources)
Past Document Engineering Projects [2]
- 2004-2005
- Syllabus project -- common data model for all syllabi to enable dynamic generation of custom and aggregated views (primary client: SIMS)
- 2005 Class Projects
- redesign of Center in a Box
- Genentech process control documents
- peer market (evolved into MyCroft for Ben Hill)
- generic model of content management
- personal health record
Past Document Engineering Projects [3]
- 2006
- Course Catalog "Round Trip"
- Bio/Bib
- Advancing to PhD Candidacy
- Construction Project Management and Collaboration
- Class Chat
Where's the Other 10%?
- Class Participation: 10% (in class, list serve, blog?)
Some Schedule Difficulty
- A week from today (24 January) I will be in Tampa attending the quarterly board of directors meeting for OASIS; that's the day we have XML foundations (review) scheduled. Is this necessary?
- 18 April will also be an OASIS BOD meeting, but I hope that's far enough away that we can reschedule that lecture
- On 30 April I also have a conflict and I'd like to reschedule that lecture
My Biases and Expectations
- Document Engineering is practical but also intellectually challenging
- I'm not a formal person and will be as accessible as I can to all of you – my official office hours are proposed as M 11-12 & Tu 4-5
- But my informality doesn't mean I'm casual about what goes on in my class.
Course Web Sites and List Serves
- Sylvia is familiar to SIMS students
- Sign up for "i243" list server
- e-mail to majordomo@sims.berkeley.edu
- Subject: Leave blank
- Body of message: subscribe is243
Readings for 22 January
- part of Chapter 16 of Document Engineering [Textbook, 554-571]
- "Accelerating RosettaNet" Burgert, E-Commerce World (November 2001) [Online]
- "HIT and MIS: Implications of Health Information Technology and Medical Information Systems" P. Goldschmidt, Communications of the ACM (October 2005) [Online, 69-74]