Admin

Information-based Computing

Schulman: The Web is the API

URLs are command-line interfaces into computing power available on distributed computers.

Example of quote.yahoo.com
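
A minimal Python sketch of the "URL as command line" idea: the host plays the role of the program and the query string carries its arguments. Only the host name comes from these notes. The path `/q` and the parameter names `s` and `f` are assumptions for illustration, not a documented interface of quote.yahoo.com.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_command_url(host, path, **args):
    """Assemble a URL the way one would assemble a command line."""
    return "http://" + host + path + "?" + urlencode(args)

# Hypothetical path and parameter names; only the host appears in the notes.
url = build_command_url("quote.yahoo.com", "/q", s="IBM", f="price")

# The receiving server can recover the "arguments" from the query string.
parsed_args = parse_qs(urlsplit(url).query)
```

Nothing is sent over the network here; the sketch only shows that a request is a structured string a program can build and take apart.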

XML and Information-centric Applications

Highlights

XML promotes an information-centric, as opposed to browser-centric, view of the Web.

XML is part of a next generation Web that is more functional, and not driven exclusively by the browser.

XML will be implemented first on the backend where information is managed and can be accessed by information-smart programs.
 

The Big Tease

The Browser-Centric Web is Breaking Down

Capabilities such as Dynamic HTML are good, but it is difficult to write Dynamic HTML applications that work regardless of which browser the user chooses.

Developers are interested in the new capabilities but then back away from using them because of the complexity of supporting these new capabilities across different browsers.

Documents and Databases

How do we develop and manage information on the Web?

 

              Documents       Databases
Information   unstructured    structured
Tools         authoring       applications development
Access        browsing        retrieval
Generation    static          dynamic
Portability   Yes             No

Structured Documents

XML is based on the idea that documents can be represented as structured information, gaining many of the benefits of databases.

An interesting corollary idea is that databases can be represented as documents to provide interchange and portability.

Similar to object-oriented or component-based models.

XML is intended for interchange between systems.

If you automate a site so that it is generated from a database, then you might lose the benefit of having a search engine index the pages on your site. This is an interchange problem.

An XML-based metadata standard would allow you to interchange information with search engines.
 

XML and Distributed Computing 

Business relationships are largely based on information interchange. 

Imagine you had to send your product database to another company. How would you tell them how the database is structured?

Instead you might express this information in XML and make it available on your site for real-time access.
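
As a rough sketch of what "express this information in XML" could look like, the Python below wraps database-style rows in elements named after their columns. The rows, the column names, and the CATALOG and PRODUCT tags are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical product rows, standing in for the result of a database query.
rows = [
    {"sku": "A100", "name": "Widget", "price": "9.95"},
    {"sku": "B200", "name": "Gadget", "price": "24.50"},
]

def rows_to_xml(rows):
    """Wrap each row in a PRODUCT element, one child element per column."""
    catalog = ET.Element("CATALOG")
    for row in rows:
        product = ET.SubElement(catalog, "PRODUCT")
        for column, value in row.items():
            ET.SubElement(product, column.upper()).text = value
    return ET.tostring(catalog, encoding="unicode")

catalog_xml = rows_to_xml(rows)
```

The receiving company does not need a separate description of the column order: each value arrives labeled by its tag.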

XML is standardizing syntax.

XML provides a way to express structured information so that it is both human readable and easily processed by programs.

XML is just tags and attributes, but it enables so-called self-describing information.


Example 1

 
Nancy::555-1234::555-4321::Vice President::nancy@webreview.com
 

Example 2

<ENTRY>
   <NAME><FIRST>Nancy</FIRST></NAME>
   <PHONE><HOME>555-1234</HOME><WORK>555-4321</WORK></PHONE>
   <EMAIL>nancy@webreview.com</EMAIL> 
   <JOB><TITLE>Vice President</TITLE></JOB> 
</ENTRY> 

This benefits programmers, who can use a general-purpose XML tool for parsing an XML file.
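
The contrast between the two examples can be sketched in Python. The first record has to be split by hand, and the program must already know which position holds which field. The second can be read with a general-purpose parser, here the standard library's `xml.etree.ElementTree`, and queried by tag name. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Example 1: the meaning of each field depends on its position.
flat = "Nancy::555-1234::555-4321::Vice President::nancy@webreview.com"
fields = flat.split("::")
work_phone_flat = fields[2]  # the reader must already know slot 2 is the work phone

# Example 2: each value is labeled by its tag.
entry_xml = """
<ENTRY>
   <NAME><FIRST>Nancy</FIRST></NAME>
   <PHONE><HOME>555-1234</HOME><WORK>555-4321</WORK></PHONE>
   <EMAIL>nancy@webreview.com</EMAIL>
   <JOB><TITLE>Vice President</TITLE></JOB>
</ENTRY>
"""
entry = ET.fromstring(entry_xml)
work_phone_xml = entry.findtext("PHONE/WORK")
title = entry.findtext("JOB/TITLE")
```

Both approaches yield the same work phone, but only the XML version survives a change in field order or the addition of a new field.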

Syntax, Not Semantics

Semantics is a potential pitfall.

XML does not tell an application what the tags and the enclosed content mean or represent. 

DTDs and Schemas are one approach to organizing semantics but for the most part this falls to the application.
 

XML Provides Ways to Validate Information

The above ENTRY is well-formed, in that it respects the syntax of XML.

There is also the notion of validating this XML against a DTD, which is a formal definition of the structure of this tagset.  

A validation process could determine that the entry is missing a LAST name, which might be a required element.
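
The difference between well-formed and valid can be sketched in Python. The standard library parser checks well-formedness only and does not validate against a DTD, so the rule table below is a hand-written stand-in for a DTD, not real DTD validation. The required-element rules are assumptions chosen to match the LAST name example.

```python
import xml.etree.ElementTree as ET

# Assumed rule set standing in for a DTD: each parent lists its required children.
REQUIRED_CHILDREN = {
    "ENTRY": ["NAME", "PHONE", "EMAIL", "JOB"],
    "NAME": ["FIRST", "LAST"],
}

def is_well_formed(text):
    """True if the text parses as XML at all."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def missing_elements(text):
    """List 'PARENT/CHILD' pairs that the rule set requires but the document lacks."""
    root = ET.fromstring(text)
    problems = []
    for element in root.iter():
        for child in REQUIRED_CHILDREN.get(element.tag, []):
            if element.find(child) is None:
                problems.append(element.tag + "/" + child)
    return problems

entry_xml = (
    "<ENTRY>"
    "<NAME><FIRST>Nancy</FIRST></NAME>"
    "<PHONE><HOME>555-1234</HOME><WORK>555-4321</WORK></PHONE>"
    "<EMAIL>nancy@webreview.com</EMAIL>"
    "<JOB><TITLE>Vice President</TITLE></JOB>"
    "</ENTRY>"
)
```

Here `is_well_formed(entry_xml)` is true, while `missing_elements(entry_xml)` reports `NAME/LAST`: the entry respects XML syntax but fails the structural rule.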
 

XML is SGML-Lite

SGML is an ISO standard designed for structured information markup.

Problems with SGML

Developed without a good understanding of how to build tools to process it.

Perceived to be too complex with too much overhead.

Opportunity for XML

"Revise" SGML, addition by subtraction.

Create a lightweight standard with working implementations.

XML tools written in C, Java, Perl and Python are already available.
 

XML is a separate track from HTML

"When will XML be implemented in the browser?"

In the short term, XML is not dependent on the browser for acceptance.

Standards are beginning to drive the Web.

Widespread support for XML is a sign that the World Wide Web Consortium (W3C) is getting traction in laying out an open, standards-based path for the Web.

XML serves as the foundation for other standards

XML extensions

XSL

Extensible Style Language

XLL

Extensible Linking Language

Namespaces

Necessary to establish context of tagset, especially when exchanging XML fragments

Schema

A DTD replacement that supports data types.
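
To illustrate why namespaces matter when fragments are exchanged, the Python sketch below mixes two tagsets that both define TITLE. The namespace URIs and the `hr` and `book` prefixes are hypothetical; a namespace URI only needs to be a unique name.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs and prefixes, chosen for illustration.
fragment = """
<ENTRY xmlns:hr="http://example.com/ns/hr" xmlns:book="http://example.com/ns/book">
   <hr:TITLE>Vice President</hr:TITLE>
   <book:TITLE>XML and Information-centric Applications</book:TITLE>
</ENTRY>
"""
root = ET.fromstring(fragment)

# Map prefixes to URIs so each TITLE can be requested in its own context.
prefixes = {"hr": "http://example.com/ns/hr", "book": "http://example.com/ns/book"}
job_title = root.findtext("hr:TITLE", namespaces=prefixes)
book_title = root.findtext("book:TITLE", namespaces=prefixes)
```

Without the namespace qualifier, a program could not tell a job title from a book title.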

 

XML-based standards

SMIL

Synchronized Multimedia Integration Language

RDF

Resource Description Framework (metadata)


 

Automating Access to Information

Think "Beyond the Browser"

Think of programs as consumers of HTML today. Programs talk to programs on other machines. Servers talk to servers. 

Soon, the conversation will be encoded in XML and these programs will be smarter about the information they retrieve and process.

User Applications

Write a program to hit three different sites, access their product databases via an HTML interface, passing parameters through a URL.

Integrate the results into a single report available to others in your company.
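
A minimal sketch of such a user application in Python, assuming the sites already answer in XML rather than HTML. The three hosts, the `part` parameter, and the PRODUCT, NAME and PRICE tags are all hypothetical, and canned responses replace the network so the sketch runs offline.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

def build_report(sites, part, fetch):
    """Query each site for one part and merge the answers into report rows."""
    rows = []
    for site in sites:
        url = site + "?" + urlencode({"part": part})
        product = ET.fromstring(fetch(url))
        rows.append((site, product.findtext("NAME"), float(product.findtext("PRICE"))))
    # Cheapest offer first.
    return sorted(rows, key=lambda row: row[2])

# Hypothetical supplier sites and canned XML responses standing in for the network.
SITES = [
    "http://a.example/products",
    "http://b.example/products",
    "http://c.example/products",
]
CANNED = {
    "http://a.example/products?part=A100": "<PRODUCT><NAME>Widget</NAME><PRICE>9.95</PRICE></PRODUCT>",
    "http://b.example/products?part=A100": "<PRODUCT><NAME>Widget</NAME><PRICE>8.50</PRICE></PRODUCT>",
    "http://c.example/products?part=A100": "<PRODUCT><NAME>Widget</NAME><PRICE>11.00</PRICE></PRODUCT>",
}

def fetch_canned(url):
    """Return the canned response for a URL; a real program would issue an HTTP request."""
    return CANNED[url]

report = build_report(SITES, "A100", fetch_canned)
```

Passing the fetch function as a parameter keeps the parsing and merging logic independent of how the pages are retrieved.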

Building Information Interfaces

Web application layers

User interface
Presentation interface
Information interface

Two new roles for developers and designers:

Information Content and Exchange (ICE)

ICE Spec submitted to W3C:

Internet Value Networks -- data exchange as the basis of a business relationship.