Web Architecture and Organizing Information

R. Alexander Miłowski

milowski@ischool.berkeley.edu

School of Information, UC Berkeley

The Principles of the Web

In summary, the Architecture of the World Wide Web, Volume 1 states these principles:

  1. We name resources on the Web with URIs.
  2. We interact with resources via URIs and over protocols such as HTTP.
  3. We use common data representations such as HTML or XML.
  4. We use metadata, from both the protocol and data representation, to understand how to process representations.
  5. We use annotations, encoded in markup and links, to discover new information such as related resource locations on the Web.

Naming with URIs

URI assignment authorities and the Web servers deployed for them may benefit from an orderly mapping from resource metadata into URIs.

Decoding that:

Read: Cool URIs don't change
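
One way to picture an orderly mapping from resource metadata into URIs is the minimal sketch below. The function name, the example.org base, and the "station" field are hypothetical and chosen only for illustration; the slides do not prescribe a particular layout.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def metadata_to_uri(base, fields, when):
    """Map ordered (name, value) metadata pairs and a timestamp onto a URI path.

    `when` is expected to be a timezone-aware datetime; it is rendered in UTC.
    """
    parts = []
    for name, value in fields:
        # Percent-encode each segment so stray slashes or spaces cannot
        # change the shape of the path.
        parts.append(quote(str(name), safe=""))
        parts.append(quote(str(value), safe=""))
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return base.rstrip("/") + "/" + "/".join(parts + [stamp])


# Hypothetical resource: one station's observations at a given time.
uri = metadata_to_uri(
    "http://example.org/data",
    [("station", "KOAK")],
    datetime(2014, 9, 11, 5, 30, tzinfo=timezone.utc),
)
print(uri)
```

Because the same metadata always produces the same path, the URI stays stable for as long as the metadata does, which is the point of "Cool URIs don't change".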

Protocol Metadata

HTTP is your friend.

It wants to help you.

Don't abuse your friends.

HTTP metadata describes:

  • the content payload,
  • transfer and character encodings,
  • use and security information (e.g. CORS),
  • status and service health.

> GET /data/q/5/n/768/2014-09-11T05:30:00Z HTTP/1.1
> User-Agent: curl/7.37.0
> Host: www.mesonet.info
> Accept: */*

< HTTP/1.1 200 OK
< Content-Length: 129465
< Content-Type: application/xhtml+xml; charset=UTF-8
< Last-Modified: Thu, 11 Sep 2014 05:37:16 GMT
< Date: Mon, 15 Sep 2014 18:48:43 GMT
< Accept-Ranges: bytes
< Server: XProclet Server V1.0.m1
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Headers: accept-encoding,cache-control
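
The response headers above can be read programmatically before touching the body. The sketch below uses only the Python standard library; the function name describe_payload and the shape of the returned dictionary are assumptions made for illustration, and the header values are copied from the exchange above.

```python
from email.message import Message
from email.utils import parsedate_to_datetime


def describe_payload(headers):
    """Summarize what the HTTP headers say about the response body."""
    msg = Message()
    for name, value in headers.items():
        msg[name] = value
    return {
        # Media type and charset both come from Content-Type.
        "media_type": msg.get_content_type(),
        "charset": msg.get_content_charset(),
        "length": int(msg["Content-Length"]),
        "last_modified": parsedate_to_datetime(msg["Last-Modified"]),
        # CORS: which origins may read this response from a browser.
        "allowed_origin": msg["Access-Control-Allow-Origin"],
    }


info = describe_payload({
    "Content-Length": "129465",
    "Content-Type": "application/xhtml+xml; charset=UTF-8",
    "Last-Modified": "Thu, 11 Sep 2014 05:37:16 GMT",
    "Access-Control-Allow-Origin": "*",
})
print(info["media_type"], info["charset"], info["length"])
```

A client can use this summary to pick a parser and a character decoding without guessing from the payload itself.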

Representations

Use common formats:

Or do it this way: CWOP Weather for the Bay Area

But not this way (1GB binary data):
http://nasanex.s3.amazonaws.com/NEX-DCP30/NEX-quartile/rcp26/mon/atmos/tasmax/r1i1p1/v1.0/CONUS/tasmax_quartile75_amon_rcp26_CONUS_209601-209912.nc

Is JSON good or bad for the above data resource?
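
One way to start on that question is to compare how large the same numbers are in a packed binary encoding and in JSON text. This is a rough sketch only: the sample values are made up, and 32-bit floats are an assumption, not a statement about how the file above actually stores its data.

```python
import json
import struct


def encoded_sizes(values):
    """Return (binary_bytes, json_bytes) for the same list of numbers."""
    # Packed little-endian 32-bit floats: exactly 4 bytes per value,
    # at the cost of some precision.
    binary = struct.pack(f"<{len(values)}f", *values)
    # JSON spells each number out as decimal text plus separators.
    text = json.dumps(values).encode("utf-8")
    return len(binary), len(text)


# Made-up sample: 1000 temperature-like readings.
samples = [280.0 + 0.01 * i for i in range(1000)]
binary_size, json_size = encoded_sizes(samples)
print(binary_size, json_size)
```

Size is only one side of the trade-off: JSON is readable by nearly any Web client without a special library, while a packed binary format usually needs one.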

Putting it all together

  1. Develop good naming practices that encode useful metadata from your information.
  2. Use, but don't abuse, your protocol-level metadata.
  3. Try to use the right common format of the Web for the purpose you are serving.