Look at the various components of Web publishing, many of which are common to most Web applications.
"HTTP is a protocol with the lightness and speed necessary for a distributed collaborative hypermedia information system." Tim Berners-Lee, 1992, Basic HTTP
References: HTTP 1.1 Spec
User initiates transaction by typing URL in browser or clicking on link.
http://xyz.com/98/document.html
GET /98/document.html HTTP/1.0
HTTP/1.1 200 OK (or an error status such as 404 Not Found)
Content-type: text/html
Content-length: 3896
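The exchange above can be sketched in a few lines of Python. This is only an illustration, not a real client: the helper names build_request and parse_status_line are hypothetical, and only simple http URLs are handled.

```python
from urllib.parse import urlsplit


def build_request(url, version="HTTP/1.0"):
    """Return the request a browser might send for the given URL."""
    path = urlsplit(url).path or "/"
    # A blank line (the second CRLF) marks the end of the request headers.
    return f"GET {path} {version}\r\n\r\n"


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


request = build_request("http://xyz.com/98/document.html")
status = parse_status_line("HTTP/1.1 404 Not Found")
```

The host name from the URL is used to open the connection; only the path travels in the request line.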
HTTP is a stateless protocol, which means that the server does not maintain any information about the transaction that persists throughout a session. In other words, each transaction is independent of the others.
Session tracking requires a server-side application to maintain it. Session tracking is used in shopping cart applications, for instance.
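A minimal sketch of session tracking, assuming the server hands each browser a random session ID (for example in a cookie) and keeps the data in memory. The names SESSIONS and get_session are made up for illustration; a real application would also expire old sessions and store them somewhere more durable.

```python
import secrets

# In-memory store mapping session ID -> per-user data (here, a shopping cart).
SESSIONS = {}


def get_session(session_id=None):
    """Return (session_id, data), creating a new session when the ID is unknown."""
    if session_id not in SESSIONS:
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = {"cart": []}
    return session_id, SESSIONS[session_id]


# First request: the browser sends no ID, so the server issues one.
sid, session = get_session()
session["cart"].append("book")

# Later request: the browser sends the ID back and the server
# finds the same cart again, even though HTTP itself kept no state.
same_sid, same_session = get_session(sid)
```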
The Apache Group, an Open Source software project, has developed the leading Web server, which runs on over 50% of all servers. Microsoft's IIS and Netscape's server combined don't come close.
Web servers are fairly stable technology.
Reference: Apache.org, Netcraft survey
Site administrator usually takes care of the following server configuration issues:
For Apache, this is the /usr/local/apache/conf directory, and the main configuration file is httpd.conf.
/work/pub/wr/ maps to /wr/
audio/x-pn-realaudio ram
text/x-sgml sgml sgm
video/mpeg mpeg mpg mpe
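The MIME type entries above map file extensions to the Content-type the server sends. A rough sketch of that lookup, assuming the "type followed by extensions" layout shown; the function names and the application/octet-stream fallback are choices made for this example.

```python
MIME_TYPES_TEXT = """\
audio/x-pn-realaudio ram
text/x-sgml sgml sgm
video/mpeg mpeg mpg mpe
"""


def load_mime_types(text):
    """Build an extension -> MIME type table from mime.types-style lines."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        mime_type, extensions = fields[0], fields[1:]
        for ext in extensions:
            table[ext.lower()] = mime_type
    return table


def content_type_for(filename, table, default="application/octet-stream"):
    """Pick the Content-type value from the file's extension."""
    _, dot, ext = filename.rpartition(".")
    return table.get(ext.lower(), default) if dot else default


table = load_mime_types(MIME_TYPES_TEXT)
```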
Decision about URLs:
Authentication is asking a user to provide identification, usually a user name and password. Basic Authentication is typically set up with an .htaccess file and a password file; more sophisticated applications will manage this information in a user database.
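With Basic Authentication the browser sends the user name and password in an Authorization header, base64-encoded but not encrypted. A small sketch of both directions; the function names and the sample credentials are invented for illustration.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def parse_basic_auth(header_value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("dale", "secret")  # sample credentials only
```

Because the encoding is trivially reversible, the server (or anyone who can read the traffic) can recover the password with parse_basic_auth.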
Access control is determining which areas of the site can be accessed. You can configure the server to allow or deny access to different individuals or groups of users or IP addresses.
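Access control by IP address can be pictured as a list of allow and deny rules. The sketch below is a simplification with made-up rules, where the first matching rule wins; it does not reproduce how any particular server orders its allow and deny directives.

```python
import ipaddress

# Hypothetical rules, checked in order; the first matching network decides.
RULES = [
    ("deny", ipaddress.ip_network("152.163.201.137/32")),
    ("allow", ipaddress.ip_network("152.163.0.0/16")),
]


def is_allowed(client_ip, rules=RULES, default=False):
    """Return True if the client address may access the protected area."""
    address = ipaddress.ip_address(client_ip)
    for action, network in rules:
        if address in network:
            return action == "allow"
    return default  # no rule matched
```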
152.163.201.137 - - [20/Sep/1998:02:10:08 -0700] "GET / HTTP/1.0" 200 8087
152.163.201.137 - - [20/Sep/1998:02:10:13 -0700] "GET /SLlogo2.gif HTTP/1.0" 200 6848
152.163.201.135 - - [20/Sep/1998:02:10:13 -0700] "GET /perl_id_313c.gif HTTP/1.0" 200 1911
152.163.201.136 - - [20/Sep/1998:02:10:13 -0700] "GET /w3jicon.gif HTTP/1.0" 200 1970
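Each log entry has the same fields: host, two identity fields, timestamp, request, status and size. A sketch of pulling them apart with a regular expression; the pattern and helper name are this example's own and cover only the layout shown here.

```python
import re
from collections import Counter

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


LINES = [
    '152.163.201.137 - - [20/Sep/1998:02:10:08 -0700] "GET / HTTP/1.0" 200 8087',
    '152.163.201.137 - - [20/Sep/1998:02:10:13 -0700] "GET /SLlogo2.gif HTTP/1.0" 200 6848',
]

entries = [parse_log_line(line) for line in LINES]
hits_per_host = Counter(entry["host"] for entry in entries if entry)
bytes_sent = sum(entry["size"] for entry in entries if entry)
```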
Some of the tasks surrounding logs:
References: Lincoln Stein, Yahoo's list of tools, Marketwave's Hitlist Examples
Webmaster manages the hardware, the OS and the network.
Properly configured PCs can be powerful enough to handle a sizable load, obviating the need for more expensive servers from Sun.
- Yahoo runs on FreeBSD.
Small dedicated Web server devices such as the Cobalt server with embedded Linux and Web administration.
References: Server Watch, WebServer Compare
CGI BIN Applications
Application Servers (Cold Fusion, ASP)
Databases and SQL
CGI modules in Perl and Python provide a higher-level interface for the programmer and hide the low-level details.
Script installed in server's cgi-bin directory.
HTML document containing form references the CGI script.
<form action="http://dale.songline.com/cgi-py/formreply.py" method="Get">Name and Address Form
http://dale.songline.com/cgi-py/formreply.py?name="dale"
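With method="Get", the browser appends the form fields to the URL as a query string. Values are URL-encoded rather than quoted. A short sketch of how that string is built and read back; the addr value is a made-up sample.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://dale.songline.com/cgi-py/formreply.py"

# Hypothetical form input; spaces become "+" in the query string.
fields = {"name": "dale", "addr": "101 Main St."}
url = BASE_URL + "?" + urlencode(fields)

# On the server side, the query string is decoded back into fields.
query = parse_qs(urlsplit(url).query)
```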
Python script installed in cgi-bin directory
import cgi  # standard-library CGI support (removed in Python 3.13)

form = cgi.FieldStorage()

# The form is valid only if both fields are present and non-empty.
form_ok = 0
if "name" in form and "addr" in form:
    if form["name"].value != "" and form["addr"].value != "":
        form_ok = 1

print("Content-type: text/html")  # HTML is following
print()                           # blank line, end of headers

if not form_ok:
    print("<H1>Error</H1>")
    print("Please fill in the name and addr fields.")
else:
    print("<H1>Results</H1>")
    print(form["name"].value, form["addr"].value)
Programs like Cold Fusion and Microsoft's Active Server Pages (ASP) are application servers.
Application servers provide a framework for non-programmers to create dynamic Web sites. Building applications still requires technical knowledge, but the coding complexity is closer to that of HTML.
Increasingly, applications structure information in databases and then generate HTML dynamically.
A database can mean many different things: a flat-file database, dbm files, or a relational database such as Microsoft's Access and SQL Server or MySQL, which is free software. Oracle and Informix provide large-scale database servers.
Ideally, you design your application independent of a particular database, and then you can migrate your data to better performing database systems as the need arises.
The main application interfaces to the database are through SQL and/or ODBC. SQL can be used to create or modify data records in the database as well as to select sets of data from it.
Example:
SELECT NAME, ADDR FROM EMPLOYEES WHERE NAME = 'DALE DOUGHERTY'
Languages such as Perl, Python and Java all provide fairly standard interfaces for accessing databases.
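As a sketch of such an interface, the example below runs a query like the one above through Python's database API. SQLite (from the standard library) stands in for whatever database a site actually uses, and the table contents are made-up sample data.

```python
import sqlite3

# An in-memory SQLite database stands in for a real database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEES (NAME TEXT, ADDR TEXT)")
conn.execute(
    "INSERT INTO EMPLOYEES (NAME, ADDR) VALUES (?, ?)",
    ("DALE DOUGHERTY", "101 MAIN ST"),  # made-up sample row
)

# The ? placeholder lets the driver quote the value safely,
# instead of pasting user input into the SQL text.
rows = conn.execute(
    "SELECT NAME, ADDR FROM EMPLOYEES WHERE NAME = ?",
    ("DALE DOUGHERTY",),
).fetchall()
conn.close()
```

Because the calls follow a common pattern (connect, execute, fetch), moving to a different database mostly means changing the connect line.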
Cold Fusion from Allaire is a Windows/NT application.
Server is configured so that files ending in .cfm are passed to the Cold Fusion application server.
HTML file: (could be created as a .cfm file.)
<FORM ACTION="searchquery.cfm" METHOD="Post">
Last Name: <Input Type="text" Name="LastName">
<Input Type="Submit" Value="Search">
</FORM>
Application file (.cfm):
<CFQUERY Name="EmployeeList" Datasource="Examples">
Select * From Employees WHERE LastName = '#LastName#'
</CFQUERY>
<body>
<H2>Results</H2>
<CFOUTPUT>
<P>The search for #Form.LastName# returned the following:
</CFOUTPUT>
<CFOUTPUT QUERY="EmployeeList">
<HR>
#FirstName# #LastName# (Phone: #PhoneNumber#) <BR>
</CFOUTPUT>
</body>
ASP variables are referenced using % delimiters (e.g., <%= LastName %>).
Streaming Media
Advertising Server
Search Engine
Conferencing System
RealAudio and RealVideo require a separate server to stream multimedia content to users.
Content must first be prepared in the Real format using RealPublisher or another tool that supports this format.
Licensed based on the number of simultaneous connections to the server.
Just came out with new G2 system.
References: Real Media, Perl Interview
The ad server provides for the dynamic rotation of advertising banners on a site, and the collection of data to track impressions and click-throughs.
Ad sales rep uses the server as administrator to set up campaigns. Advertisers use the site to get real-time reporting on how their ads are doing.
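The core of an ad server, rotation plus counting, fits in a short sketch. The AdServer class and banner file names below are invented for illustration, and a simple round-robin stands in for whatever rotation policy a real product uses.

```python
from collections import Counter
from itertools import cycle


class AdServer:
    """Rotate banners in order and count impressions and click-throughs."""

    def __init__(self, banners):
        self._rotation = cycle(banners)
        self.impressions = Counter()
        self.clicks = Counter()

    def next_banner(self):
        """Pick the next banner to show and record one impression for it."""
        banner = next(self._rotation)
        self.impressions[banner] += 1
        return banner

    def record_click(self, banner):
        """Record that a user clicked through on the banner."""
        self.clicks[banner] += 1

    def click_through_rate(self, banner):
        """Clicks divided by impressions, or 0.0 if the banner was never shown."""
        shown = self.impressions[banner]
        return self.clicks[banner] / shown if shown else 0.0


server = AdServer(["a.gif", "b.gif"])  # hypothetical banners
shown = [server.next_banner() for _ in range(4)]
server.record_click("a.gif")
```

The two counters are the raw data behind the reports advertisers see.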
High-end ad servers allow more targeted delivery of ads based on:
References: 3 Ad Server Solutions
Search engine provides a full-text index of a site or a collection of sites.
Webmaster needs to configure indexer to run at certain intervals, either to regenerate complete index or simply to update it.
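A full-text index is essentially a table from each word to the pages that contain it. The sketch below regenerates a complete index from scratch; the page paths and text are hypothetical, and real indexers add ranking, stemming and incremental updates.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


PAGES = {  # hypothetical site content
    "/wr/perl.html": "An interview about Perl and CGI scripts",
    "/wr/python.html": "Writing CGI scripts in Python",
}
index = build_index(PAGES)
```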
We use a subject index as the primary interface for searching and then offer the full-text search.
References: Web Review Search
Sites will use conferencing and chat systems to create community and increase user involvement.
References: WebBoard
Email remains the dominant form of communication on the Web. The ability to send regular email to users is very valuable.
We use an "email" subscription box on our sites to encourage users to provide an email address to us. Then we send our table of contents to them weekly.
Mailing List Servers automate the process of maintaining a mailing list and sending out large numbers of messages:
Majordomo, LISTSERV, Lyris
Netscape and Microsoft share about 90% of the browser market. Netscape is still the leader with over 50%, but Microsoft continues to show steady growth. Opera, an interesting new entry, hasn't made much progress.
References: Mozilla.org, Opera
Site Navigation
Logo/Identity
Headers/Footers and page navigation
Layout templates
Manages "metadata" to build collections of documents and create different views.
Database driven
Provides for staging of content; replication.
Development and Production Servers
Workflow and approval
References: PACE