AC.productions - The Zen of Serving Web Pages - What Sound Does a Web Hit Make?

The Zen of Serving Web Pages

Web Server Know How © 2003 Christian Treber

What Sound Does a Web Hit Make?

Christian Treber, Senior IT Consultant and Internet Services Specialist

I can't explain what sound a web hit makes, but I have definite information on what is written into the log file of a web server. The web server keeps a log of the operations it has performed. This log contains valuable information about how the web site has been used. What exactly gets logged?

After the web server has received a request and sent the response, it writes an entry into the log file. A log entry is simply a line of text, which typically looks like this:

195.145.250.9 - - [01/Sep/2001:12:20:14 +0200] "GET / HTTP/1.0" 200 1827 "http://www.cnet.com/webLogAnalysers/" "Mozilla/4.0 (Windows 98)"

Let's take a look at the different parts of the log entry.

Host. Here: 195.145.250.9 (numeric form).

This is the IP address of the requester, or the IP address of a proxy that the request has been routed through. But what is a proxy? A proxy is a server that acts as an intermediary between a requester and a web server.

This makes particular sense when the proxy "caches" responses. If the requested URL is already in the cache, it gets served right away. If the URL is not in the cache (or the cached copy is outdated), the proxy fetches a copy, caches it, and forwards it to the requester. Proxies are used to save on web traffic and reduce the load on web servers. At the same time, they obscure the address of the original requester: the web server only sees the last address in the chain.
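The caching behaviour described above can be sketched in a few lines of Python. This is a toy model, not a real HTTP proxy: the class name CachingProxy, the fetch callback standing in for the origin web server, and the fixed max_age freshness limit are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Toy model of a caching proxy (hypothetical, for illustration only)."""

    def __init__(self, fetch: Callable[[str], str], max_age: float = 60.0,
                 clock: Callable[[], float] = time.time) -> None:
        self.fetch = fetch        # assumption: callback that asks the origin web server
        self.max_age = max_age    # assumption: fixed freshness lifetime in seconds
        self.clock = clock        # injectable clock, handy for experiments
        self.cache: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        now = self.clock()
        entry = self.cache.get(url)
        if entry is not None and now - entry[0] <= self.max_age:
            return entry[1]            # in the cache and fresh: serve right away
        body = self.fetch(url)         # missing or outdated: fetch a copy
        self.cache[url] = (now, body)  # cache it
        return body                    # and forward it to the requester
```

Note that the origin server only sees the calls made through fetch, which is why its log shows the proxy's address and only the requests that missed the cache.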

User. Here: - (not defined).

This is the identity of the requester according to the Identification Protocol. You won't ever see this "in the wild". I know of no web server which logs this information. Corrections and updates welcome!

Login. Here: - (not defined).

This field will be empty unless the requested URL has protected access. In this case, the field will contain the identity used for authentication (the "user name", so to speak).

Date and time. Here: [01/Sep/2001:12:20:14 +0200]

This is when the URL was requested. The field is subdivided into

  • Day, month, year
    Here: September 1st 2001
  • Hour, minute, second
    Here: 12:20:14 p.m.
  • Time difference to GMT
    Here: "+0200", meaning "Greenwich Mean Time plus two hours", i.e. Central European Summer Time (daylight saving time).

Be aware that the time stamp format can vary wildly between servers.
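For the time stamp format shown in this example, a minimal parsing sketch in Python might look as follows. The function name parse_log_time is made up for illustration, and the format string only fits this particular layout; other servers may need a different one.

```python
from datetime import datetime, timezone


def parse_log_time(field: str) -> datetime:
    """Parse a time stamp such as "[01/Sep/2001:12:20:14 +0200]"."""
    # Assumption: English month abbreviations; %b depends on the active locale.
    return datetime.strptime(field.strip("[]"), "%d/%b/%Y:%H:%M:%S %z")


stamp = parse_log_time("[01/Sep/2001:12:20:14 +0200]")
print(stamp.isoformat())                           # local time including the offset
print(stamp.astimezone(timezone.utc).isoformat())  # the same instant expressed in GMT/UTC
```

Keeping the offset attached to the value makes it possible to compare entries from servers in different time zones.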

Command. Here: "GET / HTTP/1.0"

The command sent to the web server. The field is subdivided into

  • Operation
    Here: "GET" (the file)
  • URL
    Here: "/"
  • Protocol
    Here: "HTTP/1.0"

What a browser sends to the web server is pretty much always a "GET" command. Form data may be sent with "POST". HTTP also allows uploading pages with "PUT", among other operations.
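Splitting the command field into its three parts is straightforward. The sketch below assumes the well-formed three-part shape of the example; the names Command and parse_command are hypothetical, and real logs can contain malformed requests, which this version simply rejects.

```python
from typing import NamedTuple


class Command(NamedTuple):
    operation: str   # e.g. "GET", "POST", "PUT"
    url: str         # e.g. "/"
    protocol: str    # e.g. "HTTP/1.0"


def parse_command(field: str) -> Command:
    """Split a command field such as '"GET / HTTP/1.0"' into its parts."""
    parts = field.strip('"').split(" ")
    if len(parts) != 3:
        # Assumption: anything that is not exactly three parts is treated as an error.
        raise ValueError("unexpected command field: %r" % field)
    return Command(*parts)


command = parse_command('"GET / HTTP/1.0"')
print(command.operation, command.url, command.protocol)
```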

Result code. Here: 200 (means "OK").

This is the code for the outcome of the operation. "404 Not Found" is a very popular result code you might know offhand.
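To turn a result code into its standard text, Python's built-in http.HTTPStatus table can be used. The helper name describe_result is an assumption for illustration; codes the table does not know are reported as unknown rather than guessed.

```python
from http import HTTPStatus


def describe_result(code: int) -> str:
    """Return the result code together with its standard reason phrase."""
    try:
        return "%d %s" % (code, HTTPStatus(code).phrase)
    except ValueError:
        # The code is not in Python's table of standard status codes.
        return "%d (unknown)" % code


print(describe_result(200))
print(describe_result(404))
```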

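Bytes transferred. Here: 1827

This is the number of bytes the web server sent to the requester in its response. In Apache's log format, for example, the count covers the response body without the HTTP headers, and a "-" is logged when no body was sent.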

Referrer. Here: "http://www.cnet.com/webLogAnalysers/"

This is the URL of the referring page, that is, the page containing the link the user clicked on. Very interesting information indeed!
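One way to make use of the referrer is to count which sites send visitors. The sketch below is only an illustration: referring_host is a hypothetical helper, the treatment of "-" as "no referrer" is an assumption about the log at hand, and the second cnet.com URL in the sample list is made up.

```python
from collections import Counter
from urllib.parse import urlsplit


def referring_host(referrer: str) -> str:
    """Reduce a referrer URL to the host name of the referring site."""
    if referrer in ("", "-"):
        return "(none)"   # assumption: "-" or empty means no referrer was sent
    return urlsplit(referrer).hostname or "(unknown)"


# Sample data; only the first URL comes from the log entry discussed here.
referrers = [
    "http://www.cnet.com/webLogAnalysers/",
    "-",
    "http://www.cnet.com/other/",   # made-up URL for illustration
]
counts = Counter(referring_host(r) for r in referrers)
print(counts.most_common())
```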

Agent. Here: "Mozilla/4.0 (Windows 98)" (refers to Netscape 4.0 and Windows 98 as operating system).

This is the name (and possibly version) of the agent, i.e. the program that made the request. It could be a browser, a crawler, or a download tool. This field often contains all kinds of other information, such as the operating system the agent runs on.
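With all fields covered, the whole entry can be taken apart in one go. The following regular expression is a sketch written for the example entry from this article; the group names follow the field names used here, parse_entry is a hypothetical helper, and the pattern assumes that quoted fields contain no embedded quotation marks.

```python
import re
from typing import Dict, Optional

# Assumption: fields are separated by single spaces, as in the example entry.
LOG_LINE = re.compile(
    r'(?P<host>\S+) (?P<user>\S+) (?P<login>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<command>[^"]*)" '
    r'(?P<result>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_entry(line: str) -> Optional[Dict[str, str]]:
    """Return the fields of one log entry, or None if the line does not fit."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


entry = parse_entry(
    '195.145.250.9 - - [01/Sep/2001:12:20:14 +0200] "GET / HTTP/1.0" '
    '200 1827 "http://www.cnet.com/webLogAnalysers/" "Mozilla/4.0 (Windows 98)"'
)
if entry is not None:
    for name, value in entry.items():
        print("%-9s %s" % (name, value))
```

Lines that do not match are returned as None instead of raising an error, so a log with the occasional odd entry can still be processed.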

© 2003 Christian Treber, www.ctreber.com
