AC.productions - The Zen of Serving Web Pages - Nine Aspects of a Web Log Entry - Agent

The Zen of Serving Web Pages

Web Server Know How © 2003 Christian Treber

Agent

What it is.

A string identifying the agent (the program which sent the request to the server).
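To make this concrete, here is a minimal sketch of pulling the agent string out of a single log line. It assumes the log is written in the widely used "combined" format, where the agent is the last double-quoted field on the line; your server may be configured differently. The sample line and the name extract_agent are made up for illustration.

```python
import re
from typing import Optional

# Assumption: "combined" log format, where the agent is the last
# double-quoted field on the line.
_LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def extract_agent(log_line: str) -> Optional[str]:
    """Return the agent string of one log line, or None if there is none."""
    match = _LAST_QUOTED_FIELD.search(log_line)
    return match.group(1) if match else None


# Made-up sample entry: request, status, size, referer, agent.
sample = (
    '192.0.2.1 - - [10/Oct/2003:13:55:36 +0200] '
    '"GET /index.html HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"'
)

print(extract_agent(sample))
```

An agent string that itself contains a double quote would confuse this simple pattern, so treat it as a starting point only.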

What it says.

It tells you which program has been making the request. What's that good for?

It's interesting to see which browsers are being used. If you're using some tricky JavaScript in your pages, you might need to tailor the scripting to the most often used browsers. The glossary gives further information on other agents such as download tools, link checkers, offline readers, crawlers, and robots.

When you're examining traffic on your website, be aware that quite a bit of it might have been caused not by people browsing, but by the activity of robots.
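One rough way to separate robot traffic from human traffic is to look for telltale words in the agent string. The sketch below is a heuristic only: the hint words in ROBOT_HINTS, the function name looks_like_robot, and the sample agent strings are my own assumptions, and a robot that cloaks itself as a browser will slip through.

```python
# Assumption: these substrings are typical of robot agent strings.
# The list is illustrative and far from complete.
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")


def looks_like_robot(agent: str) -> bool:
    """Guess whether an agent string belongs to a robot."""
    lowered = agent.lower()
    return any(hint in lowered for hint in ROBOT_HINTS)


# Illustrative agent strings, as they might appear in a log.
agents = [
    "Googlebot/2.1 (+http://www.googlebot.com/bot.html)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "ExampleSpider/0.1 (webmaster@example.com)",
]

robot_hits = sum(1 for agent in agents if looks_like_robot(agent))
print(f"{robot_hits} of {len(agents)} requests look like robots")
```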

What may cause trouble.

As with the referer field, the content of the agent field depends entirely on the requesting program. I dare say this is the most messed-up field of all.

It all started with Netscape Navigator (aka Mozilla) being the dominant browser. When pages were created for this browser only, it was common to pop up a message saying "Please use Netscape" when the detected browser wasn't Netscape. In those long-gone days, Microsoft was the newcomer. To avoid being shut out of certain pages, their Internet Explorer (MSIE) browser had to mimic Netscape's signature.

To accomplish that, "cloaking" was used. MSIE identified itself as "Mozilla", but added "(compatible; MSIE x.y;...)" after the initial ID. Simple checking only detected "Mozilla", and everything was fine. Though Microsoft has effectively killed Netscape by abusing its monopoly (that's what a judge said), the practice of cloaking still continues.
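The "simple checking" that cloaking defeats can be sketched in a few lines. The function name and the three sample agent strings are illustrative assumptions, not taken from any real site's code; the point is only that a test for "Mozilla" cannot tell Netscape and a cloaked MSIE apart.

```python
def passes_simple_netscape_check(agent: str) -> bool:
    """The naive test: anything announcing itself as "Mozilla" is let in."""
    return agent.startswith("Mozilla")


# Illustrative agent strings (assumed, for demonstration only).
netscape = "Mozilla/4.7 [en] (Win98; I)"
cloaked_msie = "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"
other = "Lynx/2.8.4rel.1"

for agent in (netscape, cloaked_msie, other):
    verdict = "welcome" if passes_simple_netscape_check(agent) else "Please use Netscape"
    print(f"{agent!r}: {verdict}")
```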

Soon agent detection became better, and people started distinguishing between Netscape and MSIE. When Opera came out with their browser, they were faced with a problem similar to Microsoft's. They wanted to identify their browser, but not at the price of being turned away from certain web sites.

The result: "double cloaking". Opera's agent information reads "Mozilla (compatible; MSIE 6.0) Opera 6.0". Depending on the smartness of agent detection, either Netscape, MSIE, or Opera will be recognized.
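How the same string can be read three ways is easiest to see side by side. The three detector functions below are hypothetical stand-ins for increasingly careful detection code; each one checks for one more marker before falling back to the previous, less careful one.

```python
def detect_naive(agent: str) -> str:
    """Knows only about the "Mozilla" signature."""
    return "Netscape" if agent.startswith("Mozilla") else "unknown"


def detect_better(agent: str) -> str:
    """Also knows that "MSIE" marks a cloaked Internet Explorer."""
    if "MSIE" in agent:
        return "MSIE"
    return detect_naive(agent)


def detect_best(agent: str) -> str:
    """Also knows that "Opera" marks a doubly cloaked Opera."""
    if "Opera" in agent:
        return "Opera"
    return detect_better(agent)


# The Opera agent string quoted above.
opera_agent = "Mozilla (compatible; MSIE 6.0) Opera 6.0"

for detect in (detect_naive, detect_better, detect_best):
    print(f"{detect.__name__}: {detect(opera_agent)}")
# detect_naive: Netscape
# detect_better: MSIE
# detect_best: Opera
```

The order of the checks is what matters: the most specific marker has to be tested first, because every cloaked string also contains the markers of the browsers it imitates.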

None of this makes detecting the true agent easy. To make things more complicated, all kinds of other information are thrown in as well. That might be the operating system (aka platform), detailed version information, contact email or web addresses (e.g. for crawlers), and whatever the programmer felt like (how about "Mozilla/5.0 (X11; Linux i686; en-US; rv:1.0rc5; OBJR" or "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:0.9.4) Gecko/20011128 Netscape6/6.2.1").

Since it's impossible to decipher all the information, AC.log Pro only tries to identify agent and platform (the operating system) and ignores the rest.
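To give an idea of what "identify the platform and ignore the rest" can look like, here is a rough keyword-based sketch. It is not AC.log Pro's actual logic, which this page does not describe; the keyword table and the name identify_platform are assumptions chosen for illustration.

```python
# Assumption: a small, ordered keyword table. More specific keywords
# come first, e.g. "Linux" before the generic "X11".
PLATFORM_KEYWORDS = [
    ("Windows", "Windows"),
    ("Win", "Windows"),
    ("Mac", "Macintosh"),
    ("Linux", "Linux"),
    ("X11", "Unix"),
]


def identify_platform(agent: str) -> str:
    """Return the first platform whose keyword occurs in the agent string."""
    for keyword, platform in PLATFORM_KEYWORDS:
        if keyword in agent:
            return platform
    return "unknown"


# The two agent strings quoted above, copied as they appear there.
examples = [
    "Mozilla/5.0 (X11; Linux i686; en-US; rv:1.0rc5; OBJR",
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:0.9.4) "
    "Gecko/20011128 Netscape6/6.2.1",
]

for agent in examples:
    print(identify_platform(agent))
```

Everything the table does not know about simply comes back as "unknown", which matches the spirit of ignoring whatever cannot be deciphered.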

Platform

What it is.

The platform (operating system) the agent is running on.

What it says.

The platform information is usually not relevant for web content providers (that is, you), since technical details such as Java or JavaScript support usually depend more on the browser than on the underlying platform.