- An agent is any program that is used to access a web server.
This includes browsers, crawlers, and link checkers.
- A browser is a program that enables you to surf the web. On your
behalf it requests URLs from web servers and displays the transmitted
document in a window. Links to other web pages are usually displayed as
underlined text and can be clicked.
- Search engine crawlers, or crawlers for short, mechanically
try to read each page of a web site in order to add it to the database
of a search engine such as Google or AltaVista. Other varieties exist:
some crawlers try to extract email addresses for spamming.
- Deep linking
- Deep links are direct links to content deep in a web site
(pretty much everything but the start page) that bypass the usual
"access route". Some site operators don't want that since, for example,
advertisement pages are bypassed.
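As a rough illustration of "everything but the start page", the sketch below treats any URL whose path is more than the bare root as a deep link. This is an assumption made for the example: a real site may serve its start page under another path (such as an index file), and the function name and example.com URLs are hypothetical.

```python
from urllib.parse import urlparse


def is_deep_link(url: str) -> bool:
    """Return True if the URL points below the site's root path.

    Assumption: the start page lives at "/" (or an empty path).
    """
    path = urlparse(url).path
    return path not in ("", "/")


print(is_deep_link("http://www.example.com/"))
print(is_deep_link("http://www.example.com/articles/story.html"))
```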
- IP addresses can be broken down into host and domain. The domain
identifies a network, while the host is an address on that network. While
hosts with the same name might exist on different networks, the "fully
qualified host name", consisting of host and domain, is always unique.
IP addresses in name form such as "www.acproductions.de" can be broken down
into host "www" (this is the first part) and domain "acproductions.de"
(all the rest). The domain itself is a hierarchy separated by dots.
From the last to the first part of the domain, the specification gets finer
and finer. For example, the IP address "archimedes.math.manoa.uofhawaii.edu"
can be read as "the host archimedes at the educational institution of the
University of Hawaii at the campus of Manoa, Maths department".
IP addresses in number form such as "192.168.0.1" come in three
different flavors, class A, B and C. Addresses with the first
number in the 0 to 127 range are Class A networks. The first number is the
domain, the rest is the host. The 128 to 191 range are Class B networks
with the first two numbers indicating the domain and the last two the host.
Class C networks range from 192 to 223. The first three numbers make up
the network, while the last number is the host.
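The two breakdowns described here can be sketched in a few lines. The function names are made up for this example. The class boundaries assume the conventional classful scheme: a first number of 0 to 127 for class A, 128 to 191 for class B and 192 to 223 for class C.

```python
import ipaddress


def split_name_form(fqdn: str) -> tuple[str, str]:
    """Split a fully qualified host name into (host, domain).

    The host is the first dot-separated part, the domain is all the rest.
    """
    host, _, domain = fqdn.partition(".")
    return host, domain


def address_class(number_form: str) -> str:
    """Return the class letter of an IPv4 address in number form."""
    # Shifting the 32-bit value right by 24 bits leaves the first number.
    first = int(ipaddress.IPv4Address(number_form)) >> 24
    if first <= 127:
        return "A"  # one number for the network, three for the host
    if first <= 191:
        return "B"  # two numbers each
    if first <= 223:
        return "C"  # three numbers for the network, one for the host
    return "other"  # 224 and above are not class A, B or C


print(split_name_form("www.acproductions.de"))
print(address_class("192.168.0.1"))
```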
- A download tool, or downloader for short, allows for
the download of (typically large) files. The main advantage over downloads
with most web browsers is that downloaders can stop and resume downloads.
So even if you need to disconnect your Internet link or the connection
gets broken, the download can be completed without having to start at
the beginning.
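Resuming is commonly done with an HTTP "Range" request that asks the server only for the bytes still missing. The following is a minimal sketch under that assumption, not a description of any particular download tool. The function names and file paths are hypothetical, and the server must support range requests for the resume to work.

```python
import os
import urllib.request


def build_resume_request(url: str, partial_path: str) -> urllib.request.Request:
    """Build a request that asks only for the bytes not yet on disk."""
    already = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if already > 0:
        # "bytes=N-" means: send everything from byte offset N onwards.
        request.add_header("Range", f"bytes={already}-")
    return request


def resume_download(url: str, partial_path: str) -> None:
    """Fetch the missing part of the file and append it to partial_path."""
    request = build_resume_request(url, partial_path)
    with urllib.request.urlopen(request) as response:
        # Status 206 means the server honoured the range and sent only the
        # rest. Any other success status means the whole file was sent,
        # so the partial file is overwritten.
        mode = "ab" if response.status == 206 else "wb"
        with open(partial_path, mode) as out:
            while chunk := response.read(64 * 1024):
                out.write(chunk)
```

Calling `resume_download` again after an interruption continues from the size of the partial file instead of starting over.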
- See domain
- Link checker
- A link checker is a program which scans
all the pages of a website for internal and external links that
might be broken. Running such a program on your own web site creates
a number of hits. When other people run link checkers on
their web sites, the link checker might probe external links as well. This
is how some of your pages get hit by a link checker even though you didn't
use one: someone has tested their links to your pages.
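The first half of a link checker's job, collecting a page's links and sorting them into internal and external ones, might look like the sketch below. The class and function names are invented for this example. A real checker would go on to request each collected link to see whether it is broken, which is what produces the hits described above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href targets of all <a> tags on one page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def classify_links(page_url: str, html: str) -> dict[str, list[str]]:
    """Split a page's links into internal and external ones."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    own_host = urlparse(page_url).netloc
    result: dict[str, list[str]] = {"internal": [], "external": []}
    for link in collector.links:
        # A link is internal when it points at the same host as the page.
        kind = "internal" if urlparse(link).netloc == own_host else "external"
        result[kind].append(link)
    return result
```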
- Name form
- IP addresses in name form consist of names separated by
dots (e.g. "www.acproductions.de"). Valid IP addresses in name form
can always be translated into the number form by using the Domain
Name System (DNS), performing a "lookup".
- Number form
- IP addresses in number form consist of four numbers in the
range from 0 to 255 that are commonly written separated by dots
(e.g. "192.168.0.1"). IP addresses in number form can often
be translated into the name form by using the Domain Name System (DNS),
performing a "reverse lookup".
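Both translations are available through Python's standard `socket` module. The sketch below wraps them in helper functions whose names are chosen for this example. The results depend on the resolver and network of the machine running the code, so no output is shown.

```python
import ipaddress
import socket


def is_number_form(address: str) -> bool:
    """True if the string is four dot-separated numbers from 0 to 255."""
    try:
        ipaddress.IPv4Address(address)
        return True
    except ValueError:
        return False


def lookup(name: str) -> str:
    """Translate name form into number form (a DNS lookup)."""
    return socket.gethostbyname(name)


def reverse_lookup(number: str):
    """Translate number form into name form (a reverse lookup).

    Returns None when no name is registered for the address, which is
    why the reverse direction works only "often" rather than always.
    """
    try:
        return socket.gethostbyaddr(number)[0]
    except OSError:
        return None
```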
- Offline reader
- Offline readers fetch pages on the user's behalf for later
reading. Some browsers, such as Microsoft Internet Explorer, offer
offline reading capabilities.
- In this context "platform" is just another word for "operating system"
(on the machine the browser or agent runs on).
- A proxy server is a server between a browser and a web server.
The browser sends a request to the proxy, which in turn forwards
that request to a web server (or even another proxy).
The proxy can maintain an internal cache, making it a "caching proxy"
or "cache" for short. When a request is made, the proxy checks if it has
the page in the cache. If it does, it might perform a short check with the web
server to see if the page is outdated. If the cached copy is still current,
the proxy serves it without having to transfer the whole page from the web
server. This procedure mainly serves two purposes: requests can be served
faster, and the network traffic from the proxy onwards is reduced.
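The cache check can be modelled without any networking. In the sketch below, a stand-in `Origin` class plays the web server and a version number plays the role of the "short check". In real HTTP this check is typically a conditional request that the server answers with "304 Not Modified". All class and method names are made up for the illustration.

```python
class Origin:
    """Stand-in for a web server: maps a URL to (version, body)."""

    def __init__(self, pages):
        self.pages = dict(pages)
        self.full_transfers = 0  # counts how often a whole page was sent

    def current_version(self, url):
        # The "short check": only a version marker travels, not the page.
        return self.pages[url][0]

    def fetch(self, url):
        self.full_transfers += 1
        return self.pages[url]


class CachingProxy:
    """Serve pages from the cache while the cached copy is still current."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, url):
        cached = self.cache.get(url)
        if cached is not None and cached[0] == self.origin.current_version(url):
            return cached[1]  # still current: no full transfer needed
        version, body = self.origin.fetch(url)  # missing or outdated
        self.cache[url] = (version, body)
        return body
```

Requesting the same URL twice causes only one full transfer. A second transfer happens only after the page changes on the origin.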
- A robot is an automated agent that reads and processes web pages
for purposes such as indexing for a search engine. By convention, robots check
the contents of the file "robots.txt" to determine whether their presence is
wanted, and to find out how they should behave.
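A well-behaved robot can perform this check with Python's standard `urllib.robotparser` module. The rules, the agent name "ExampleBot" and the example.com URLs below are invented for the illustration. A real robot would download the file from the site it is about to visit instead of using an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every robot is asked to stay out of
# the /private/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent: str, url: str) -> bool:
    """Return True if the rules above allow this agent to request the URL."""
    return parser.can_fetch(agent, url)


print(may_fetch("ExampleBot", "http://www.example.com/index.html"))
print(may_fetch("ExampleBot", "http://www.example.com/private/data.html"))
```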
- Top Level Domain, TLD
- The last part of an IP address in name form is the top level domain
or TLD. It might denote a class (as in "com" for "commercial") or a country
(as in "pf" for French Polynesia).
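Extracting the TLD is a matter of taking the text after the last dot. The helper below is a small sketch with a made-up name. It assumes its input is already a valid address in name form and does not check whether the result is a registered TLD.

```python
def top_level_domain(name_form: str) -> str:
    """Return the last dot-separated part of a name, in lower case."""
    # A trailing dot, as in "example.com.", is dropped before splitting.
    return name_form.rstrip(".").rsplit(".", 1)[-1].lower()


print(top_level_domain("www.acproductions.de"))
print(top_level_domain("archimedes.math.manoa.uofhawaii.edu"))
```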