Log Analysis Data & Uses for Better Site Development and
Positioning
In our last article on search engine positioning, we mentioned using log analysis
to determine when the search engine spiders visited your site.
We feel that our readers can benefit from more in-depth
help on how to follow that suggestion. In
this article, we will look into log analysis: the hows and
wheres of it, and the tools that make
your log analysis tasks easier. We will explore log analysis
data and how to use it to strengthen your affiliate
strategy.
What is log analysis data?
Log analysis data is raw information supplied to you by your host, or by
third-party tracking sites, that outlines specific information
about your visitors, their traffic patterns, and the pages they viewed
throughout your site. Depending on the source, it can include information on screen
resolution, browser, operating system, and a whole host of
other information relating to a visitor's computer settings.
In raw form, it looks like the following:
66.150.40.221 - - [11/Jan/2004:18:44:54 -0800] "HEAD /html/tutorials/webmaster/index.htm HTTP/1.1" 200 0 "-" "InternetSeer.com"
64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] "GET /robots.txt HTTP/1.0" 404 42586 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
64.68.82.208 - - [11/Jan/2004:19:34:09 -0800] "GET /pics/nav/texttell.swf HTTP/1.0" 200 2109 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
These are only a few lines taken from one of our server
logs. As we said in the last article, paid hosted
sites usually have access to their visitor logs. Yahoo provided
the above information to us. We download our log files
daily, and then use the log analysis data to make
decisions about how we operate our site. We also use the log
analysis data to see when spiders and bots visit our site.
You will notice that Googlebot appears twice and
InternetSeer.com appears once. All of the above
log entries are from spiders and bots.
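As a rough illustration of how entries like these can be split into their parts, here is a minimal Python sketch. It assumes your host writes the same field order as the sample above; other hosts may differ. The names LOG_PATTERN, BOT_MARKERS, parse_line, and is_bot are our own inventions for this example, and the short list of bot markers is only a starting point.

```python
import re

# Field layout assumed from the sample entries above; adjust the pattern
# if your host logs fields in a different order.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately short list of user-agent fragments.
BOT_MARKERS = ("googlebot", "internetseer")


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_bot(entry):
    """True when the user agent contains one of the known bot markers."""
    agent = entry["agent"].lower()
    return any(marker in agent for marker in BOT_MARKERS)


sample = ('64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] '
          '"GET /robots.txt HTTP/1.0" 404 42586 "-" '
          '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)"')

entry = parse_line(sample)
print(entry["ip"], entry["time"], entry["status"], is_bot(entry))
```

Matching on the user agent is only a convenience. Anything can claim to be a given bot, so treat the result as a hint rather than proof.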
Whenever anything makes a query of your site, it will show
up in the log analysis data. There are many important uses
for this log analysis data. By following URL and IP paths
back to the pages they reference, you can find out if someone
is stealing your bandwidth. You can use the log analysis data
to see what sites are linking to you. You can then visit those
sites and get a feel for what kind of visitor visits your
site. The uses for log analysis data are too many to list
and are beyond the scope of this article. Bottom line: log
analysis data is a vital tool for a developing webmaster.
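To give a feel for how you might see which sites link to you, or pull your images into their own pages, the sketch below tallies the referrer field by outside host. The two visitor entries, the 192.0.2.x addresses, and the example.com host names are made up for illustration; only the Googlebot entry comes from our log. The function name external_referrers is our own.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Finds the quoted request, the status and size, then the quoted referrer.
REFERRER_PATTERN = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"')


def external_referrers(lines, own_host):
    """Count requests whose referrer is a host other than own_host."""
    counts = Counter()
    for line in lines:
        match = REFERRER_PATTERN.search(line)
        if not match:
            continue
        host = urlparse(match.group("referrer")).netloc.lower()
        if host and host != own_host:  # a "-" referrer yields an empty host
            counts[host] += 1
    return counts


# Invented sample entries (hypothetical hosts and addresses), plus one
# real Googlebot entry that carries no referrer.
sample_lines = [
    '192.0.2.10 - - [11/Jan/2004:20:01:12 -0800] "GET /pics/logo.gif HTTP/1.1" '
    '200 5120 "http://other-site.example/page.htm" "Mozilla/4.0"',
    '192.0.2.11 - - [11/Jan/2004:20:02:30 -0800] "GET /index.htm HTTP/1.1" '
    '200 9000 "http://www.example.com/links.htm" "Mozilla/4.0"',
    '64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] "GET /robots.txt HTTP/1.0" '
    '404 42586 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

for host, hits in external_referrers(sample_lines, "www.example.com").most_common():
    print(host, hits)
```

An outside host that shows up mostly against image or media files, rather than pages, is a candidate for a closer look at bandwidth use.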
But What Does all that Log Analysis Data Mean?
Remember that each time your site is visited, one of those lines will be generated
in the log analysis report. Anytime a person points a browser
to a URL within your site, it generates another line
in that log analysis data. Anytime you run an HTML validator
on one of your pages, it generates another line in
that log analysis data. Yep, you guessed it: anytime a bot
drops by to visit your site, or anything else related to those
actions happens, another line is added to that log analysis
data. Are you starting to see how important that log analysis
data can be? It is also important to keep in mind that this
logging includes files your own pages call from your server.
64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] "GET /robots.txt HTTP/1.0" 404 42586 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
Let's start with the beginning of the line of log analysis
data, the first field in the entry above. You
know, the numbers separated by dots. That is the IP address of the
visitor. By doing a lookup, you can determine who owns that
IP address, which is usually the visitor's ISP or host. If they are stealing your bandwidth, then further investigation
could reveal whom to contact to file a complaint. It could
also give you the email address of whom to contact to request
that they stop stealing your bandwidth. The log analysis data will
also show you where most of your visitors come from. It will
reveal even more clues to what their habits and interests
are. You could then use that information to adjust your affiliate
programs to fall in line with those habits and interests.
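One simple way to put the address field to work is to count how many requests each address made, then look up the busiest ones. The sketch below does the counting on the three entries from our log. The lookup_host helper is our own name; it wraps Python's standard reverse DNS call, needs a network connection, and is defined here but not run.

```python
import socket
from collections import Counter


def hits_per_ip(lines):
    """Count requests per client address (the first field of each line)."""
    return Counter(line.split(" ", 1)[0] for line in lines if line.strip())


def lookup_host(ip):
    """Reverse DNS lookup; returns None when no host name can be found."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


sample_lines = [
    '66.150.40.221 - - [11/Jan/2004:18:44:54 -0800] "HEAD /html/tutorials/webmaster/index.htm HTTP/1.1" 200 0 "-" "InternetSeer.com"',
    '64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] "GET /robots.txt HTTP/1.0" 404 42586 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '64.68.82.208 - - [11/Jan/2004:19:34:09 -0800] "GET /pics/nav/texttell.swf HTTP/1.0" 200 2109 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

counts = hits_per_ip(sample_lines)
for ip, hits in counts.most_common():
    print(ip, hits)
```

A reverse lookup only gives a host name. Finding the owner's contact details usually takes a separate WHOIS query on the address.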
64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] "GET /robots.txt HTTP/1.0" 404 42586 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
The next part of the log analysis data is the date and time,
the bracketed field in the entry above. You will note
that it contains an offset, which is the server's time
difference from GMT (here -0800, or eight hours behind). If you want to micromanage, then you could use
this information to determine what time might be the best
time to place a particular ad on your site. You could also use
that log analysis data to determine the best time
to take your site down for maintenance or for upgrading content.
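As a sketch of how the date and time field could be used, the example below converts one timestamp to GMT and counts requests by hour of the day in server time. The function names parse_timestamp and hits_by_hour are our own, and the sample entries are cut short after the size field because only the timestamp is read.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# The timestamp is the text between the square brackets.
TIME_PATTERN = re.compile(r"\[([^\]]+)\]")


def parse_timestamp(raw):
    """Parse a log timestamp such as '11/Jan/2004:19:34:08 -0800'.

    %b expects English month abbreviations, so this assumes an English
    or C locale.
    """
    return datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")


def hits_by_hour(lines):
    """Count requests per hour of the day, in the server's local time."""
    hours = Counter()
    for line in lines:
        match = TIME_PATTERN.search(line)
        if match:
            hours[parse_timestamp(match.group(1)).hour] += 1
    return hours


stamp = parse_timestamp("11/Jan/2004:19:34:08 -0800")
print(stamp.astimezone(timezone.utc))  # the same moment expressed in GMT

# Entries truncated after the size field; only the timestamp matters here.
sample_lines = [
    '66.150.40.221 - - [11/Jan/2004:18:44:54 -0800] "HEAD /html/tutorials/webmaster/index.htm HTTP/1.1" 200 0',
    '64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] "GET /robots.txt HTTP/1.0" 404 42586',
    '64.68.82.208 - - [11/Jan/2004:19:34:09 -0800] "GET /pics/nav/texttell.swf HTTP/1.0" 200 2109',
]

hours = hits_by_hour(sample_lines)
for hour in sorted(hours):
    print(hour, hours[hour])
```

Run against a full day or week of entries, the hours with the fewest requests suggest a maintenance window, and the busiest hours suggest when an ad would be seen most.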
64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] "GET /robots.txt HTTP/1.0" 404 42586 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
The next parts of the log analysis data report the request
method, the URL of the requested file, and the protocol specification
("GET /robots.txt HTTP/1.0" in the entry above). Together with
the status code that follows (404 here, meaning the file was not found), you can use
this data to identify missing files on your server, the
paths requested in your HTML, and the pages that use broken
links to link to you. You can then modify your HTML to reflect
the path change, or totally remove the HTML that requests
the file.
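To show how the request fields and the status code might be combined, this sketch lists every requested path that came back as 404 and counts how often each one was asked for. The names REQUEST_PATTERN and missing_files are our own, and the pattern assumes the field layout shown in our sample entries.

```python
import re
from collections import Counter

# Request method, path, and protocol inside quotes, then the status code.
REQUEST_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) (?P<protocol>[^"]+)" (?P<status>\d{3}) '
)


def missing_files(lines):
    """Count requested paths that the server answered with 404."""
    missing = Counter()
    for line in lines:
        match = REQUEST_PATTERN.search(line)
        if match and match.group("status") == "404":
            missing[match.group("path")] += 1
    return missing


sample_lines = [
    '66.150.40.221 - - [11/Jan/2004:18:44:54 -0800] "HEAD /html/tutorials/webmaster/index.htm HTTP/1.1" 200 0 "-" "InternetSeer.com"',
    '64.68.82.208 - - [11/Jan/2004:19:34:08 -0800] "GET /robots.txt HTTP/1.0" 404 42586 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '64.68.82.208 - - [11/Jan/2004:19:34:09 -0800] "GET /pics/nav/texttell.swf HTTP/1.0" 200 2109 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

missing = missing_files(sample_lines)
for path, hits in missing.most_common():
    print(path, hits)
```

In our sample, the only missing file is /robots.txt, which Googlebot asked for and did not find.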
About the Author
James R. Sanders is the owner of Sanders
Consultation Group Plus. He has been a webmaster and website
designer since 1997. He has also been involved in self-employment
ventures since 1992. He is presently a contributing author
of NewbieHangout. His writing is targeted to webmasters, would-be
webmasters, website designers, would-be website designers,
the self-employed, and those researching
solutions to questions associated with design, business operations,
and promotion today. His goal is to provide practical information
based upon his years of experience to help webmasters, website
designers, and self-employed people achieve their goals in
today's competitive global market. You can subscribe to his
free newsletters at SCGP - Newsletter.