QNDstat - a simple Web stats generator

Introduction - What is it ? - How to use it - How it Works - Legalese - Version History - Downloading QNDstat

Introduction

QNDstat started out as a simple hack, just two lines of Perl and a lot of padding that I came up with some time in March, to scan our httpd 1.4 log and produce a page counting the accesses to individual users' Web page directories.

Since then it's grown and grown as I've added code to generate monthly as well as daily stats, check whether the users exist and grab their names, check for HTML pages rather than just hits, create and use a temporary file for the monthly stats (instead of relying on finding the entire month in the logs), archive the stats and present the results graphically.

What is it ?

QNDstat is a Perl script that logs all accesses to users' HTML pages on an httpd server, and produces two HTML files of stats and graphs, one of accesses so far in the month, one of accesses on the latest day logged, with links to the users' default Web pages. It also archives previous months' stats for comparison and reference. This makes most sense if you look at an example; see here for an example of a daily log, and here for a monthly log.

How to use it

You do not need to be a system or Web site administrator to use QNDstat. As long as you can upload and create Web pages on your server, can run Perl programs on this server and have access to the logs created by httpd and the system's password file, you can use QNDstat.

If you cannot manage one of the above, it may be that your system administrator did not think anyone would want access to the files or the ability to run programs, or it may be that restrictions are placed on you or your account for good reason. In either case you will have to find out from your system administrator how you can use QNDstat or whether they are willing to install it for you.

To get QNDstat to work you must modify the settings at the start of the program to match those of your system. The first six are the paths of the httpd log, the password file and the directory where your HTML files live, the names given to the two HTML files created, and the path of a temporary file where QNDstat stores the monthly running totals, which can be anywhere you can write to and read from a file.

The next setting is not required for the stats creation, but is there to provide a quick and easy way to insert the name of the Web server or Internet service provider into the pages generated, e.g. 'Supernet' is the popular name of Hong Kong Supernet, who run the Web site for which QNDstat was originally written.

The last two lines are the URLs of the images to be used to create the graphs. These can be any GIF or JPEG images, though small images which do not vary horizontally work best. If you do not want graphs (i.e. if you want to produce similar stats files to version 1.1) comment out these lines. The URLs can be relative or absolute URLs.

QNDstat is designed to be run once a day. Each time it runs it scans the log for all accesses to users' pages that day, then creates a page from the data, and combines this data with previous days' results to produce the monthly page.

It can be run manually but is best run as a daily cron task. To do this I use the following file:

40 23 * * * /u1/bwijoh00/johnb/progs/qndstat.pl

The 40 and 23 are the minutes and hour at which the program is run, i.e. 23:40 each day. This line is saved in a file cronfile, which is submitted to the cron process by typing crontab cronfile. See the man pages for cron and crontab for further information on how this works.

It can be set to run at any time. If it runs between midnight and 6pm it will process the log entries for the previous day, processing a complete 24 hours each time it runs. But if run between 6pm and midnight it will only process the available entries for the day it runs on. This may be useful if the access log is automatically archived or deleted at or around midnight each night.
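The time-of-day rule above can be sketched in a few lines. This is an illustration in Python, not QNDstat's own Perl; the names day_to_process and CUTOFF_HOUR are hypothetical, and treating exactly 6pm as part of the later window is an assumption, since the text does not say which side the boundary falls on.

```python
from datetime import datetime, date, timedelta

CUTOFF_HOUR = 18  # 6pm; hypothetical name for the boundary described above


def day_to_process(now: datetime) -> date:
    """Return the log date that a run started at `now` would cover."""
    if now.hour < CUTOFF_HOUR:
        # Midnight up to 6pm: yesterday's log is complete, so process it.
        return now.date() - timedelta(days=1)
    # 6pm up to midnight: process what has been logged so far today.
    return now.date()


late_run = day_to_process(datetime(1996, 3, 17, 23, 40))   # the cron time above
early_run = day_to_process(datetime(1996, 3, 18, 2, 0))    # a run after midnight
```

Both runs in the example cover 17 March: the 23:40 run sees that day's entries so far, while the 02:00 run on the 18th sees the complete previous day.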

How it Works

Each time it is run, QNDstat scans through the httpd log looking for all lines that request a URL containing the character '~'. If it finds such a line it checks the userID following the '~' and adds 1 to a running total of hits on this user's pages.
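As a rough illustration of this counting step, here is a minimal Python sketch (QNDstat itself is written in Perl, and its actual pattern is not reproduced on this page). The regular expression, the function name count_user_hits and the sample log lines are all assumptions made for the example; the lines imitate the common log format, and no filtering for HTML pages is applied here.

```python
import re
from collections import Counter

# Assumed pattern: a GET request whose path starts with /~userID.
# The pattern QNDstat really uses is not shown in this document.
USER_HIT = re.compile(r'"GET /~([A-Za-z0-9_]+)')


def count_user_hits(log_lines):
    """Count one hit per log line that requests a /~userID URL."""
    hits = Counter()
    for line in log_lines:
        match = USER_HIT.search(line)
        if match:
            hits[match.group(1)] += 1  # userID is the text after '~'
    return hits


# Hypothetical log lines, invented for illustration only.
sample_log = [
    'host1 - - [17/Mar/1996:10:00:00 +0800] "GET /~alice/ HTTP/1.0" 200 2048',
    'host2 - - [17/Mar/1996:10:05:00 +0800] "GET /~alice/page.html HTTP/1.0" 200 900',
    'host3 - - [17/Mar/1996:10:06:00 +0800] "GET /index.html HTTP/1.0" 200 512',
    'host4 - - [17/Mar/1996:10:07:00 +0800] "GET /~bob/ HTTP/1.0" 200 100',
]
daily = count_user_hits(sample_log)
```

Here the request for /index.html is ignored because it contains no '~', which matches the limitation described later for pages in DocumentRoot.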

It then produces the monthly totals by adding the previous monthly total to the totals just collected, saves the new monthly totals, and collects users' names from the /etc/passwd file. Finally it produces two HTML files, listing the users found in /etc/passwd in order of the number of hits on their pages. At the end of the month it archives the monthly results and starts compiling the monthly statistics anew.
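The merging and ranking described here might look something like the following Python sketch. It is not QNDstat's code: the function names, the sample users and the passwd lines are hypothetical, and it assumes the usual colon-separated passwd layout with the user's name in the fifth field. Saving the totals to the temporary file and the end-of-month archiving are left out.

```python
from collections import Counter


def parse_passwd(lines):
    """Map userID -> name, taking the name from the fifth passwd field."""
    names = {}
    for line in lines:
        fields = line.rstrip("\n").split(":")
        if len(fields) >= 5:
            # The field may carry comma-separated extras; keep the name part.
            names[fields[0]] = fields[4].split(",")[0]
    return names


def update_monthly(previous, daily):
    """Add today's hit counts to the running monthly totals."""
    return Counter(previous) + Counter(daily)


def ranked_users(totals, names):
    """Users present in passwd, most hits first (ties broken by userID)."""
    known = [(user, hits) for user, hits in totals.items() if user in names]
    return sorted(known, key=lambda item: (-item[1], item[0]))


# Hypothetical data for illustration.
previous = {"alice": 40, "bob": 12}
daily = {"alice": 2, "bob": 1, "ghost": 5}
passwd_lines = [
    "alice:x:101:100:Alice Example:/u1/alice:/bin/sh",
    "bob:x:102:100:Bob Example,Room 1:/u1/bob:/bin/sh",
]
names = parse_passwd(passwd_lines)
monthly = update_monthly(previous, daily)
ranking = ranked_users(monthly, names)
```

The userID "ghost" has hits in the log but no passwd entry, so it is counted in the totals yet dropped from the ranking, mirroring the check that users exist.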

If URLs for images are provided it produces a table with the images used to produce a bar graph of the results. To produce the graph from only a single image, without any image-processing programming, it uses the 'WIDTH' and 'HEIGHT' modifiers to the IMG tag. This technique is not supported by all browsers, so graphs produced by QNDstat will look strange when viewed with some older browsers, or with browsers with images turned off.
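To show the idea of stretching one image into a bar, here is a small Python sketch that builds a single table row. It is only an illustration: the function name bar_row, the 300-pixel maximum width, the 12-pixel height and the table layout are assumptions, not the markup QNDstat actually emits.

```python
from html import escape


def bar_row(user_id, hits, max_hits, image_url, max_width=300, height=12):
    """Build one table row whose bar width is proportional to `hits`."""
    if max_hits > 0:
        # Scale to the widest bar, but keep at least one pixel visible.
        width = max(1, round(max_width * hits / max_hits))
    else:
        width = 1
    return (
        f'<TR><TD><A HREF="/~{escape(user_id)}/">{escape(user_id)}</A></TD>'
        f"<TD>{hits}</TD>"
        f'<TD><IMG SRC="{escape(image_url)}" WIDTH={width} HEIGHT={height} ALT=""></TD></TR>'
    )


# "bar.gif" is a hypothetical image URL.
row = bar_row("alice", 50, 100, "bar.gif")
```

Because the browser does the scaling, the same small image serves for every bar, which is why an image that does not vary horizontally stretches cleanly.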

The way it processes the log relies on the common practice of specifying users' HTML directories by the character '~' followed by their name, producing URLs of the form

http://<server>/~<userID>/,
where <server> is a server domain name and <userID> is a user's userID. It creates similar relative URLs, i.e.
/~<userID>/,
for the links on the pages it generates.

By searching for URLs ending in 'htm', 'html' or without any extension it should be logging all hits on users' HTML pages, including the index file opened when a browser requests a directory. It does not log access to other resources, such as images or sounds.
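The extension test can be sketched as below. This is a Python illustration under the assumption that the check looks only at the last path component; the function name is_html_page is hypothetical and the real Perl test may differ in detail.

```python
import posixpath


def is_html_page(url_path):
    """True for paths ending in .htm or .html, or with no extension at all."""
    extension = posixpath.splitext(url_path)[1].lower()
    return extension in ("", ".htm", ".html")


checks = {
    "/~alice/": is_html_page("/~alice/"),                      # directory request
    "/~alice/index.html": is_html_page("/~alice/index.html"),
    "/~alice/notes.htm": is_html_page("/~alice/notes.htm"),
    "/~alice/photo.gif": is_html_page("/~alice/photo.gif"),    # image, not counted
}
```

A directory request such as /~alice/ has no extension, so it is counted as a hit on the index page, while the GIF is rejected.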

One limitation of QNDstat is that it does not log accesses to pages in httpd's DocumentRoot directory, which is often used for the service provider or server maintainer's documents. It also does not log accesses to cgi-bin programs, nor does it interpret any result codes in the log.

QNDstat has only been tested on the log files produced by NCSA httpd versions 1.3 and 1.4. It may or may not work with the logs produced by other servers; I've just not had the chance to test it on them. If it does not work with your server's logs, please mail me a sample (< 100k) of the logs and I will modify QNDstat so it can also process such logs and upload the new version here.

Legalese

QNDstat is being distributed as freeware, but is still (c) 1995 John Blackburne. You can download and use QNDstat, but I retain copyright to it and to its documentation.

You may modify it for your own use, but you may not then distribute the modified copy. The copyright notice, and the code generating the link back to this page, should be retained whatever other modifications are made. You should not upload QNDstat anywhere else on the Web - it is better that you provide a link to this page.

Lastly, QNDstat is provided "as is", without express or implied warranty of any kind. In particular, although I intend to fix any bugs found and add support for other log formats, I do not guarantee I will do this, or do any other work on QNDstat.

Version History

Downloading QNDstat

Click here to view QNDstat, which you can then save as a text file.

© John Blackburne, johnb@hk.super.net, 17th March 1996

