| Web Cache Proxy Effectiveness The Web cache proxy effectiveness is the measurement of the cache 
              hit rates and any resulting traffic and time savings. There are 
              numerous tools available to analyze the information within Web cache 
              proxy logs (e.g., Bradford Barrett's highly recommended Webalizer). 
              However, for our purposes, we wrote a Perl script utilizing Thomas 
              Boutell's GD graphics library (via Lincoln Stein's GD 
              Perl module). Our interest was in the number of accesses, the volume 
              of the accesses, and the elapsed time of accesses. An example of 
              the output for the month of October 2001 is shown in Figure 6.
              The proxy logs (in this instance from squid) provides information 
              such as the timestamp of access, the client IP address, the client 
              user id (if you are using some type of authentication), the URL 
              access, HTTP return code, proxy return code, bytes downloaded, and 
              transaction time in milliseconds. We have calculated the total number 
              of accesses, the volume of accesses, and the transaction time of 
              the accesses from these logs.
              The other information, which can be customized to your requirements, 
              includes further breakdowns based on other criteria such as the 
              URL, country, TLD types, HTTP content types, and user id. This information 
              can be utilized to improve your redirection or pre-fetching tables 
              to further increase the cache hit and savings.
              The Perl script (Listing 1) requires the squid format log files 
              from standard input and generates the HTML report to standard output. 
              (Listings for this article are available from the Sys Admin 
              Web site: http://www.sysadminmag.com.) Additionally, the 
              graphic PNG format files are generated in the current directory. 
              Hence, to utilize this script, change the directory to the Web server 
              directory and then pipe the squid access logs to the script and 
              redirect the output into a .html file. For example:
              
             
$ cd /opt/squid/htdocs
$ cat ../logs/access.log | ../bin/PerfStat.pl > perf.html
The above command will produce a report file (perf.html) with reference 
            to the PNG image files: CHpie.png, CVpie.png, DHpie.png, DVpie.png, 
            PApie.png, PVpie.png, THpie.png, TVpie.png, RThist.png and VDhist.png, 
            which are generated in the current working directory.  In our circumstances, we run a cron job (Listing 2) to feed the 
              proxy logs each day through the program and generate accumulative 
              statistics and roll them over at the end of each month. The cronjob 
              shell script renames the files (adding the yyyymm extension), as 
              well as editing the .html report file, replacing the image file 
              references with the new file names.
              The first part of the Perl script reads from standard input the 
              squid log format records and then also holds in various variables, 
              arrays, and hashes the access counts, byte counts, and time counts 
              categorized in a number of different ways. After all the input has 
              been read, it then writes out the calculations in HTML format tables 
              to standard output. References (embedded links) to PNG graphic image 
              files are also included in this output. The last part of the Perl 
              script loads the graph data into arrays and the GD modules are called 
              to write the graph into the nominated files. Prior to running the 
              Perl script, the reader may need to lead the GD Perl module. The 
              GD Perl module also requires the installation of the GD graphics 
              library and this requires both the libpng and zlib 
              libraries. Otherwise, the graphing code can be commented or stripped 
              out and the Perl script would simply produce the information in 
              HTML table format.
           |