Friday, November 16, 2012

Analyse log files in the pseudo-standard NCSA format

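A quick awk script that analyses log files in the pseudo-standard NCSA format and shows how many times each page is hit and which HTTP status code is returned for each page. Request paths are truncated at the first "=", so URLs that differ only in a query-string value are counted together. The script reads standard input, so pipe the log into it or append the log file name after the closing quote.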
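awk '{
  # $7 is the request path; keep only the part before the first "="
  splitter = index($7, "=") - 1;
  if (splitter == -1)
    splitter = length($7);   # no "=" in the path: keep all of it
  # $9 is the HTTP status code; count each path/status pair
  url = substr($7, 1, splitter) "\t" $9;
  urls[url]++;
}
END {
  # one line per pair: path, status, hit count, separated by tabs
  for (url in urls)
    print url "\t" urls[url];
}'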
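
As a rough usage sketch, the same logic can be wrapped in a shell function and fed a few sample lines. The function name count_hits and the log entries below are made up for illustration, and the sample assumes the usual NCSA layout in which the request path is field 7 and the status code is field 9. Since awk returns array entries in no particular order, the output is piped through sort to put the busiest pages first.

```shell
#!/bin/sh
# count_hits is a hypothetical wrapper name, not part of the original script.
# It reads the files given as arguments, or standard input if there are none.
count_hits() {
  awk '{
    # keep the request path up to (not including) the first "="
    splitter = index($7, "=") - 1
    if (splitter == -1)
      splitter = length($7)
    # count each path/status pair
    urls[substr($7, 1, splitter) "\t" $9]++
  }
  END {
    for (url in urls)
      print url "\t" urls[url]
  }' "$@"
}

# Invented sample entries; sort numerically on the third tab-separated
# column (the hit count), highest first.
count_hits <<'EOF' | sort -t "$(printf '\t')" -k3,3nr
127.0.0.1 - - [16/Nov/2012:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1043
127.0.0.1 - - [16/Nov/2012:10:00:01 +0000] "GET /page.php?id=1 HTTP/1.1" 200 512
127.0.0.1 - - [16/Nov/2012:10:00:02 +0000] "GET /page.php?id=2 HTTP/1.1" 200 530
127.0.0.1 - - [16/Nov/2012:10:00:03 +0000] "GET /missing HTTP/1.1" 404 209
EOF
```

With this sample, the two /page.php requests collapse into a single "/page.php?id" line with a count of 2, because everything from the first "=" onwards is dropped before counting.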