Analyzing the NGINX web server log file

  • Post last modified: August 14, 2024

1.0 Introduction

Given an nginx web server log file, we might like to know who the visitors to our website are. One could go through the log file line by line, but that is tedious. It would be nicer to get one line per client, giving the client's IP address and the number of times it has visited. If that output is then sorted in descending order of the number of visits, we get a list of the most frequent visitors to the website. Here is a script that does this.

2.0 Script

Our script is written in Perl. It uses a hash, keyed by IP address, to hold the number of visits seen from each address. Each line of the log file records one request, so for each line the script increments the count for the address that starts the line. The script is,

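#!/usr/bin/perl
#
# processlog: count requests per client address in the log
# files passed as arguments.
#

use strict;
use warnings;

if (@ARGV < 1) {
    die "Usage: processlog logfile1 [logfile2 [...]]\n";
}

my %count;
while (<>) {
    # The first whitespace-delimited field is assumed to be the
    # client address. Skip lines without one (e.g. blank lines),
    # so that a stale $1 from an earlier match is never counted.
    next unless /(\S+)/;
    $count{$1}++;
}

# Hash keys come back in no particular order.
foreach my $ipaddress (keys %count) {
    print "$ipaddress: $count{$ipaddress}\n";
}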
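
To see the counting step in isolation, here is a minimal sketch that applies the same pattern to a few log lines held in memory. The sub name count_clients and the sample lines are made up for illustration, and the lines assume nginx's default "combined" log format, in which the client address is the first field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how often each first field appears in a list of log lines.
# Returns a reference to a hash of address => count.
sub count_clients {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        # Skip lines with no non-whitespace field (e.g. blank lines).
        next unless $line =~ /(\S+)/;
        $count{$1}++;
    }
    return \%count;
}

# Hypothetical sample lines, made up for illustration.
my @sample = (
    '198.51.100.3 - - [14/Aug/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
    '203.0.113.0 - - [14/Aug/2024:10:00:01 +0000] "GET /a HTTP/1.1" 404 153 "-" "curl/8.0"',
    '',
    '198.51.100.3 - - [14/Aug/2024:10:00:02 +0000] "GET /b HTTP/1.1" 200 98 "-" "curl/8.0"',
);

my $count = count_clients(@sample);
print "$_: $count->{$_}\n" for sort keys %$count;
```

The blank line in the sample is skipped, so the sketch prints 198.51.100.3: 2 followed by 203.0.113.0: 1.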

We can run the processlog script on an access.log file.

$ ./processlog access.log 
2001:db8:fe17:a000:dacb:8aff:fee1:905b: 14
2001:db8:bdde:6480:6868:bc67:87d:64f7: 12
198.51.100.3: 18
2001:db8:938f:85bf:c309:68fd:ae91:2c82: 14
203.0.113.0: 2
...

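This gives the total for each visitor's IP address, but the lines come out in no particular order, since Perl does not keep hash keys sorted. We can pipe the output of the script through the sort command to list the most frequent visitors first. Here -n compares numerically, -r reverses the order, -k2,2 sorts on the count in the second field, and -k1,1 breaks ties on the first field.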

$ ./processlog access.log | sort -nr -k2,2 -k1,1 
198.51.100.45: 10083
203.0.113.22: 7365
192.0.2.31: 5927
203.0.113.44: 3972
198.51.100.10: 3857
...
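
As an alternative to piping through sort, the ordering could also be done in Perl itself. The following is only a sketch, not part of the script above: the sub name sorted_by_visits and the sample counts are hypothetical, and ties are broken by plain string comparison of the addresses, which is not the same tie-break as the sort command line shown earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the keys of a count hash ordered by count, highest first.
# Ties are broken by plain string comparison of the addresses.
sub sorted_by_visits {
    my ($count) = @_;
    return sort { $count->{$b} <=> $count->{$a} or $a cmp $b } keys %$count;
}

# Hypothetical counts, made up for illustration.
my %count = (
    '198.51.100.45' => 10,
    '203.0.113.22'  => 7,
    '192.0.2.31'    => 7,
);

for my $ip (sorted_by_visits(\%count)) {
    print "$ip: $count{$ip}\n";
}
```

With these sample counts the sketch prints 198.51.100.45 first, then 192.0.2.31 and 203.0.113.22, which share a count and are ordered by address.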

Karunesh Johri

Software developer, working with C and Linux.