Stop Trusting JavaScript: Server-Side Log Analysis with AWStats on CentOS 5

The Myth of "100% Accuracy"

If you are relying solely on Google Analytics or simple hit counters, you are flying blind. I've seen it time and again: a marketing manager panics because traffic "dropped," but the server load is through the roof. Why? Because JavaScript-based trackers don't see the botnets, the scrapers, or the users with NoScript enabled.

To really know what is hitting your port 80, you need to parse the raw access logs. In the Nordic hosting market, where bandwidth at NIX (Norwegian Internet Exchange) is at a premium, you can't afford to waste resources on ghost traffic.
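For a rough first look at the raw log before AWStats is even installed, standard tools go a long way. The sketch below assumes the Apache combined log format; `top_user_agents` is a hypothetical helper name, not part of any package.

```shell
# top_user_agents LOGFILE [N]
# Print the N most frequent User-Agent strings (default 10) with hit counts.
# Assumes the combined log format, where the User-Agent is the last quoted field.
top_user_agents() {
    logfile=$1
    limit=${2:-10}
    # Splitting each line on double quotes gives:
    # $2 = request, $4 = referer, $6 = user agent, $7 = empty tail.
    awk -F'"' 'NF >= 7 { print $6 }' "$logfile" \
        | sort | uniq -c | sort -rn | head -n "$limit"
}
```

Something like `top_user_agents /var/log/httpd/access_log 20` would list the twenty busiest agents. Requests containing escaped quotes will confuse the field split, so treat the output as an estimate.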

Enter AWStats. It’s ugly, it’s written in Perl, and it’s absolutely essential.

Prerequisites and The I/O Bottleneck

Before we touch the config, a warning. Log analysis is brutal on disk I/O. I once crashed a client's e-commerce site just by running a report on a 4GB log file during peak hours. The disk heads were thrashing so hard the database timed out.
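One way to soften that kind of I/O hit is to run the parser at the lowest CPU and disk priority. This is a minimal sketch, not a guarantee: `run_low_priority` is a hypothetical wrapper name, and the `ionice` idle class only helps if your kernel and I/O scheduler (CFQ, for example) honor it. On older kernels the idle class may also require root.

```shell
# run_low_priority COMMAND [ARGS...]
# Run a command at the lowest CPU priority and, where supported, in the
# idle I/O scheduling class so it yields the disk to other processes.
run_low_priority() {
    # Probe ionice first and fall back to nice alone if it is missing or fails.
    if command -v ionice >/dev/null 2>&1 && ionice -c3 true 2>/dev/null; then
        ionice -c3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}

# Example (not executed here): update the stats without starving the database.
# run_low_priority /usr/bin/perl /var/www/awstats/awstats.pl -config=coolvds-demo.no -update
```

Lower priority makes the report slower, not cheaper. The total amount of disk reading stays the same, so scheduling the run off-peak still matters.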

This is where your choice of infrastructure matters. Most cheap VPS providers oversell their storage. If you are on a crowded node, your iowait will skyrocket during parsing. This is why we built CoolVDS on Enterprise SAS RAID-10 arrays with dedicated Xen resources. You need that spindle speed when crunching millions of lines of text.

Step 1: Installation on CentOS 5.3

Don't compile from source unless you enjoy dependency hell. Use the RPMforge repository.

rpm -Uhv http://apt.sw.be/redhat/el5/en/i386/rpmforge/RPMS/rpmforge-release-0.3.6-1.el5.rf.i386.rpm
yum install awstats

Step 2: Apache Configuration

AWStats needs the "Combined" log format to track user agents and referrers. Open your httpd.conf:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog logs/access_log combined
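After reloading Apache and letting a few requests come in, it is worth confirming that new log lines really carry the referrer and user agent. The check below is a rough sketch: `looks_combined` is a hypothetical helper, and it only inspects the last line of the file.

```shell
# looks_combined LOGFILE
# Succeed (exit 0) if the last line of LOGFILE ends with status, bytes and
# two quoted fields ("referer" "user-agent"), as the combined format produces.
looks_combined() {
    tail -n 1 "$1" | grep -Eq '" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'
}
```

A call such as `looks_combined /var/log/httpd/access_log && echo "combined format detected"` prints the message only when the newest entry matches. Lines written before the config change will still be in the old format.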

Step 3: The Critical Config

Copy the model config file to your domain-specific file:

cp /etc/awstats/awstats.model.conf /etc/awstats/awstats.coolvds-demo.no.conf

Edit these lines specifically:

  • LogFile="/var/log/httpd/access_log" (Point this to your actual log)
  • LogFormat=1 (This corresponds to Apache Combined)
  • SiteDomain="coolvds-demo.no"
  • DNSLookup=1 (Warning: This slows down processing significantly. Only enable if you have local caching DNS or low latency to Nordic upstream providers).
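
If you manage more than one site, the same four edits can be scripted instead of typed. This is a sketch under two assumptions: each directive already exists as an uncommented `KEY=...` line in the copied model file, and the values contain no `|` or `&` characters. `set_awstats_option` is a hypothetical helper name.

```shell
# set_awstats_option FILE KEY VALUE
# Replace an existing "KEY=..." line in an AWStats config file with KEY=VALUE.
# VALUE is written verbatim, so include the quotes for string settings.
set_awstats_option() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # Use | as the sed delimiter because log paths contain slashes.
    # Write back with cat so the original file keeps its owner and mode.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (not executed here):
# conf=/etc/awstats/awstats.coolvds-demo.no.conf
# set_awstats_option "$conf" LogFile '"/var/log/httpd/access_log"'
# set_awstats_option "$conf" LogFormat 1
# set_awstats_option "$conf" SiteDomain '"coolvds-demo.no"'
# set_awstats_option "$conf" DNSLookup 1
```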

Pro Tip: In Norway, the Datatilsynet (Data Inspectorate) is strict about privacy under the Personal Data Act. If you don't strictly need full IP addresses for security auditing, consider masking the last octet before the addresses end up in your reports.
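
One way to do the masking, independent of anything AWStats itself offers, is to preprocess the log with sed before it is analyzed. The sketch below is that swapped-in approach, not an AWStats feature: `mask_last_octet` is a hypothetical helper, and it only handles an IPv4 address at the start of each line.

```shell
# mask_last_octet
# Read log lines on stdin and zero the last octet of a leading IPv4 address.
# Lines that do not start with an IPv4 address pass through unchanged.
mask_last_octet() {
    sed 's/^\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\)\.[0-9]\{1,3\} /\1.0 /'
}

# Example (not executed here): write a masked copy for analysis.
# mask_last_octet < /var/log/httpd/access_log > /var/log/httpd/access_log.masked
```

Keep in mind that masking merges every visitor in the same /24 into one host, so unique-visitor counts will drop, and that DNSLookup has little value on masked addresses.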

Step 4: Automation and Updates

Don't run this manually. Add it to your crontab. But be smart—don't run it at the top of the hour when everyone else's cron jobs trigger. Offset it.

15 * * * * /usr/bin/perl /var/www/awstats/awstats.pl -config=coolvds-demo.no -update > /dev/null
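
If you run AWStats on several servers, picking the offset by hand gets old. A rough sketch of one approach is to derive a stable minute from the hostname, so each box lands on its own slot. `cron_minute` is a hypothetical helper, and using `cksum` as the hash is simply an assumption that it is good enough for spreading jobs.

```shell
# cron_minute NAME
# Map a name (for example the hostname) to a stable minute between 0 and 59.
cron_minute() {
    # cksum prints "<crc> <byte count>"; keep the CRC and reduce it modulo 60.
    crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $(( crc % 60 ))
}

# Example (not executed here): print a crontab line for this host.
# echo "$(cron_minute "$(hostname)") * * * * /usr/bin/perl /var/www/awstats/awstats.pl -config=coolvds-demo.no -update > /dev/null"
```

The same name always yields the same minute, so regenerating the crontab does not move the job around.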

Performance: The CoolVDS Advantage

When you run that Perl script, your CPU usage will spike. On standard OpenVZ containers, this is where you run into CPU contention, loosely called "CPU steal": the host node hands you fewer cycles because your neighbor is also compiling a kernel. The result is reports that take 20 minutes to generate.
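
To see whether waiting on disk or on the host is actually eating your cycles, the kernel's counters in /proc/stat are a reasonable starting point. The sketch below assumes the usual field order of the aggregate cpu line (user, nice, system, idle, iowait, irq, softirq, steal); `cpu_wait_and_steal` is a hypothetical helper, and whether the steal counter is populated depends on the virtualization platform.

```shell
# cpu_wait_and_steal
# Read an aggregate "cpu" line in /proc/stat format on stdin and print the
# iowait and steal shares of all CPU time counted since boot, as percentages.
cpu_wait_and_steal() {
    awk '$1 == "cpu" {
        total = 0
        # Fields 2..9: user nice system idle iowait irq softirq steal
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        if (total > 0)
            printf "iowait=%.1f%% steal=%.1f%%\n", 100 * $6 / total, 100 * $9 / total
    }'
}

# Example (not executed here):
# cpu_wait_and_steal < /proc/stat
```

These are averages since boot, so a short spike during log parsing barely moves them. For a live view while the report runs, watch the `wa` and `st` columns of `vmstat 1` instead.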

At CoolVDS, we use Xen virtualization. This means your RAM and CPU cycles are hard-allocated. When you crunch logs, you get the raw power of the core, not a timeshare slice. Plus, our low-latency connection to Oslo means reverse DNS lookups for Norwegian traffic resolve instantly.

Conclusion

Stop guessing. Install AWStats today and see who is actually visiting your site. If the log parsing chokes your current server, it's time to move to a platform that handles high I/O without sweating.

Need a test environment? Spin up a CentOS 5 instance on CoolVDS in under 2 minutes.
