The Truth is in the Logs: High-Performance AWStats Setup on Linux
Let’s be honest for a second. If you are relying solely on Google Analytics or any other JavaScript-based tracker, you are flying blind. Between the rise of AdBlock, NoScript, and corporate firewalls stripping tracking pixels, you are likely missing 10% to 20% of your actual traffic. As a sysadmin, I trust one thing: the raw access log. It doesn't lie, it doesn't get blocked, and it records every single request that reaches the server.
But raw logs are ugly. You need a parser that can chew through gigabytes of text without choking your CPU. That is where AWStats comes in. It is old school, written in Perl, and arguably still the best tool we have in 2012 for granular, server-side traffic analysis. In this guide, I will walk you through a battle-hardened configuration for CentOS and Debian systems that respects your hardware resources and keeps Datatilsynet off your back.
Why Server-Side Analytics Matter in Norway
Here in the Nordics, privacy isn't just a suggestion; it is a mandate. With the Personopplysningsloven (Personal Data Act) strictly enforced by Datatilsynet, shipping all your user data to third-party US servers can be a legal grey area. Keeping your analytics in-house on a server physically located in Oslo or nearby ensures you retain data sovereignty. Plus, when you control the parser, you control the data retention policy.
Step 1: Installation (The Easy Part)
First, ensure you have the EPEL repository enabled if you are on CentOS. On Debian and Ubuntu, the package is available from the standard repositories. We need Perl, as AWStats is a Perl script at its core.
For CentOS 6 / RHEL 6:
yum install awstats perl
For Debian 6 (Squeeze) / Ubuntu 12.04:
apt-get install awstats
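Once installed, the default configuration is usually located in /etc/awstats/. Do not edit the model file directly. Copy it to a new file named after your domain. (On Debian the shipped template is typically /etc/awstats/awstats.conf rather than awstats.model.conf, so adjust the source path accordingly.)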
cp /etc/awstats/awstats.model.conf /etc/awstats/awstats.coolvds.com.conf
Step 2: Configuration for High-Traffic Sites
This is where most people fail. They leave the defaults, and the first time the cron job runs on a 2GB log file, the server load spikes and the OOM killer starts eyeing your MySQL process. We need to be surgical.
Open your new config file and adjust these critical parameters:
# /etc/awstats/awstats.coolvds.com.conf
# Point this to your actual log file.
# If you use logrotate, make sure this points to the live log or handle rotation carefully.
LogFile="/var/log/nginx/access.log"
# W = web server log.
LogType=W
# 1 = Apache combined/Nginx combined.
# Do not use the 'common' log format; it lacks user agents and referrers.
LogFormat=1
# DOMAIN SETUP
SiteDomain="coolvds.com"
HostAliases="www.coolvds.com localhost 127.0.0.1"
# PERFORMANCE TUNING (CRITICAL)
# Turn this OFF. Reverse DNS lookups on every IP will kill your processing time.
# You can resolve IPs later or use a separate plugin if absolutely necessary.
DNSLookup=0
Pro Tip: Setting DNSLookup=0 reduced my processing time from 45 minutes to 2 minutes on a client's high-traffic e-commerce site last week. Unless you strictly need to know if a visitor is from Telenor or NextGenTel by hostname, keep it off. GeoIP plugins are a much faster alternative for location data.
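If you manage more than a couple of domains, the Step 2 edits can be scripted instead of typed by hand. What follows is only a minimal sketch: set_awstats_param is a hypothetical helper name, not something AWStats ships, and it assumes plain Key=Value lines like those in the stock template.

```shell
#!/bin/sh
# Hypothetical helper (not part of AWStats): set KEY=VALUE in a config file.
# An existing uncommented "KEY=..." line is replaced; otherwise the
# setting is appended at the end. Commented-out lines are left alone.
set_awstats_param() {
    sap_conf="$1"; sap_key="$2"; sap_value="$3"
    sap_tmp="$(mktemp)" || return 1
    sap_status=0
    awk -v k="$sap_key" -v v="$sap_value" '
        $0 ~ ("^[ \t]*" k "[ \t]*=") { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$sap_conf" > "$sap_tmp" || sap_status=$?
    # Write back with cat so the original file keeps its owner and mode.
    if [ "$sap_status" -eq 0 ]; then
        cat "$sap_tmp" > "$sap_conf" || sap_status=$?
    fi
    rm -f "$sap_tmp"
    return "$sap_status"
}

# Example calls (commented out so sourcing this file changes nothing):
# conf=/etc/awstats/awstats.coolvds.com.conf
# set_awstats_param "$conf" LogFile '"/var/log/nginx/access.log"'
# set_awstats_param "$conf" LogFormat 1
# set_awstats_param "$conf" DNSLookup 0
```

Because awk -v interprets backslash escapes, this sketch is only suitable for simple values such as the paths and numbers used in this guide.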
Step 3: The Nginx Caveat
Apache users usually have it easy because AWStats defaults to Apache's structure. If you are running Nginx (which you should be for static content performance), you need to ensure your nginx.conf generates a compatible log format.
Add this to your http block in /etc/nginx/nginx.conf:
log_format combined_awstats '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
access_log /var/log/nginx/access.log combined_awstats;
This matches the Apache "combined" format field for field (it is the same layout as Nginx's predefined combined format), so AWStats with LogFormat=1 parses every field correctly.
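Before pointing AWStats at a log, it is worth checking that the lines really have the combined layout. Here is a rough sketch of such a check. The function name count_non_combined is hypothetical, and the regular expression is my own approximation of the combined format, not the pattern AWStats uses internally.

```shell
#!/bin/sh
# Hypothetical helper: print how many lines of a log file do NOT start
# with the fields of the "combined" format:
#   host ident user [time] "request" status bytes "referer" "user-agent"
count_non_combined() {
    # grep -c exits with status 1 when the count is 0; only a status
    # above 1 (for example an unreadable file) is treated as an error.
    cnc_count="$(grep -Evc '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+] "[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"' "$1")" \
        || [ $? -eq 1 ] || return 2
    echo "$cnc_count"
}

# Example (commented out):
# count_non_combined /var/log/nginx/access.log
```

A result of 0 means every line looked like combined format. Anything else suggests an older section of the log was written with a different log_format and may be dropped by the parser.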
Step 4: Automation and Security
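You do not want to run the update manually. Set up a cron job to update the statistics every hour. Open your crontab with crontab -e and add the line below. The path shown is the Debian/Ubuntu location; on CentOS with the EPEL package, awstats.pl typically lives under /usr/share/awstats/wwwroot/cgi-bin/ instead.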
0 * * * * /usr/lib/cgi-bin/awstats.pl -config=coolvds.com -update > /dev/null
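On very large logs an update can take longer than an hour, and two overlapping runs will fight over the same disk. One way to guard against that is a small wrapper that skips a run while the previous one is still active and lowers CPU priority with nice. This is a sketch under assumptions: run_exclusive and the lock path are hypothetical names, and the mkdir-based lock is simply a portable choice (flock from util-linux is a common alternative).

```shell
#!/bin/sh
# Hypothetical wrapper: run a command at low CPU priority unless a
# previous run still holds the lock directory.
run_exclusive() {
    re_lock="$1"; shift
    # mkdir is atomic: it succeeds for exactly one caller at a time.
    if ! mkdir "$re_lock" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75   # arbitrary non-zero status meaning "skipped"
    fi
    re_status=0
    nice -n 19 "$@" || re_status=$?
    rmdir "$re_lock"
    return "$re_status"
}

# Example for a script called from cron (commented out; paths are assumptions):
# run_exclusive /var/lock/awstats-coolvds.lock \
#     /usr/lib/cgi-bin/awstats.pl -config=coolvds.com -update > /dev/null
```

If the process is killed hard, the lock directory stays behind and has to be removed by hand, so treat this as a starting point only.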
Security Warning: AWStats exposes a lot of internal data. Do not leave the web interface open to the public internet. Use basic authentication in your web server to restrict access.
Securing via .htaccess (Apache):
AuthType Basic
AuthName "Restricted Access"
AuthUserFile /etc/awstats/.htpasswd
Require valid-user
The Hardware Reality: I/O Bottlenecks
Here is the reality check: parsing logs is an I/O-intensive operation. It reads massive files and writes thousands of small stats files. On a traditional mechanical hard drive (HDD), this "thrashing" can cause your website's latency to spike while stats are calculating. I have seen web servers hang for 10 minutes during log rotation because the disk queue was saturated.
This is why the underlying infrastructure matters. At CoolVDS, we have moved aggressively to Pure SSD storage for our virtualization clusters. The random read/write speeds of SSDs are orders of magnitude faster than SAS or SATA spindles.
If you are hosting a site targeting Norway, you also need to consider network latency. Routing traffic through Frankfurt or London adds milliseconds. Hosting on a VPS with direct peering to NIX (Norwegian Internet Exchange) ensures your local users get that snappy, instant-load feel.
Final Thoughts
AWStats might look like it's from 2003, but in 2012, it remains a vital tool in the sysadmin's arsenal. It provides visibility that JavaScript simply cannot match. Just remember: log analysis is heavy. Do not try to run this on an undersized VPS with shared mechanical storage, or you will regret it when your site crawls to a halt every hour on the hour.
Need a platform that can crunch logs without sweating? Deploy an SSD-powered instance on CoolVDS today and keep your data fast, local, and secure.