The Lie of the JavaScript Tag
If you are relying solely on Google Analytics or Piwik JavaScript tags to understand your server traffic, you are flying blind. I recently audited a high-traffic e-commerce site running Magento 1.7. Their marketing dashboard showed a smooth traffic curve, yet the server load average was spiking above 20.0 every evening. Why? Because JavaScript tags don't execute when a bot hits your site. They don't fire for users with NoScript enabled (a growing demographic among privacy-conscious Europeans). And they certainly don't tell you when a rogue script is hammering your login.php.
The truth resides in one place: /var/log/httpd/access_log. But reading raw text files with grep and awk is manual labor we don't have time for. Today, we are going to deploy AWStats 7.0 on CentOS 6. We will configure it not just for pretty graphs, but for forensic-level traffic analysis that adheres to strict Norwegian data standards.
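To make the "manual labor" concrete, here is a minimal sketch of the kind of one-off query AWStats replaces: finding which clients hammer a single URL. It assumes the Apache common or combined log format (client IP in field 1, request path in field 7). The function name top_clients_for_path is made up for illustration.

```shell
# Hypothetical helper: list the ten clients that request a given path most often.
# Assumes Apache common/combined format: field 1 = client IP, field 7 = URL path.
# The match is exact, so a path with a query string counts as a different URL.
top_clients_for_path() {
    path="$1"
    awk -v p="$path" '$7 == p { hits[$1]++ }
        END { for (ip in hits) print hits[ip], ip }' |
        sort -k1,1nr -k2,2 |
        head -n 10
}
```

It reads the log on standard input, for example: top_clients_for_path /login.php < /var/log/httpd/access_log. Each output line is a hit count followed by the client IP. This is fine for a single question, but it gets tedious once you need it every day for every URL.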
Prerequisites and The I/O Bottleneck
Before we touch the config files, we need to address the elephant in the rack: Disk I/O. AWStats is a Perl script that parses massive text files line by line. On traditional SATA or even 15k SAS drives, parsing a 2GB log file can saturate the disk, drive I/O wait (iowait) up, and degrade web server performance.
Pro Tip: Never run heavy log analysis on the same physical spindle as your MySQL database. If you are still on mechanical drives, partition your logs to a separate disk. Better yet, move to a provider offering Pure SSD VPS solutions. At CoolVDS, we use high-performance SSD arrays which make log parsing trivial, reducing parse times from minutes to seconds compared to traditional hosting.
Step 1: Installation on CentOS 6 / RHEL 6
We will stick to the EPEL repositories. Compiling from source is fine, but for maintainability across a fleet of servers, yum is king.
# Install EPEL repository if you haven't already
rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
# Install AWStats
yum install awstats
Step 2: Configuring Apache for Data Richness
The default Apache common log format is useless for performance debugging. We need to define a custom format that captures the time taken to serve the request. This is critical for identifying slow backend queries.
Open your /etc/httpd/conf/httpd.conf and verify your LogFormat:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
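# ADD THIS LINE for performance tracking:
LogFormat "%h %l %u %t \"%r\" %>s %b %D" common_with_time
# Then reference it from CustomLog (replacing the default "combined" entry),
# otherwise the format is defined but never written:
CustomLog logs/access_log common_with_time
The %D directive logs the time taken to serve the request in microseconds. With that field in the log, response times can be reported alongside traffic, which is a lifesaver when debugging Magento or heavy Drupal installations. Note that common_with_time drops the Referer and User-Agent fields; if you want to keep referrer and browser data, append %D to the end of the combined format instead.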
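As a quick sanity check that %D is actually being written, the field can be read straight from the log. The sketch below assumes the common_with_time format defined above, where %D is the last field on each line and the request path is field 7. The function name slowest_requests is hypothetical.

```shell
# Hypothetical helper: print the N slowest requests as "<milliseconds> <path>".
# Assumes %D (microseconds) is the last field and the URL path is field 7.
# Lines whose last field is not a plain number are skipped.
slowest_requests() {
    limit="${1:-10}"
    awk '$NF ~ /^[0-9]+$/ { printf "%d %s\n", $NF / 1000, $7 }' |
        sort -k1,1nr |
        head -n "$limit"
}
```

Run it as slowest_requests 20 < /var/log/httpd/access_log. Times are truncated to whole milliseconds, so anything faster than 1 ms shows up as 0.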
Step 3: Configuring awstats.conf
Copy the model configuration file for your specific domain.
cp /etc/awstats/awstats.model.conf /etc/awstats/awstats.coolvds-demo.no.conf
vi /etc/awstats/awstats.coolvds-demo.no.conf
There are hundreds of settings here, but these are the ones that separate the amateurs from the pros:
# Point to your actual log file
LogFile="/var/log/httpd/access_log"
# Set to '1' if you want to update stats from the web browser (Secure this with htaccess!)
AllowToUpdateStatsFromBrowser=0
# CRITICAL: Disable DNS Lookups for performance
# Doing reverse DNS on 100,000 IPs will kill your server and lag the report.
DNSLookup=0
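# Report response times using the %D field we added in Apache.
# AWStats must be told where that field sits: %extra1 below maps to the
# trailing %D of common_with_time, so keep the field order identical to the
# Apache LogFormat. This replaces the existing LogFormat line in the file.
LogFormat="%host %other %logname %time1 %methodurl %code %bytesd %extra1"
ExtraSectionName1="Time to serve"
ExtraSectionCodeFilter1="200 304"
ExtraSectionCondition1=""
ExtraSectionFirstColumnTitle1="Time in microseconds"
ExtraSectionFirstColumnValues1="extra1,([0-9]+)"
ExtraSectionStatTypes1=H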
Handling Norwegian Privacy Laws (Datatilsynet)
Operating in Norway means respecting the Personopplysningsloven. Storing full IP addresses can be a gray area depending on your data retention policy. If you need to stay strictly compliant and minimize liability, you can mask the last octet of each IP address before the log is analyzed, though this limits your ability to block specific attackers.
If you choose to log full IPs for security (which is a valid legitimate interest), ensure your server is physically located in a jurisdiction you trust. Our CoolVDS nodes in Oslo offer low latency to the NIX (Norwegian Internet Exchange) while keeping data within Norwegian legal borders.
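If you go the masking route, one way to do it is to filter the log before AWStats reads it. The following is a minimal sketch, not a compliance guarantee: it assumes the client IP is the first field of each line (as in the Apache formats above), and the function name anonymize_ipv4 is hypothetical.

```shell
# Hypothetical filter: zero the last octet of the IPv4 address that starts each line.
# Assumes the client IP is the first field, as in Apache common/combined logs.
# IPv6 addresses and hostnames pass through unchanged.
anonymize_ipv4() {
    sed 's/^\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\)\.[0-9]\{1,3\} /\1.0 /'
}
```

For example, anonymize_ipv4 < /var/log/httpd/access_log > /var/log/httpd/access_log.anon writes a masked copy (the .anon filename is only an illustration) that LogFile can point to. Keep in mind that the original log still holds the full addresses until it is rotated or deleted.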
Step 4: Automation and Security
Don't rely on browser updates. Set up a cron job to update the statistics every hour. This keeps the processing load distributed rather than running a massive parse job once a day.
#!/bin/bash
# Save as /etc/cron.hourly/awstats (the shebang must be the first line of the file)
/usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=coolvds-demo.no -update > /dev/null
Make sure the script is executable: chmod +x /etc/cron.hourly/awstats.
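On slow storage an hourly run can still be going when the next one starts, and two parsers competing for the same disk is the opposite of what we want. A minimal sketch of a guard against that follows; it relies on mkdir being atomic, and both the function name run_locked and the lock path in the usage note are assumptions for illustration.

```shell
# Hypothetical wrapper: run a command only if no earlier run still holds the lock.
# mkdir is atomic, so only one caller can create the lock directory at a time.
run_locked() {
    lockdir="$1"
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lockdir"
        return "$status"
    fi
    echo "previous run still active, skipping" >&2
    return 75   # EX_TEMPFAIL from sysexits.h
}
```

Inside the cron script, the update line would then become run_locked /var/run/awstats-update.lock /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=coolvds-demo.no -update > /dev/null. One caveat of this simple approach: if the process is killed mid-run, the lock directory stays behind and has to be removed by hand.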
Finally, secure the reporting interface. You do not want your competitors seeing your traffic stats. Use Apache's .htaccess to require a login and, optionally, to restrict access to your management IP range.
# /var/www/awstats/.htaccess
AuthType Basic
AuthName "Restricted Access"
AuthUserFile /etc/awstats/htpasswd
Require valid-user
# Optional: Restrict by IP (replace the placeholder with your office IP).
# Apache does not allow comments on the same line as a directive.
Order Deny,Allow
Deny from all
Allow from 81.x.x.x
The Verdict
AWStats gives you the raw reality of your server's life. It shows the brute-force attempts on SSH (if you parse secure logs), the 404 errors generated by broken links, and the bandwidth usage that JavaScript trackers ignore.
However, parsing logs is resource-intensive. On legacy VPS providers overselling their HDD storage, I've seen awstats.pl processes hang for 20 minutes, causing load spikes that affect the live site. Efficiency isn't just about code; it's about infrastructure. Whether you are hosting a high-traffic blog or a critical business application, ensure your underlying storage can handle the I/O thrashing of log analysis.
Don't let slow I/O kill your insights. Deploy a test instance on CoolVDS today and experience the difference high-speed SSD storage makes for system administration tasks.