Stop Guessing, Start Grepping
If you rely solely on JavaScript-based analytics, you are flying blind. Users disable scripts, mobile browsers on early smart devices time out, and bots simply ignore your tracking code. I recently audited a client's e-commerce setup running on a generic budget host. Their analytics dashboard showed a steady 500 visits a day. Their server load, however, suggested they were hammering the CPU like they were hosting the Eurovision finals.
The culprit? A scraper botnet harvesting their pricing data every ten minutes. JavaScript analytics never fired. The server logs, however, don't lie. This tutorial covers setting up AWStats on CentOS 6 to get a granular, forensic view of your traffic. We will focus on doing this efficiently, so log parsing doesn't eat up the I/O cycles your database needs.
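To make the "start grepping" idea concrete, here is a minimal sketch of the kind of one-off check that exposes a scraper like this: count requests per client address and list the busiest ones. The function name top_clients is hypothetical, and the sketch assumes the client address is the first whitespace-separated field on each line, as it is in Apache's common and combined formats.

```shell
# top_clients LOGFILE [N]
# Print the N busiest client addresses as "hits address", busiest first.
# Hypothetical helper: assumes the client address is the first field of each line.
top_clients() {
    logfile=$1
    limit=${2:-10}
    # Tally hits per address in awk, then sort numerically, largest first.
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' "$logfile" |
        sort -rn |
        head -n "$limit"
}
```

Something like top_clients /var/log/httpd/access_log 20 would then list the twenty noisiest addresses. A single address with thousands of hits that never appears in your JavaScript analytics is a good candidate for a closer look.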
The Prerequisites: Architecture Matters
Log analysis is heavy on disk I/O. When you run a parsing script against a 5GB access_log, you are reading thousands of lines per second. On oversold hosting platforms using standard SATA drives, this operation causes iowait spikes that degrade your web server's performance. Your site slows down just because you wanted to know who visited it.
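One way to soften that iowait hit, sketched here under some assumptions, is to run the parser at the lowest CPU and I/O priority. The wrapper name run_low_priority is hypothetical. It uses ionice's idle class when the tool is present and usable, and falls back to plain nice otherwise. Note that I/O priorities only have an effect with schedulers that honor them, such as CFQ, so treat this as a mitigation rather than a guarantee.

```shell
# run_low_priority COMMAND [ARGS...]
# Run a command with the lowest CPU priority and, where available, idle I/O priority.
# Hypothetical wrapper; the ionice idle class only matters on I/O schedulers
# that honor priorities (CFQ, for example).
run_low_priority() {
    # Probe first so a missing or restricted ionice does not block the real command.
    if command -v ionice >/dev/null 2>&1 && ionice -c3 true >/dev/null 2>&1; then
        ionice -c3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}
```

Any log parsing command can be passed through it unchanged, for example run_low_priority followed by the command you would normally type.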
Pro Tip: Always separate your log processing from your production hours. Configure your cron jobs to run at 03:00 CET. If you are on CoolVDS, our KVM virtualization ensures your disk I/O is isolated, and our Enterprise SAS 15k RAID-10 arrays can chew through log files without stalling your MySQL queries.
Step 1: Installation on CentOS 6
We will use the EPEL repository. It’s cleaner than compiling from source and makes patching easier.
rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-5.noarch.rpm
yum install awstats
Step 2: Configuration for Accuracy
AWStats needs to know exactly where your Apache or Nginx logs are. Open the configuration file usually found in /etc/awstats/.
vi /etc/awstats/awstats.yourdomain.conf
Find the LogFile directive. If you are running a standard Apache setup, it should point here:
LogFile="/var/log/httpd/access_log"
However, the real power comes from the LogFormat directive. Ensure Apache is writing the 'combined' log format and that AWStats is set to LogFormat=1, which is its setting for combined logs, so it can track User Agents and Referrers. This is critical for spotting the difference between a potential customer in Oslo and a scraper bot in Shenzhen.
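As a quick sanity check that the log really carries User-Agent data, the following sketch tallies the most frequent agents straight from the file. The function name top_user_agents is hypothetical. It assumes combined-format lines, where splitting on double quotes puts the user agent in the sixth field, and it will miscount lines whose request or referrer contains an escaped quote.

```shell
# top_user_agents LOGFILE [N]
# Print the N most frequent User-Agent strings as "hits<TAB>agent", busiest first.
# Hypothetical helper: assumes Apache combined-format lines.
top_user_agents() {
    logfile=$1
    limit=${2:-10}
    # Split on double quotes: $2 is the request, $4 the referrer, $6 the user agent.
    # Lines with fewer than 7 fields (common format, for example) are skipped.
    awk -F'"' 'NF >= 7 { hits[$6]++ }
               END { for (ua in hits) printf "%d\t%s\n", hits[ua], ua }' "$logfile" |
        sort -rn |
        head -n "$limit"
}
```

If this prints nothing for a busy log, the server is probably still writing the common format, and AWStats will have no agent or referrer data to report.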
Step 3: Security and The Norwegian Context
By default, AWStats is accessible via a web folder. You must secure this. In Norway, Datatilsynet takes data privacy seriously under the Personal Data Act. You do not want your server logs—which contain IP addresses—exposed to the public web.
Secure the directory in your Apache config:
<Directory /usr/share/awstats/wwwroot>
Order deny,allow
Deny from all
Allow from 127.0.0.1
# Your office IP (Apache does not allow comments on the same line as a directive)
Allow from 80.212.x.x
AuthType Basic
AuthName "Restricted Statistics"
AuthUserFile /etc/awstats/htpasswd
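Require valid-user
</Directory>
This ensures only authorized personnel can view traffic data. Because Apache 2.2 defaults to Satisfy All, a visitor must both connect from an allowed address and supply a valid password.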
Why "Free" Hosting Fails at Forensics
Many providers use OpenVZ or other container-based virtualization where the kernel is shared. If a neighbor on your node decides to parse a massive log file, your disk performance tanks. We built CoolVDS on KVM (Kernel-based Virtual Machine). This guarantees that your RAM and disk operations are yours alone. When you parse logs to find out why your bandwidth bill is high, the parsing itself shouldn't take down the server.
Next Steps
Once configured, set up a daily cron job to update your stats:
0 3 * * * /usr/share/awstats/tools/awstats_updateall.pl now >/dev/null
Do not let hidden traffic eat your resources. Deploy a CoolVDS instance today, get root access, and see what is actually happening on your network.