The Map Is Not The Territory: Why You Need Server-Side Analytics
If you are relying solely on Google Analytics or any other JavaScript-based tracker to make business decisions, you are flying blind. I recently audited a high-traffic e-commerce client in Oslo who thought their traffic was down 15%. It wasn't. Their new jQuery slider was conflicting with the analytics script in older browsers. The customers were there; the tracking wasn't.
As systems administrators, we trust the kernel, not the client. The Apache access_log is the ultimate source of truth. It captures everything: the Googlebot crawling your sitemap, the script kiddie probing for SQL injections, and the user with JavaScript disabled. This is where AWStats enters the fight.
The Setup: AWStats 7.0 on CentOS
Getting AWStats running on a RHEL or CentOS box isn't difficult, but configuring it for accuracy requires attention to detail. We aren't just looking for page views; we are looking for bandwidth hogs and 404 errors that destroy SEO.
First, ensure you have the EPEL repository enabled, then grab the package:
yum install awstats
The magic happens in the configuration file, typically found at /etc/awstats/awstats.yourdomain.conf. The default config is safe, but "safe" doesn't give us the granularity we need. Here is the configuration I deploy for high-traffic nodes:
LogFile="/var/log/httpd/access_log"
LogType=W
LogFormat=1
DNSLookup=0
Pro Tip: Always set DNSLookup=0. Doing reverse DNS lookups for every host in a busy log will slow processing to a crawl and likely get your IP throttled by upstream DNS providers. Resolve IPs offline if you absolutely must know the hostnames.
The I/O Bottleneck: Why Hardware Matters
Here is the painful truth about log analysis that most hosting providers won't tell you: It is an I/O nightmare.
When you run awstats_updateall.pl, the script has to read massive text files, parse them line-by-line, and write the results to AWStats' own data files on disk. On a standard shared hosting environment with oversubscribed SATA drives, this process causes "I/O Wait" (iowait) to skyrocket. Your web server is still appending log entries, AWStats is reading the same file, and suddenly your MySQL queries stall because the disk queue is full.
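Whatever hardware you are on, you can at least ask the kernel to put the update at the back of the queue. Here is a minimal sketch, assuming a single site config and the path where the EPEL package usually puts awstats.pl; both AWSTATS_PL and the DRY_RUN switch are my own names, so adjust them to your box. Note that the idle I/O class is only honoured by the CFQ scheduler.

```shell
# Sketch: run one AWStats update at the lowest CPU and I/O priority.
# AWSTATS_PL is an assumed install path; check where your package puts awstats.pl.
AWSTATS_PL="${AWSTATS_PL:-/usr/share/awstats/wwwroot/cgi-bin/awstats.pl}"

awstats_update_lowprio() {
    config="$1"   # the "yourdomain" part of awstats.yourdomain.conf
    # ionice -c3 = idle I/O class, nice -n 19 = lowest CPU priority
    set -- ionice -c3 nice -n 19 perl "$AWSTATS_PL" -config="$config" -update
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"   # hypothetical dry-run switch: print the command only
    else
        "$@"
    fi
}
```

Call it from cron as awstats_update_lowprio yourdomain. It won't make slow disks fast, but it keeps the parser from starving your web and database processes.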
I've seen servers in data centers utilizing older virtualization tech (like Virtuozzo) grind to a halt during nightly log rotation. This is why architecture matters.
At CoolVDS, we utilize Xen paravirtualization. Unlike container-based hosting where you fight neighbors for kernel resources, Xen gives you a dedicated slice of RAM and strict I/O scheduling. Furthermore, our transition to high-performance enterprise storage means you can parse a 2GB log file without your site's Time-To-First-Byte (TTFB) falling off a cliff. If you are serious about data, you cannot run on spinning rust shared with 500 other tenants.
Norwegian Privacy Context: Datatilsynet Compliance
Operating in Norway means respecting the Personopplysningsloven (Personal Data Act). The Data Inspectorate (Datatilsynet) has been increasingly vocal about the storage of IP addresses. An IP address can be considered personally identifiable information (PII).
To stay compliant while keeping your data useful, use the SkipHosts parameter to exclude internal IPs from your reports (HostAliases only lists the alternate names of your own site), and truncate the last octet of visitor IP addresses before AWStats reads the log if you don't need forensic-level user tracking. It’s a small change that keeps your legal liability low.
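One way to truncate the last octet is a small filter run over the log before AWStats ever sees it. The sketch below assumes the client IPv4 address is the first field on each line, as it is in the combined format (LogFormat=1). The function name is mine, and IPv6 addresses pass through untouched.

```shell
# Sketch: zero the last octet of the leading IPv4 address on each log line.
# Reads log lines on stdin, writes the anonymized lines to stdout.
anonymize_ipv4() {
    # Basic regex on purpose, so it also works with older sed builds.
    sed 's/^\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\)\.[0-9]\{1,3\} /\1.0 /'
}
```

For example, anonymize_ipv4 < /var/log/httpd/access_log > /tmp/access_log.anon writes a copy you can point AWStats at. AWStats also documents a LogFile form ending in a pipe character that reads from a command's output; wiring the filter in that way is something to test on a copy of your logs first. Keep in mind that visitors sharing a /24 will collapse into one host in the reports.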
Analyzing the "Invisible" Traffic
Once your AWStats is crunching data on a high-speed CoolVDS instance, look immediately at the "Robots/Spiders visitors" section. This is data Google Analytics will never show you.
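If you want a quick sanity check against the raw log before trusting the report, a rough sketch like this counts hits per crawler user agent. It assumes the combined log format and a naive keyword match on "bot", "crawl" and "spider"; the function name is mine, and requests containing escaped double quotes will confuse the field split.

```shell
# Sketch: count hits per crawler-looking User-Agent in combined-format logs.
# Usage: top_bots /var/log/httpd/access_log
top_bots() {
    # Splitting on double quotes, field 6 is the User-Agent string.
    awk -F'"' 'tolower($6) ~ /bot|crawl|spider/ { hits[$6]++ }
               END { for (ua in hits) print hits[ua] "\t" ua }' "$@" | sort -rn
}
```

The numbers will not match AWStats exactly, since AWStats uses its own robot database rather than a keyword match, but a big gap between the two is worth investigating.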
In a recent deployment, we noticed a massive spike in bandwidth usage that didn't correlate with sales. AWStats revealed a rogue scraper from an offshore IP range hammering the catalog. We blocked the range in iptables within minutes. Without server-side logs, we would have just paid the overage bill and wondered why the server was sluggish.
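The same kind of check works straight from the shell when you need an answer before the next AWStats run. This is a sketch, not the exact commands from that incident: it assumes the combined log format, where the first field is the client IP and the tenth is the response size, and the function name and TOP variable are mine.

```shell
# Sketch: total bytes served per client IP, biggest consumers first.
# Usage: top_bandwidth /var/log/httpd/access_log      (TOP=20 for more rows)
top_bandwidth() {
    # $1 = client IP, $10 = response bytes ("-" when no body was sent, so skip it)
    awk '$10 ~ /^[0-9]+$/ { bytes[$1] += $10 }
         END { for (ip in bytes) printf "%.0f %s\n", bytes[ip], ip }' "$@" |
        sort -rn | head -n "${TOP:-10}"
}

# Once you have confirmed a range is abusive, a block might look like this
# (198.51.100.0/24 is a documentation range, not a real offender):
#   iptables -I INPUT -s 198.51.100.0/24 -j DROP
```

Malformed request lines can shift the fields, which is why the function only sums values that are purely numeric.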
Next Steps
Don't let your infrastructure dictate your visibility. If your current VPS chokes when you try to grep a log file, it's time to move.
Deploy a CentOS 6 instance on CoolVDS today. Experience the difference that dedicated resources and low-latency storage make for your systems administration tasks.