Stop Trusting JavaScript: Why Server-Side Log Analysis is Your Only Source of Truth
If I had a krone for every time a marketing manager complained that Google Analytics numbers didn't match the server load, I'd own a datacenter in Oslo by now. Here is the hard truth: Client-side tracking is a liar.
It misses users with NoScript, it misses bots scraping your content, and it definitely misses that rogue script hotlinking your images and draining your bandwidth. As systems administrators, we don't deal in "sessions" or "bounce rates." We deal in raw HTTP requests, status codes, and bytes transferred. We deal with reality.
In 2009, with the explosion of botnets and scrapers, relying solely on JavaScript tags is administrative negligence. You need to parse your raw Apache logs. For that, AWStats remains the heavy lifter of choice. But be warned: analyzing gigabytes of logs requires real iron, not cheap shared hosting.
The Gap Between Analytics and Reality
I recently audited a Magento installation for a client in Trondheim. Their analytics dashboard showed a calm 2,000 visits a day. Yet, the server load averages were spiking above 5.0, and httpd processes were maxing out RAM.
We fired up the logs:
tail -f /var/log/httpd/access_log
The screen turned into a blur. A Chinese botnet was hammering the search function. Google Analytics saw none of this because bots don't execute JavaScript. The client was paying for bandwidth they weren't tracking. This is where server-side analysis becomes critical for both security and TCO (Total Cost of Ownership).
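Watching a live tail tells you something is wrong, but not who is responsible. A minimal sketch of the next step, assuming the log uses the standard combined format (client address in the first field), is to count requests per client. The function name top_talkers is made up for this illustration.

```shell
# top_talkers LOGFILE [N]
# Prints the N busiest client addresses in an access log, busiest first.
# Assumes the client address is the first whitespace-separated field,
# as it is in Apache's common and combined log formats.
top_talkers() {
    log=$1
    n=${2:-10}   # default to the top 10 when no count is given
    awk '{ print $1 }' "$log" | sort | uniq -c | sort -rn | head -n "$n"
}

# Example: top_talkers /var/log/httpd/access_log 10
```

Each output line is a request count followed by the address. If one address or one network block dominates the list, you have found your candidate for a firewall rule.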
Deploying AWStats on CentOS 5
AWStats (Advanced Web Statistics) parses your log files and generates static HTML reports. It is written in Perl, which means it is powerful but resource-intensive. Do not try to run this on an overloaded shared server; the process will likely be killed for excessive CPU usage. This is a classic use case for a dedicated VPS instance in Norway where you have guaranteed CPU cycles.
1. Installation
On a standard CentOS 5 box with the EPEL repository enabled:
yum install awstats
2. Configuration
The magic happens in /etc/awstats/awstats.yourdomain.conf. You need to tell AWStats exactly how Apache is writing logs. If you are using the standard "Combined" format in Apache, your config should look like this:
LogFile="/var/log/httpd/access_log"
LogType=W
LogFormat=1
SiteDomain="www.yourdomain.no"
HostAliases="yourdomain.no www.yourdomain.no 127.0.0.1"
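Once that file is saved, AWStats still has to be told to parse the log. A hedged sketch follows: awstats.pl takes the config name (the part of the file name between "awstats." and ".conf") through -config, and -update tells it to process new log lines. The script path below is an assumption based on the EPEL package layout, and both helper function names are invented for this example.

```shell
# Hypothetical helper: derive the AWStats config name from a config file path,
# e.g. /etc/awstats/awstats.yourdomain.conf -> yourdomain
awstats_config_name() {
    base=$(basename "$1" .conf)   # strips the directory and the .conf suffix
    echo "${base#awstats.}"       # strips the leading "awstats." prefix
}

# Assumed script location (EPEL package layout); adjust if yours differs.
AWSTATS_PL=/usr/share/awstats/wwwroot/cgi-bin/awstats.pl

# update_stats CONF_FILE: parse new log lines into the AWStats database.
update_stats() {
    perl "$AWSTATS_PL" -config="$(awstats_config_name "$1")" -update
}

# Example: update_stats /etc/awstats/awstats.yourdomain.conf
```

Run it once by hand before you hand it to cron. If LogFormat does not match what Apache actually writes, AWStats will complain about dropped records on the first run, which is a lot easier to debug interactively.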
Pro Tip: If you are rotating logs nightly (logrotate), make sure AWStats runs before the rotation, or configure it to read the archived logs. Losing 24 hours of data because of a cron job race condition is a rookie mistake.
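One way to take the race out of the picture is to stop scheduling the two jobs separately and run them from a single script, in order. The sketch below is only an illustration of that idea: the function name is invented, and the commands in the usage comment assume a CentOS-style layout.

```shell
# stats_then_rotate UPDATE_CMD ROTATE_CMD
# Runs the statistics update first and rotates only if the update succeeded,
# so rotation cannot overtake a slow or failed update.
stats_then_rotate() {
    update_cmd=$1
    rotate_cmd=$2
    # Both variables are left unquoted on purpose so the shell splits
    # each one into a command and its arguments.
    if $update_cmd; then
        $rotate_cmd
    else
        echo "stats update failed; leaving logs unrotated" >&2
        return 1
    fi
}

# Example (paths are assumptions, adjust to your system):
# stats_then_rotate \
#   "perl /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=yourdomain -update" \
#   "/usr/sbin/logrotate /etc/logrotate.conf"
```

This only helps if the wrapper is the one thing that triggers rotation. If the stock daily logrotate job is still in place, it will rotate on its own schedule regardless.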
3. The Performance Impact
Parsing a 2 GB log file takes time. Perl will eat a single CPU core alive during the process. On CoolVDS managed hosting plans, we use Xen virtualization. Unlike the OpenVZ containers used by budget providers, Xen gives you a fixed memory allocation and dedicated swap.
When AWStats fires up at 04:00 via cron, you want high-speed storage I/O to read that log file quickly. While many hosts are still spinning 7,200 RPM SATA drives, serious setups should look for enterprise RAID-10 with 15,000 RPM SAS drives or the emerging enterprise SSDs to keep I/O wait times low.
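Good hardware aside, it does no harm to ask the kernel to schedule the parsing job behind everything else. A minimal sketch, with an invented function name: it always lowers CPU priority with nice, and adds the idle I/O class only when ionice is present and usable. Whether the idle class has any effect depends on the I/O scheduler in use, so treat that part as an assumption to verify on your own box.

```shell
# run_low_priority CMD [ARGS...]
# Runs CMD at the lowest CPU priority. If ionice exists and the kernel
# accepts the idle class (-c3), the command also gets idle I/O priority.
run_low_priority() {
    if command -v ionice >/dev/null 2>&1 && ionice -c3 true 2>/dev/null; then
        ionice -c3 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}

# Example (script path and config name are assumptions):
# run_low_priority perl /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=yourdomain -update
```

This does not make the parse any faster. It only makes sure that httpd and your database win when they compete with it for the CPU.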
Data Privacy in Norway (Datatilsynet)
Here is the nuance many non-European hosts miss. An IP address is considered Personal Data under Norwegian law (Personopplysningsloven). When you store raw Apache logs, you are processing personal data.
If you host this data on a server in the US, you are relying on Safe Harbor frameworks, which legal experts are already scrutinizing. Keeping your VPS and your logs physically located in Oslo helps on two fronts: latency to the NIX (Norwegian Internet Exchange) drops to under 2ms, and the data stays under Norwegian jurisdiction, which makes it far simpler to meet the requirements of Datatilsynet without complex legal gymnastics.
Securing the Reporting Interface
By default, AWStats puts its CGI scripts in `cgi-bin`. Do not leave this open to the world. It reveals your traffic patterns, your directory structure, and your OS version to competitors. Lock it down in your Apache config:
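<Directory "/usr/share/awstats/wwwroot">
    # Restrict by source address first
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
    # Your office IP (placeholder). Keep comments on their own line:
    # Apache does not support trailing comments after a directive.
    Allow from 84.211.x.x
    # With the default "Satisfy All", a login is required in addition to the IP match
    AuthType Basic
    AuthName "Admin Access"
    AuthUserFile /etc/httpd/.htpasswd
    Require valid-user
</Directory>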
Conclusion
Google Analytics is for the marketing team. AWStats is for the engineers. It tells you who is stealing your bandwidth, which 404 errors are frustrating your users, and exactly how much load your server is actually handling.
But remember: log parsing is I/O heavy. Don't let a statistics script slow down your production database. Isolate your workloads on a proper CoolVDS instance with guaranteed resources and low-latency connectivity to the Nordic backbone.
Ready to take control of your infrastructure? Deploy a CoolVDS Xen instance in Oslo today and stop guessing what's happening on your server.