Stop Trusting JavaScript: Why You Need Server-Side Analysis
If you are relying solely on Google Analytics to judge your server's health, you are flying blind. I've seen it a hundred times: a marketing manager claims traffic is down, but the load average on the server is spiking through the roof. Why? Because a JavaScript tag only fires when a browser loads the page and runs the script. It never fires for most bots, scrapers, hotlinked files, or users with NoScript installed.
The only source of truth is the raw access log. In the battle for uptime, AWStats remains the weapon of choice for serious systems administrators who need to parse gigabytes of Apache or Nginx data into actionable intelligence. But running it on an underpowered box is a recipe for disaster.
The Anatomy of a Log File
Before we install anything, look at your httpd.conf. If you are running a standard CentOS 5 stack, you are likely using the Combined Log Format. This is non-negotiable for meaningful analysis.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
This captures the Referer (who sent them) and the User-Agent (what browser or bot they are using). Without this, you cannot distinguish between a legitimate customer from Oslo and a scraper bot hammering your login page.
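To make those fields concrete, here is a minimal Python sketch that splits one combined-format line into named parts. The regular expression and the sample line (built from documentation-reserved addresses) are my own illustration, not anything AWStats ships, and the pattern assumes no escaped quotes inside the quoted fields.

```python
import re

# Pattern for the Apache "combined" LogFormat shown above.
# Assumption: quoted fields contain no escaped double quotes.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one combined-format line, or None."""
    match = COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # %b logs "-" when no response body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    fields["status"] = int(fields["status"])
    return fields


# Made-up sample line for illustration only.
SAMPLE = (
    '203.0.113.7 - - [10/Oct/2010:13:55:36 +0200] '
    '"POST /login HTTP/1.1" 200 2326 '
    '"http://example.com/start" "Mozilla/5.0 (X11; Linux i686)"'
)

if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

With the referer and agent fields separated like this, a scraper that hits /login with an empty referer and a scripted User-Agent stands out immediately. In the common log format those two fields simply do not exist.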
Deploying AWStats on RHEL/CentOS
Don't compile from source unless you enjoy dependency hell. Use the EPEL repository.
yum install awstats
cd /etc/awstats
cp awstats.model.conf awstats.yourdomain.conf
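Open that configuration file. The most critical directive is LogFile. If you are rotating logs daily (and you should be, to save disk space), you need to point AWStats at the correct path. LogFile takes a single file name (date tags such as %YYYY-24 are supported) or a command ending in a pipe; it does not expand shell wildcards on its own. To parse a batch of archived logs, pipe them through the bundled logresolvemerge.pl script, which merges several files into one time-ordered stream.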
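The reason archived logs need care is ordering: AWStats expects records oldest first and skips anything older than what it has already processed. As a rough illustration of what a merge step has to do, here is a hedged Python sketch that walks rotated files (plain or gzipped) in order of modification time. Ordering by mtime is my simplifying assumption and the function names are hypothetical; a real merge tool sorts on the timestamps inside the records.

```python
import glob
import gzip
import os


def rotated_logs(pattern):
    """Return files matching the glob pattern, oldest modification time first."""
    return sorted(glob.glob(pattern), key=os.path.getmtime)


def read_lines(path):
    """Yield text lines from a plain or gzip-compressed log file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line


def merged_lines(pattern):
    """Yield every line from every matching file, oldest file first."""
    for path in rotated_logs(pattern):
        for line in read_lines(path):
            yield line
```

Calling merged_lines("/var/log/httpd/access_log*") would stream the whole rotation set as one sequence; that path is only an example of a typical layout, so adjust it to wherever your logs actually live.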
Pro Tip: Never run log analysis on the live request thread. I once saw a junior admin hook a log parser into a CGI script. It took the site down in 10 minutes. Run your updates via crontab at 3 AM local time.
The "War Story": The Invisible Bandwidth Thief
Last month, a client hosting a media-heavy site on a generic shared host complained about sluggish performance. Their "analytics" showed 500 visitors a day. The server logs told a different story.
I fired up AWStats and sorted by "FileType". A specific .wmv video file was consuming 80% of their bandwidth. Digging into the "Connect to site from" section, we found a forum in Eastern Europe was hotlinking the video directly. We blocked the referer in .htaccess and the load average dropped from 4.0 to 0.2 instantly. JavaScript analytics never saw those requests, because a hotlinked file is fetched without ever loading the page that carries the tag. AWStats did.
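If you want to cross-check that kind of finding by hand, the same question can be asked of the raw log directly: for one file type, which external sites are the bytes going to? The Python sketch below is my own illustration, not how AWStats computes its report. The host names and sample lines are invented, and own_host is a hypothetical parameter for your own domain.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Pulls path, response size and referer out of a combined-format line.
LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} (?P<size>\S+) "(?P<referer>[^"]*)"'
)


def bytes_by_referer(lines, extension, own_host):
    """Sum response bytes per external referring host for one file extension."""
    totals = Counter()
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue
        path = match.group("path").split("?", 1)[0]
        if not path.lower().endswith(extension):
            continue
        host = urlparse(match.group("referer")).hostname or "(no referer)"
        if host == own_host:
            continue  # our own pages embedding the file are not hotlinks
        size = match.group("size")
        totals[host] += 0 if size == "-" else int(size)
    return totals


# Invented sample lines for illustration only.
SAMPLE = [
    '198.51.100.1 - - [01/Mar/2010:10:00:00 +0100] "GET /media/clip.wmv HTTP/1.1" '
    '200 5000000 "http://forum.example.net/thread/42" "Mozilla/4.0"',
    '198.51.100.2 - - [01/Mar/2010:10:00:05 +0100] "GET /media/clip.wmv HTTP/1.1" '
    '200 5000000 "http://forum.example.net/thread/42" "Mozilla/4.0"',
    '198.51.100.3 - - [01/Mar/2010:10:00:09 +0100] "GET /media/clip.wmv HTTP/1.1" '
    '200 5000000 "http://www.example.com/videos" "Mozilla/4.0"',
    '198.51.100.4 - - [01/Mar/2010:10:00:12 +0100] "GET /index.html HTTP/1.1" '
    '200 1200 "-" "Mozilla/4.0"',
]

if __name__ == "__main__":
    for host, total in bytes_by_referer(SAMPLE, ".wmv", "www.example.com").most_common():
        print(host, total)
```

Any host near the top of that list that is not yours is a candidate for a referer block.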
Privacy and The Norwegian Context (Datatilsynet)
Operating in Norway means respecting the Personopplysningsloven (Personal Data Act). While IP addresses are technical data, the Data Protection Authority (Datatilsynet) has strict views on storing user data indefinitely.
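AWStats can work from logs in which the last byte of each IP address has been zeroed. This maintains enough granularity for geo-location (identifying if traffic is coming from Telenor or NextGenTel subnets) without storing the exact identity of the user. Note that the masking has to happen before AWStats reads the log, for example in a filter command piped into LogFile. The stock geoipfree plugin below only adds country lookup; it does not anonymize anything by itself.
# Plugin to enable country lookup by IP (geolocation only, no anonymization)
LoadPlugin="geoipfree"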
Compliance isn't just for lawyers; it's part of systems architecture.
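Masking the last byte is simple enough to sketch outside AWStats. The Python below is a minimal illustration under my own assumptions: it zeroes the final octet of an IPv4 client address in the first field of each line and leaves hostnames and IPv6 addresses untouched, which a production filter would also have to handle. The function names are hypothetical.

```python
import ipaddress
import re

# The client address is the first whitespace-delimited field of the line.
LEADING_FIELD = re.compile(r'^(\S+)')


def mask_ip(text):
    """Zero the last octet of an IPv4 address; anything else passes through."""
    try:
        addr = ipaddress.IPv4Address(text)
    except ValueError:
        return text  # hostname or IPv6: not handled in this sketch
    return str(ipaddress.IPv4Address(int(addr) & 0xFFFFFF00))


def anonymize_line(line):
    """Return the log line with its client address masked."""
    return LEADING_FIELD.sub(lambda m: mask_ip(m.group(1)), line, count=1)


if __name__ == "__main__":
    # Made-up line using a documentation-reserved address.
    print(anonymize_line('203.0.113.77 - - [10/Oct/2010:13:55:36 +0200] '
                         '"GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"'))
```

Run over the log before it is archived or parsed, a filter like this keeps the /24 network visible for geo-location while the individual host is gone from everything stored downstream.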
The Hardware Reality: Why Virtualization Matters
Here is the ugly truth about log analysis: it is I/O intensive. Reading a 2GB log file requires high sustained read speeds. On a cheap shared hosting plan or an oversold OpenVZ container, your I/O wait will skyrocket while the disk thrashes, and on an oversold hypervisor "steal time" (CPU cycles handed to other guests) climbs with it. This makes your actual website slow while you are just trying to read the logs.
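You can put numbers on that pain. On Linux the aggregate "cpu" line in /proc/stat carries cumulative counters for I/O wait and steal, so two samples taken a few seconds apart show where the time went. The sketch below is a rough illustration with invented sample values and hypothetical function names; it assumes the standard column order of user, nice, system, idle, iowait, irq, softirq, steal.

```python
# Column order of the aggregate "cpu" line in /proc/stat (first eight fields).
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(stat_text):
    """Return the aggregate cpu counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Older kernels report fewer columns; treat missing ones as 0.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate cpu line found")


def wait_and_steal(before, after):
    """Percent of elapsed CPU time spent in iowait and in steal."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        return 0.0, 0.0
    return 100.0 * delta["iowait"] / total, 100.0 * delta["steal"] / total


# Invented samples for illustration; on a live box read /proc/stat twice.
SAMPLE_BEFORE = "cpu  100 0 50 800 40 5 5 0 0 0\ncpu0 100 0 50 800 40 5 5 0 0 0"
SAMPLE_AFTER = "cpu  110 0 60 820 90 5 5 10 0 0\ncpu0 110 0 60 820 90 5 5 10 0 0"

if __name__ == "__main__":
    iowait, steal = wait_and_steal(parse_cpu_line(SAMPLE_BEFORE),
                                   parse_cpu_line(SAMPLE_AFTER))
    print("iowait %.1f%%, steal %.1f%%" % (iowait, steal))
```

Take one sample before the AWStats update starts and one while it runs. High iowait points at the disk; high steal points at the neighbors.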
This is where CoolVDS takes a different approach. We utilize Xen HVM virtualization. This means your RAM is reserved, and your I/O throughput isn't fighting with 500 other neighbors. When you grep a 5GB log file on a CoolVDS instance, you get the raw power of the underlying RAID-10 SAS arrays.
Performance Comparison: Log Parsing
| Environment | 1GB Log Parse Time | Impact on Web Server |
|---|---|---|
| Shared Hosting | Timeout / Failed | High Latency |
| Cheap VPS (OpenVZ) | 45 Seconds | Moderate Jitter |
| CoolVDS (Xen HVM) | 12 Seconds | Zero Impact |
Automating the Process
Once configured, set it and forget it. Add this to your cron:
0 3 * * * /usr/bin/perl /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=yourdomain -update > /dev/null
By morning, you'll have a fresh report waiting. You will see the bots, the 404 errors, and the bandwidth drains that Google Analytics misses.
If you are tired of wondering why your server is slow, stop guessing. Get raw access to your resources and your logs. Deploy a CoolVDS instance today and see what's actually hitting your network interface.