Stop Grepping Blindly: The Sysadmin’s Guide to AWStats & Log Intelligence

If I see one more junior admin trying to debug a traffic spike by staring at a scrolling tail -f /var/log/httpd/access_log, I might just confiscate their root password. Don't get me wrong: grep and awk are foundational tools. But when you are managing a high-traffic e-commerce site targeting the Norwegian market, raw text won't tell you whether your conversion drop is due to a 404 error on the checkout page or a latency issue routing through NIX (Norwegian Internet Exchange).

You need historical trends. You need user agent breakdowns. You need AWStats.

In this guide, we are going to set up AWStats on a standard LAMP stack, tune it so the Perl scripts don't hammer your CPU, and discuss the specific privacy implications of logging user IP addresses here in Norway.

The "I/O Wait" Nightmare

I recently audited a server for a media client in Oslo. They were complaining that their site crawled every night at 04:00. They blamed the backup script. I looked at `top`. The culprit wasn't the backup; it was awstats.pl trying to parse 4 gigabytes of log files on a single SATA drive. The CPU was barely doing any work: the system was stuck in iowait hell.

Log analysis is heavy on disk reads. If your Virtual Private Server (VPS) is sitting on oversubscribed storage, parsing logs will kill your web server's performance. This is why architecture matters. At CoolVDS, we utilize high-performance RAID-10 SAS and emerging Enterprise SSD tiers, which means high IOPS (Input/Output Operations Per Second). You can crunch a 2GB log file in seconds, not minutes.

Installation: CentOS 5 & Debian Lenny

Let's get this running. I'm assuming you have Apache 2.2 installed.

For CentOS 5 (via EPEL)

rpm -Uvh http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm
yum install awstats

For Debian 5 (Lenny)

apt-get update
apt-get install awstats

Configuration: The Critical Flags

The default config is garbage for high-performance environments. Open your config file, usually located at /etc/awstats/awstats.yourdomain.no.conf. The part between "awstats." and ".conf" is the name you pass to -config= on the command line.

1. DNS Lookup (The Performance Killer)
By default, AWStats might try to reverse resolve every IP address to a hostname. Turn this OFF immediately. It creates massive latency and DNS traffic.

DNSLookup=0

2. Incremental Updates
Do not parse the whole log file every time. Use the history file to only parse new lines.

LogFile="/var/log/httpd/access_log"
LogType=W
LogFormat=1
SiteDomain="www.yourdomain.no"
HostAliases="yourdomain.no www.yourdomain.no 127.0.0.1"
DirData="/var/lib/awstats"
DirCgi="/awstats"
DirIcons="/awstatsicons"
AllowToUpdateStatsFromBrowser=0
WarningMessages=1
SaveDatabase=1

Pro Tip: Never set AllowToUpdateStatsFromBrowser=1 on a public-facing server unless you want bots triggering the update script and DoS-ing your site. Run the update via a cron job instead.
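
Before handing the config over to cron, it can be worth sanity-checking the two directives that matter most here. The following is a minimal sketch, not something that ships with AWStats: conf_get and check_awstats_conf are hypothetical helper names, and the sketch assumes one directive per line with no trailing comment. If a directive appears more than once, this sketch simply takes the last occurrence.

```shell
#!/bin/bash
# Hypothetical helpers (not part of AWStats): read a directive from an
# AWStats config file and warn about settings this guide advises against.

# conf_get FILE DIRECTIVE
# Print the value of DIRECTIVE, with surrounding double quotes removed.
# Commented-out lines are ignored; the last matching line wins.
conf_get() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" \
        | tail -n 1 \
        | sed -e 's/[[:space:]]*$//' -e 's/^"\(.*\)"$/\1/'
}

# check_awstats_conf FILE
# Return 0 if DNSLookup and AllowToUpdateStatsFromBrowser are both 0,
# otherwise print a warning per offending directive and return 1.
check_awstats_conf() {
    local conf="$1" status=0
    if [ "$(conf_get "$conf" DNSLookup)" != "0" ]; then
        echo "WARN: DNSLookup is not 0 in $conf"
        status=1
    fi
    if [ "$(conf_get "$conf" AllowToUpdateStatsFromBrowser)" != "0" ]; then
        echo "WARN: AllowToUpdateStatsFromBrowser is not 0 in $conf"
        status=1
    fi
    return "$status"
}
```

Source the file and call check_awstats_conf with the path to your own config; a silent return with exit status 0 means both settings match the recommendations above.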

Automating with Cron

Set up a cron job to update stats every hour. Edit your crontab with crontab -e:

0 * * * * /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=yourdomain.no -update > /dev/null

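This keeps your reports fresh without manual intervention. The path above is where the EPEL package installs awstats.pl; the Debian package puts it at /usr/lib/cgi-bin/awstats.pl, so adjust the cron line accordingly.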
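
On a busy box, an hourly update can still overlap with the previous run or compete with Apache for the disk. One way to soften that is a small wrapper, sketched below under a few assumptions: run_polite is a hypothetical name, flock(1) from util-linux is installed, and the lock file path in the example is arbitrary. The ionice idle class only has an effect with I/O schedulers that honor priorities (CFQ, for instance), so the sketch falls back to plain nice when ionice is not usable.

```shell
#!/bin/bash
# Hypothetical cron wrapper: run a command at the lowest CPU priority (and,
# where usable, idle I/O priority), and skip the run entirely if a previous
# one still holds the lock.

# run_polite LOCKFILE COMMAND [ARGS...]
run_polite() {
    local lockfile="$1"
    shift
    # flock -n: give up immediately instead of queueing behind a running job.
    if ionice -c 3 true >/dev/null 2>&1; then
        # ionice class 3 ("idle"): disk time only when nobody else wants it.
        flock -n "$lockfile" ionice -c 3 nice -n 19 "$@"
    else
        flock -n "$lockfile" nice -n 19 "$@"
    fi
}

# Example, using the same command as the cron entry above
# (the lock file location is an assumption):
# run_polite /var/lock/awstats-update.lock \
#     /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=yourdomain.no -update > /dev/null
```

If the lock is already held, flock exits with a non-zero status and the update is skipped until the next cron tick, which is usually what you want for an incremental job.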

Norwegian Privacy Context: Datatilsynet & IP Addresses

Here in Norway, the Data Inspectorate (Datatilsynet) takes the Personopplysningsloven (Personal Data Act) very seriously. An IP address can be considered Personally Identifiable Information (PII). Storing full IP addresses of users indefinitely without consent is a legal grey area that can get you in trouble.

If you don't strictly need the full IP for security auditing, you should mask it. AWStats has a plugin for this, but a better approach is often to handle it at the Apache level or via a post-processing script if you want to be 100% safe.

However, if you are hosting with CoolVDS, your data resides physically in Oslo. This aids significantly with compliance compared to hosting in the US under Safe Harbor, where data sovereignty is becoming a major headache for European CTOs.
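
If you go the post-processing route for masking, the idea can be sketched as a simple stream filter. This is an illustration rather than a compliance tool: mask_ips is a hypothetical name, it assumes the client address is the first field of each line (as in the combined format selected by LogFormat=1), and it only handles IPv4.

```shell
#!/bin/bash
# Hypothetical post-processing filter: zero the last octet of the client
# IPv4 address at the start of each access-log line. Reads stdin, writes
# stdout. Lines that do not start with an IPv4 address (IPv6 clients,
# resolved hostnames) pass through unchanged and need separate handling.
mask_ips() {
    sed -E 's/^([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.[0-9]{1,3} /\1.0 /'
}

# Example (file names are assumptions):
# mask_ips < access_log.1 > access_log.1.masked
```

Keep in mind that if AWStats reads the masked file, every visitor in the same /24 collapses into one address, so unique-visitor counts will come out lower than with full IPs.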

Securing the Interface

AWStats reports expose your internal directory structure and traffic patterns. Do not leave this open to the web. Use Apache's .htaccess to restrict access; this only takes effect if AllowOverride AuthConfig (or All) is enabled for that directory.

# /var/www/awstats/.htaccess
AuthName "Admin Access Only"
AuthType Basic
AuthUserFile /etc/awstats/.htpasswd
require valid-user

Create the password file:
htpasswd -c /etc/awstats/.htpasswd adminuser

Why Infrastructure Matters for Analysis

Log analysis is a resource-intensive task. It is "bursty" by nature. On cheap, oversold hosting (OpenVZ containers crammed onto a single drive), your `awstats.pl` process will fight for disk time with your neighbor's PHP scripts. This causes "CPU Steal" and high latency for your actual website visitors.

We built CoolVDS using Xen virtualization to ensure hard resource isolation. When you run a heavy Perl script to crunch your monthly logs, you get the dedicated RAM and disk throughput you paid for. No noisy neighbors. No excuses.

If you are tired of watching your load average spike every time you try to analyze your traffic, it is time for a serious upgrade.

Need a server that can handle heavy I/O without breaking a sweat? Deploy a CoolVDS Xen instance in Oslo today and stop guessing what your traffic looks like.
