Stop Grepping Blindly: Visualizing Server Traffic with AWStats on CentOS 5

It is 3:00 AM. Your load average is spiking, and top shows Apache eating CPU cycles like a starving animal. You suspect a botnet or a scrape script gone rogue, but staring at tail -f /var/log/httpd/access_log is like trying to read the Matrix code without the red pill. You need aggregate data, and you need it five minutes ago.

While Google Analytics is fine for marketing teams, it lies to sysadmins. It relies on JavaScript execution. It misses the bots, the hotlinkers, and the 404 errors that are actually grinding your disk to a halt. This is where AWStats (Advanced Web Statistics) comes in. It parses the raw server logs, giving you the truth, the whole truth, and nothing but the truth.

I recently audited a high-traffic e-commerce site based in Oslo. They were suffering from phantom slowdowns. Marketing said traffic was normal; the logs said otherwise. We deployed AWStats and found a scraper from a non-EU IP hitting their search function 50 times per second. We blocked the IP range in iptables, and the load dropped instantly. Here is how to set it up correctly, specifically for a CentOS environment.
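Before AWStats is even installed, a quick tally of requests per client address can confirm a suspicion like the one above. The following is a minimal sketch, not the exact procedure used in that audit. It assumes the client address is the first field of each log line, as in Apache's common and combined formats. The function name top_ips is made up for illustration.

```shell
#!/bin/sh
# top_ips LOGFILE [COUNT]
# Print the COUNT busiest client addresses in an Apache access log,
# one per line as "<hits> <ip>", busiest first (ties sorted by address).
# Assumes the client address is the first whitespace-separated field.
top_ips() {
    logfile=$1
    count=${2:-10}
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' "$logfile" |
        sort -k1,1nr -k2,2 |
        head -n "$count"
}
```

Run it as top_ips /var/log/httpd/access_log 5. If one range dominates the list, a rule such as iptables -I INPUT -s 203.0.113.0/24 -j DROP blocks it (203.0.113.0/24 is a documentation placeholder, not a real offender).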

The Prerequisites

We are assuming you are running a standard LAMP stack on CentOS 5 or 6. While AWStats is written in Perl, don't let that scare you—it is battle-tested and efficient, provided your I/O subsystem isn't made of wood.

Pro Tip: Log analysis is I/O intensive. If you are parsing gigabytes of logs on a cheap shared host with over-provisioned SATA drives, your server will choke. This is why we build CoolVDS instances on RAID-10 SAS arrays with dedicated I/O throughput. We don't steal your IOPS.

Step 1: Installation

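First, enable the EPEL repository if you haven't already. The standard repositories are often too conservative. The release package below is the EPEL 5 one; on CentOS 6, install the EPEL 6 release package instead.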

rpm -Uvh http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
yum install awstats

Step 2: Configuration for Accuracy

The default configuration is rarely enough. You need to map it to your specific domain logs. Copy the model file:

cp /etc/awstats/awstats.model.conf /etc/awstats/awstats.yourdomain.com.conf
vi /etc/awstats/awstats.yourdomain.com.conf

Change these critical lines:

  • LogFile="/var/log/httpd/access_log" (Or wherever your VHOST logs live)
  • SiteDomain="yourdomain.com"
  • DNSLookup=1 (Warning: This slows down processing significantly. Only enable if you have local caching DNS or fast upstream resolvers. On CoolVDS, our local resolvers in the Oslo datacenter handle this latency efficiently.)
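If you manage more than a couple of virtual hosts, editing each copy by hand gets old fast. Here is a rough sketch of the same three changes done with sed. It assumes the model file has uncommented lines beginning with LogFile=, SiteDomain= and DNSLookup=, and that neither the domain nor the log path contains a | or & character. The function name make_awstats_conf is hypothetical.

```shell
#!/bin/sh
# make_awstats_conf MODEL DOMAIN LOGFILE
# Write an AWStats config for DOMAIN to stdout, derived from MODEL.
# Only uncommented lines starting with the three directive names are
# rewritten; everything else in the model is passed through untouched.
make_awstats_conf() {
    model=$1
    domain=$2
    logfile=$3
    sed -e "s|^LogFile=.*|LogFile=\"$logfile\"|" \
        -e "s|^SiteDomain=.*|SiteDomain=\"$domain\"|" \
        -e "s|^DNSLookup=.*|DNSLookup=1|" \
        "$model"
}
```

For example: make_awstats_conf /etc/awstats/awstats.model.conf yourdomain.com /var/log/httpd/access_log > /etc/awstats/awstats.yourdomain.com.conf. Review the generated file before you rely on it, since the model file may differ between AWStats versions.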

Step 3: Security and The Norwegian Context

By default, AWStats puts its CGI scripts in a public folder. Do not leave this open. You are exposing internal traffic patterns. In Norway, Datatilsynet (The Data Inspectorate) takes a dim view of leaking IP addresses, which are considered personal data under the Personal Data Act (Personopplysningsloven).

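Lock it down in your Apache config. With Apache's default of Satisfy All, a client must both connect from an allowed address and present a valid username and password: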

<Directory "/usr/share/awstats/wwwroot">
    Options None
    AllowOverride None
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1 10.0.0.0/8
    AuthType Basic
    AuthName "AWStats Access"
    AuthUserFile /etc/awstats/htpasswd
    Require valid-user
</Directory>
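The AuthUserFile referenced in that block has to exist before anyone can log in. A minimal sketch for creating it follows. It assumes Apache runs under the apache group, as it does on a stock CentOS install. The function name and the admin user in the usage note are made up for illustration.

```shell
#!/bin/sh
# create_awstats_htpasswd FILE USER [GROUP]
# Create a Basic-auth password file containing USER.
# htpasswd prompts interactively for the password.
# GROUP defaults to "apache" (assumption: the stock CentOS httpd group).
create_awstats_htpasswd() {
    file=$1
    user=$2
    group=${3:-apache}
    # -c creates the file, overwriting any existing one.
    htpasswd -c "$file" "$user" || return 1
    # Let the web server read the hashes, but not other local users.
    chgrp "$group" "$file" || return 1
    chmod 640 "$file"
}
```

Run it as root with create_awstats_htpasswd /etc/awstats/htpasswd admin, then check the syntax with apachectl configtest and apply the change with service httpd reload.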

Step 4: Automation

You do not want to run the update script manually. However, running it every hour on a massive log file can spike your CPU. The balance is a cron job that runs during low-traffic hours.

0 3 * * * /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=yourdomain.com -update > /dev/null
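With several sites on one box, a single cron entry per domain becomes tedious. One possible approach is a small wrapper that derives the -config names from the files in /etc/awstats and updates each in turn. This is a sketch under assumptions: the function names and the wrapper itself are hypothetical, and it expects every config to follow the awstats.<name>.conf naming used above.

```shell
#!/bin/sh
# awstats_config_names DIR
# Print one -config name per awstats.<name>.conf file in DIR,
# skipping the awstats.model.conf template.
awstats_config_names() {
    for conf in "$1"/awstats.*.conf; do
        [ -f "$conf" ] || continue       # glob matched nothing
        name=${conf##*/awstats.}         # strip directory and "awstats."
        name=${name%.conf}               # strip trailing ".conf"
        [ "$name" = "model" ] && continue
        printf '%s\n' "$name"
    done
}

# update_all_awstats DIR
# Run the AWStats update once for every config found in DIR.
update_all_awstats() {
    awstats_config_names "$1" | while read -r name; do
        /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config="$name" -update > /dev/null
    done
}
```

Saved as a script that ends with update_all_awstats /etc/awstats, it can take the place of the per-domain command in the cron line above. The sites are processed one after another, so the I/O load is spread out rather than stacked.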

Why Infrastructure Matters

Parsing logs is a sequential read operation. On traditional VPS platforms using OpenVZ with oversold resources, the "steal time" (CPU time stolen by the hypervisor for other tenants) can cause your stats generation to hang, leaving you with gaps in your data.

At CoolVDS, we utilize KVM virtualization. This ensures that the RAM and CPU cycles you pay for are reserved for your log parsing, not your neighbor's WordPress plugin. Furthermore, our datacenter in Oslo ensures that if you are processing Norwegian user data, it stays within national borders, simplifying your compliance with local privacy laws.

Final Thoughts

Logs are the black box of your server. Without a tool like AWStats, you are flying blind. But remember, a tool is only as fast as the hardware it runs on. If you are tired of waiting 20 minutes for a log report to generate, it might be time to upgrade to a platform designed for heavy lifting.

Need consistent I/O for your data analysis? Deploy a KVM instance on CoolVDS today and stop fighting for resources.
