Stop Grepping Logs: Visualizing Traffic with AWStats on CentOS

There is a specific kind of headache reserved for system administrators who try to debug a traffic spike using tail -f /var/log/httpd/access_log. While raw logs are the ultimate source of truth, they are terrible for spotting trends. If you are running a high-traffic e-commerce site or a media portal here in Norway, you cannot rely on scrolling text to tell you if your bandwidth bill is about to explode.

You need visualization. In 2011, your best option isn't expensive SaaS—it's AWStats. It is free, robust, and Perl-based. But be warned: log parsing is an I/O killer. I have seen budget VPS instances freeze completely because a sysadmin tried to parse a 2GB log file on a shared disk array. Here is how to do it right, keeping your resources managed and your data strictly within Norwegian borders.

The Reality of Log Parsing and I/O Wait

Before we touch the config, let's talk hardware. AWStats works by reading your server logs line-by-line and building a database of statistics. This is a read-heavy operation.

If you are hosting on a legacy platform with oversold hard drives, running the update script will skyrocket your I/O Wait. Your web server (Apache or Nginx) will start queuing requests because the disk is too busy reading logs to serve static files. This is why we engineer CoolVDS with dedicated RAID 10 arrays and strictly limited density. We ensure that when you crunch numbers, your actual visitors don't stare at a loading screen.
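
Before blaming the parser, it helps to measure how much CPU time is actually lost to I/O wait while an update runs. The following is a minimal sketch, assuming a Linux host where the first line of /proc/stat lists the user, nice, system, idle and iowait counters in that order. The function name iowait_pct and the 5-second interval are illustrative choices, not part of AWStats.

```shell
#!/bin/sh
# iowait_pct BEFORE AFTER
# BEFORE and AFTER are two "cpu ..." summary lines taken from /proc/stat.
# Prints the share of CPU time spent waiting on I/O between the two samples.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # Fields: $1="cpu" $2=user $3=nice $4=system $5=idle $6=iowait
        # $7=irq $8=softirq $9=steal (guest columns are already part of user).
        NR == 1 { for (i = 2; i <= NF && i <= 9; i++) t1 += $i; w1 = $6 }
        NR == 2 { for (i = 2; i <= NF && i <= 9; i++) t2 += $i; w2 = $6 }
        END {
            if (t2 > t1) printf "%.1f\n", 100 * (w2 - w1) / (t2 - t1)
            else print "0.0"
        }'
}

# Sample the counters a few seconds apart (the interval is an arbitrary choice).
if [ -r /proc/stat ]; then
    before=$(grep '^cpu ' /proc/stat)
    sleep "${INTERVAL:-5}"
    after=$(grep '^cpu ' /proc/stat)
    echo "iowait during the sample: $(iowait_pct "$before" "$after")%"
fi
```

Run it once while the server is idle and once during an AWStats update. A large jump between the two suggests the disk, not the CPU, is the bottleneck.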

Step 1: Installing AWStats on CentOS 5/6

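We will assume you are running a standard LAMP stack (Linux, Apache, MySQL, PHP/Perl). First, ensure you have the EPEL repository enabled, as AWStats isn't in the base CentOS repos. The release package URL below is the one for CentOS 5 on i386; use the EPEL release package that matches your CentOS version and architecture.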

rpm -Uvh http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
yum install awstats

Once installed, the configuration files live in /etc/awstats/. You need to create a copy of the model file for your specific domain.

cp /etc/awstats/awstats.model.conf /etc/awstats/awstats.yourdomain.com.conf
vi /etc/awstats/awstats.yourdomain.com.conf

Step 2: Configuration for Accuracy

The default config is okay, but "okay" doesn't solve problems. You need to change a few specific directives to ensure you aren't just logging garbage data.

  • LogFile: Point this to your actual Apache log, usually /var/log/httpd/access_log.
  • LogFormat: Set this to 1 for "Combined" Apache format. If you use Nginx, you must match the log_format directive in nginx.conf to what AWStats expects.
  • DNSLookup: Set this to 1. This resolves IP addresses to hostnames. Warning: This slows down processing significantly. On a CoolVDS instance with low latency to local DNS resolvers, this is negligible, but on cheap hosting, this can add hours to your processing time.

Pro Tip: If you are running a cluster behind a load balancer, your logs might show the load balancer's IP instead of the visitor's. Ensure you install mod_rpaf for Apache so the X-Forwarded-For header is respected. Otherwise, AWStats will think 100% of your traffic comes from a single host: the load balancer.
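
The directives above can also be applied from the shell instead of editing by hand. This is a sketch under assumptions: set_directive is a hypothetical helper, the config path reuses the placeholder domain from Step 1, and it only rewrites lines that already exist uncommented at the start of a line, as these three do in the model file.

```shell
#!/bin/sh
# set_directive FILE NAME VALUE
# Replaces the value of an existing, uncommented "NAME=..." line in FILE.
# Commented lines ("#NAME=...") are left alone.
# VALUE must not contain "|", "&" or backslashes (they are special to sed).
set_directive() {
    tmp=$(mktemp) || return 1
    sed "s|^$2=.*|$2=$3|" "$1" > "$tmp" && cat "$tmp" > "$1"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Placeholder path from Step 1; adjust to your own domain.
conf=/etc/awstats/awstats.yourdomain.com.conf
if [ -w "$conf" ]; then
    set_directive "$conf" LogFile '"/var/log/httpd/access_log"'
    set_directive "$conf" LogFormat 1
    set_directive "$conf" DNSLookup 1
    # Print the resulting lines for a quick visual check.
    grep -E '^(LogFile|LogFormat|DNSLookup)=' "$conf"
fi
```

Writing through a temporary file and copying the result back keeps the original file's ownership and permissions intact.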

Step 3: Secure the Interface

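By default, AWStats is accessible via a CGI script. Do not leave this open to the world. Competitors can see your traffic spikes, your referrers, and your keywords. Lock it down in your Apache configuration. A <Directory> block is not allowed inside an .htaccess file, so put the following in httpd.conf or a file under conf.d. With Apache's default Satisfy All behaviour, a visitor must both come from the allowed IP and supply a valid login: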

<Directory /var/www/awstats/>
    Order deny,allow
    Deny from all
    # Your office IP (Apache comments must sit on their own line)
    Allow from 123.45.67.89
    AuthType Basic
    AuthName "Restricted Stats"
    AuthUserFile /etc/awstats/htpasswd
    Require valid-user
</Directory>
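
The AuthUserFile has to exist before Apache can check logins; htpasswd -c /etc/awstats/htpasswd USERNAME creates it. After reloading Apache, it is worth confirming that an anonymous request really is refused. The sketch below assumes curl is installed; STATS_URL is a placeholder you must set yourself, and is_denied is an illustrative helper.

```shell
#!/bin/sh
# is_denied CODE: succeeds when an HTTP status code means "access refused".
is_denied() {
    case $1 in
        401|403) return 0 ;;
        *)       return 1 ;;
    esac
}

# STATS_URL is a placeholder: set it to wherever your AWStats CGI is served,
# for example STATS_URL=http://yourdomain.com/awstats/awstats.pl
if [ -n "${STATS_URL:-}" ]; then
    # Fetch only the status code of an unauthenticated request.
    code=$(curl -s -o /dev/null -w '%{http_code}' "$STATS_URL")
    if is_denied "$code"; then
        echo "OK: anonymous request refused with HTTP $code"
    else
        echo "WARNING: anonymous request returned HTTP $code" >&2
    fi
fi
```

From an address outside the Allow list you should see 403. From the allowed address without credentials you should see 401.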

Local Compliance: The Norwegian Context

Hosting in Norway isn't just about speed; it is about the Personal Data Act (Personopplysningsloven). IP addresses can be considered personal data under Datatilsynet guidelines.

When you host with a US-based provider, you are sending log data across the Atlantic. By keeping your VPS in Oslo (like our CoolVDS infrastructure), you simplify compliance. However, you should still limit how long raw data is kept. Set PurgeLogFile=1 to have AWStats empty the log file after each update, but only if you archive logs elsewhere or do not need them. AWStats does not expire its own history files, so to limit how long historical statistics are kept, delete old monthly data files from the DirData directory.
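
One way to cap how long historical statistics are retained is to delete old AWStats data files on a schedule. A minimal sketch, assuming the history files follow the awstatsMMYYYY.<config>.txt naming and live in the directory named by DirData (often /var/lib/awstats for a packaged install, but check your config). purge_old_stats and the 400-day cutoff are illustrative, not an AWStats feature.

```shell
#!/bin/sh
# purge_old_stats DATA_DIR MAX_AGE_DAYS
# Deletes AWStats monthly history files not modified for MAX_AGE_DAYS days.
# Other files in DATA_DIR are left untouched.
purge_old_stats() {
    find "$1" -maxdepth 1 -type f -name 'awstats??????.*.txt' \
        -mtime +"$2" -exec rm -f {} \;
}

# Assumed DirData location; confirm it with grep '^DirData' on your config.
data_dir=/var/lib/awstats
if [ -d "$data_dir" ]; then
    purge_old_stats "$data_dir" 400
fi
```

A monthly file normally stops changing once its month is over, so modification time is a reasonable stand-in for the age of the data it holds.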

Step 4: Automating the Updates

Stats are useless if they are old. Set up a cron job to update the database every hour. Open your crontab with crontab -e:

0 * * * * /usr/bin/perl /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=yourdomain.com -update > /dev/null

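This runs the Perl script at the top of every hour. Monitor your server load the first few times it runs. A load average is only meaningful relative to your CPU count: 2.0 is busy on a single-core VPS and unremarkable on four cores. If the load stays well above your core count during updates, or I/O wait dominates, rotate your logs more often so each pass reads less, allocate more RAM, or move to a higher-tier plan.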
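
To keep the hourly update from competing with visitors, the cron entry can call a small wrapper that lowers the job's priority and skips a run if the previous one is still going. This is a sketch under assumptions: the wrapper path, lock directory and function names are hypothetical, and ionice is used only if present (it only has an effect with I/O schedulers that honor priorities, such as CFQ).

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as /usr/local/bin/awstats-update.sh

# run_low_priority CMD...: run CMD at the lowest CPU priority and, when
# ionice is usable, at the lowest best-effort I/O priority as well.
run_low_priority() {
    if command -v ionice >/dev/null 2>&1 && ionice -c2 -n7 true 2>/dev/null; then
        ionice -c2 -n7 nice -n 19 "$@"
    else
        nice -n 19 "$@"
    fi
}

# with_lock LOCKDIR CMD...: run CMD only if LOCKDIR can be created.
# mkdir is atomic, so two overlapping cron runs cannot both get the lock.
with_lock() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lockdir"
        return $status
    fi
    echo "previous update still running, skipping" >&2
    return 0
}

# Script path and config name are the ones used in the cron line above.
awstats=/usr/share/awstats/wwwroot/cgi-bin/awstats.pl
if [ -f "$awstats" ]; then
    with_lock /var/run/awstats-update.lock \
        run_low_priority /usr/bin/perl "$awstats" -config=yourdomain.com -update >/dev/null
fi
```

The crontab line would then call the wrapper instead of perl directly. If the server crashes mid-update, the lock directory is left behind and has to be removed by hand before updates resume.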

Why Infrastructure Matters

Log analysis is basically a stress test for your storage subsystem. If you are on a shared host, the "neighbor effect" means someone else's log processing can slow down your database queries. This is unacceptable for professional environments.

At CoolVDS, we use KVM virtualization to ensure strict resource isolation. We combine this with high-performance RAID storage to handle the heavy read/write operations required by tools like AWStats. Whether you are fighting off a DDoS attack or analyzing the success of a marketing campaign, you need hardware that doesn't blink.

Ready to take control of your data? Don't settle for sluggish I/O. Deploy a high-performance VPS Norway instance on CoolVDS today and see what your logs have been trying to tell you.
