Stop Flying Blind: Advanced Server Log Analysis with AWStats on Linux

If you are relying solely on Google Analytics to monitor your infrastructure, you aren't seeing the whole picture. JavaScript tags don't fire when a bot scrapes your content, they don't record 404 errors from broken internal links, and they certainly don't tell you when a script kiddie is probing your site for SQL injection vulnerabilities. As a system administrator, you need the raw, unadulterated truth. You need server logs.

But raw logs are ugly. Gigabytes of text in /var/log/httpd/ are useless unless you can parse them efficiently. That is where AWStats (Advanced Web Statistics) comes in. Unlike Webalizer, which feels like a relic from the 90s, AWStats offers decent visualization and plugin support. However, it is a Perl script, and if you configure it poorly on a high-traffic node, it will eat your CPU for breakfast.

The Reality of Log Rotation and I/O

Here is the scenario I faced last week: A client on a budget VPS (hosted elsewhere, naturally) complained that their server froze every night at 04:02 AM. A quick check of /var/log/cron showed the AWStats update script triggering exactly then. The problem? They were parsing a 4GB access log on a slow SATA drive with limited RAM.

When Perl parses text, it hits the disk hard. On a shared environment with "noisy neighbors," your disk I/O wait times skyrocket, causing the web server to hang. This is why we built CoolVDS on high-performance RAID10 SAS 15k storage (and recently introduced Enterprise SSD tiers). We separate I/O paths so your log analysis never chokes your Apache processes. But if you aren't on our infrastructure yet, you need to optimize your config.

Installation and Critical Configuration

Let's assume you are running CentOS 5.5 or Debian 5 (Lenny). On CentOS, AWStats is not in the base repositories, so enable a third-party repository such as EPEL first. Then install the package:

yum install awstats
# or for Debian users
apt-get install awstats

Once installed, copy the model config file. Do not edit the original.

cp /etc/awstats/awstats.model.conf /etc/awstats/awstats.yourdomain.com.conf

The "DNS Lookup" Trap

This is the single most common mistake. Inside your config file, look for DNSLookup. By default, it might be set to 1 or 2. Set it to 0.

DNSLookup=0

If you leave this on, AWStats tries to perform a reverse DNS lookup for every IP address in your log file. If you have 50,000 visitors, that is 50,000 DNS queries. Your script execution time will jump from 30 seconds to 3 hours. If you absolutely need resolved hostnames, use the logresolvemerge tool offline, or rely on a local GeoIP database plugin.
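As a rough sketch, the setting can be applied and verified from the shell rather than by hand. The set_directive helper below is hypothetical (it is not part of AWStats). It assumes plain key=value lines and values that contain no "|" or "&" characters.

```shell
#!/bin/sh
# Hypothetical helper, not part of AWStats: set "key=value" in a config file.
# Rewrites an existing uncommented line for the key, or appends one if none exists.
# Assumes the value contains no "|" or "&" (both are special in the sed expression).
set_directive() {
    conf=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key}=${value}|" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    rm -f "$tmp"
    grep -q "^${key}=" "$conf" || printf '%s=%s\n' "$key" "$value" >> "$conf"
}

# Example usage (path assumes the copy step earlier in this post):
# set_directive /etc/awstats/awstats.yourdomain.com.conf DNSLookup 0
# grep '^DNSLookup=' /etc/awstats/awstats.yourdomain.com.conf
```

Writing through a temporary file and then copying the contents back keeps the original file's ownership and permissions intact.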

Spotting Anomalies and Attacks

AWStats isn't just about counting visitors; it's a security audit tool. Navigate to the "Robots/Spiders" section. Is a specific User-Agent consuming 40% of your bandwidth? That's not a user; that's a scraper costing you money.
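To cross-check what the report shows, the same figure can be approximated straight from the raw log. This is a minimal sketch, assuming Apache's "combined" log format and no escaped quotes inside the request or referer fields. The function name bandwidth_by_agent is made up for illustration.

```shell
#!/bin/sh
# Sum response bytes per User-Agent from an Apache "combined" format log.
# Splitting on double quotes: field 3 is " status bytes ", field 6 is the User-Agent.
# Usage: bandwidth_by_agent LOGFILE [HOW_MANY]   (defaults to the top 10)
bandwidth_by_agent() {
    awk -F'"' '
        {
            split($3, s, " ")      # s[1] = status, s[2] = bytes ("-" counts as 0)
            bytes[$6] += s[2]
        }
        END {
            for (ua in bytes) printf "%.0f\t%s\n", bytes[ua], ua
        }
    ' "$1" | sort -rn | head -n "${2:-10}"
}

# Example usage (log path is an assumption; adjust to your vhost):
# bandwidth_by_agent /var/log/httpd/access_log 5
```

Each output line is the total byte count, a tab, and the User-Agent string, sorted with the heaviest consumer first.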

Look at the HTTP Status Codes. A spike in 404 Not Found errors often indicates a vulnerability scanner looking for phpmyadmin, wp-admin, or known exploit scripts. If you see this pattern coming from a specific IP block, block it in iptables immediately.
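Before blocking anything, it helps to confirm which addresses are actually behind the 404 spike. The sketch below assumes Apache's common or combined log format, where the first whitespace-separated field is the client IP and the ninth is the status code. Malformed request lines can shift those fields. The name top_404_ips is hypothetical.

```shell
#!/bin/sh
# List the client IPs that generated the most 404 responses, as "count ip".
# Usage: top_404_ips LOGFILE [HOW_MANY]   (defaults to the top 10)
top_404_ips() {
    awk '$9 == 404 { print $1 }' "$1" |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }' | head -n "${2:-10}"
}

# Example usage (log path is an assumption; adjust to your vhost):
# top_404_ips /var/log/httpd/access_log 5
#
# To drop traffic from an offending block (203.0.113.0/24 is a documentation range):
# iptables -I INPUT -s 203.0.113.0/24 -j DROP
```

Check the list against known crawlers and your own monitoring hosts first, since a broken internal link can also push a legitimate client to the top.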

Pro Tip: Integration with ModSecurity. If you are running managed hosting environments, configure your ModSecurity audit logs to output in a format AWStats can read. It allows you to visualize attacks blocked by your firewall over time.

Privacy and Local Compliance (Norway)

Operating in Norway or the broader EU requires adherence to strict privacy standards like the Personal Data Act (Personopplysningsloven). While we don't have a pan-European "GDPR" yet, the Norwegian Data Inspectorate (Datatilsynet) is clear about storing personally identifiable information. IP addresses can be considered personal data.

To stay compliant, enable the GeoIP plugin (loaded through LoadPlugin in the config) for country stats, but consider masking the last octet of IP addresses in your reports if you expose them to third parties. Keeping data within Norwegian borders is also a safe bet for legal jurisdiction.
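One way to mask the last octet is to filter log lines before they are shared or archived. This is a sketch only, assuming the client IPv4 address is the first field on each line. The function name mask_last_octet is made up for illustration, and the file names in the usage line are placeholders.

```shell
#!/bin/sh
# Zero the last octet of a leading IPv4 address on each line (stdin to stdout).
# IPv6 addresses and IPs that appear later in the line are left untouched.
mask_last_octet() {
    sed 's/^\([0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\)\.[0-9]\{1,3\}/\1.0/'
}

# Example usage (file names are placeholders):
# mask_last_octet < access_log > access_log.masked
```

If the masked log is what AWStats parses, visitors sharing the same /24 are counted as one host, so unique-visitor figures will drop.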

Performance: The CoolVDS Difference

Processing logs is a brute-force activity. It demands high read speeds. Many providers oversell their virtualization, piling hundreds of customers onto a single disk array. When everyone's cron jobs fire at 4:00 AM, the array crawls.

At CoolVDS, we use Xen virtualization. Unlike OpenVZ, Xen provides better isolation of resources. Combined with our low-latency network connected directly to NIX (Norwegian Internet Exchange) in Oslo, your management tasks run without waiting on someone else's workload. We prioritize disk throughput so you can parse a month's worth of logs in seconds, not hours.

Automating the Update

Finally, ensure your stats are updated automatically, but be smart about it. Don't run it at the top of the hour when everyone else does. Pick a weird time, like 04:17 AM.

17 04 * * * /usr/bin/perl /usr/lib/cgi-bin/awstats.pl -config=yourdomain.com -update > /dev/null
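If the update still competes with the web server for CPU and disk, the command from the cron entry above can be run at reduced priority. The wrapper below is a hedged sketch: low_prio is a hypothetical name, nice comes from coreutils, and ionice from util-linux. The idle I/O class only takes effect on I/O schedulers that honor priorities, such as CFQ.

```shell
#!/bin/sh
# Run a command at the lowest CPU priority and, where it works, in the idle I/O class.
# Falls back to nice alone if ionice is missing or not permitted.
low_prio() {
    if ionice -c3 true >/dev/null 2>&1; then
        nice -n 19 ionice -c3 "$@"
    else
        nice -n 19 "$@"
    fi
}

# Example: the same update command as the cron entry above.
# low_prio /usr/bin/perl /usr/lib/cgi-bin/awstats.pl -config=yourdomain.com -update > /dev/null
```

Saved as a small script and called from cron, this lets the update take longer instead of starving Apache while it reads the log.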

Data is power, but only if you can process it. Don't let your monitoring tools become the bottleneck that takes down your site. If your current host struggles to grep a 500MB file without lagging, it's time to upgrade.

Need a platform that respects your need for raw power? Deploy a Xen instance on CoolVDS today and experience the stability of true hardware isolation.
