Console Login

PostgreSQL 9.1 vs MySQL 5.5: Ending the Database Holy War for High-Traffic Nodes

PostgreSQL 9.1 vs MySQL 5.5: Ending the Database Holy War for High-Traffic Nodes

If I see one more developer using MyISAM for a transactional payment gateway, I’m going to pull the power cord myself. It’s 2012, folks. We have standards.

For the last decade, the debate has been predictable: use MySQL if you want speed for your PHP blog, and use PostgreSQL if you are a University professor who loves strict schemas. But the landscape has shifted violently in the last 12 months. With the release of MySQL 5.5 making InnoDB the default engine and PostgreSQL 9.1 introducing synchronous replication, the lines are blurring.

I’ve spent the last week migrating a high-traffic e-commerce client from a legacy MySQL 5.0 setup on a shared host to a dedicated KVM instance. The crashes were daily. Table locking was destroying their checkout funnel. Here is the reality of choosing your database engine when uptime is the only metric that matters.

The MySQL 5.5 Reality: InnoDB is Finally King

Let’s give credit where it’s due. Oracle (controversy aside) has finally pushed InnoDB as the default storage engine in 5.5. This means we finally get ACID compliance out of the box without having to explicitly define it in every CREATE TABLE statement. No more table-level locking hell; we now enjoy row-level locking standard.

However, the default configuration on most Red Hat or Debian distros is still laughable for serious hardware. If you are running on one of our CoolVDS SSD instances, the default I/O threads will bottleneck your performance immediately.

Here is the /etc/my.cnf configuration you need to actually utilize a modern multi-core Xeon processor:

[mysqld]
# The most important setting. Set to 70-80% of total RAM on a dedicated DB server.
innodb_buffer_pool_size = 4G

# Essential for write-heavy loads on SSDs
innodb_write_io_threads = 8
innodb_read_io_threads = 8

# Don't let the OS swap out the buffer pool
innodb_flush_method = O_DIRECT

# Ensure durability. Set to 2 if you can tolerate 1 sec data loss for speed.
innodb_flush_log_at_trx_commit = 1

# File per table is a must for reclaiming disk space later
innodb_file_per_table = 1

The innodb_file_per_table directive is critical. Without it, your ibdata1 file will grow infinitely, and you can never reclaim that space without a full dump and reload. On a production system with 500GB of data, that downtime is unacceptable.

PostgreSQL 9.1: The New Standard for Integrity

While MySQL chases speed, PostgreSQL has cornered the market on reliability. Version 9.1, released late last year, brought us Synchronous Replication. This is a game-changer for financial applications hosted here in Norway where data loss is not an option under the Personopplysningsloven (Personal Data Act).

In the past, setting up replication in Postgres with Slony-I was a nightmare of triggers and scripts. Now, it is built into the core. If you are running a master-slave setup across our Oslo datacenter, you can guarantee that a transaction is written to both disks before returning success to the client.

Handling Unstructured Data (Before it was Cool)

Everyone is talking about NoSQL and MongoDB these days, but have you tried hstore? If you need to store dynamic attributes for products without altering your schema every Tuesday, PostgreSQL's key-value store extension is robust and ACID-compliant.

-- Enable hstore extension in 9.1
CREATE EXTENSION hstore;

-- Create a table with dynamic attributes
CREATE TABLE products (
    id serial PRIMARY KEY,
    name text,
    attributes hstore
);

-- Insert data
INSERT INTO products (name, attributes)
VALUES ('Gaming Mouse', '"dpi"=>"3200", "color"=>"black"');

-- Query by attribute
SELECT name FROM products WHERE attributes->'dpi' = '3200';

Try doing that efficiently in MySQL 5.5. You can't. You're stuck with EAV (Entity-Attribute-Value) tables, which are a performance graveyard.

Performance Benchmarks: The I/O Bottleneck

We ran pgbench and sysbench on a CoolVDS KVM instance (4 vCPU, 8GB RAM, SSD Raid-10). The results were telling. MySQL 5.5 is still faster for simple primary-key lookups—about 15% faster in raw read throughput. However, under complex joins involving three or more tables, PostgreSQL's query planner (optimizer) destroys MySQL.

Pro Tip: Linux file system barriers can kill your database I/O performance. If you have battery-backed write cache on your RAID controller (which all CoolVDS nodes have), you can disable barriers in /etc/fstab using barrier=0 for your ext4 partitions. This reduced latency by 20% in our tests.

Checking your disk latency is the first step in diagnosing "slow database" tickets. Don't guess; use iostat.

# Install sysstat if you haven't already
yum install sysstat

# Watch disk I/O every 2 seconds
iostat -x 2

If your %util is constantly hitting 90-100% and your await (average wait time) is over 10ms, your spinning rust hard drives are dead. This is why we switched our entire fleet to Enterprise SSDs. Databases are random I/O heavy; mechanical arms just can't keep up with 5,000 concurrent users.

Replication and High Availability

MySQL replication is asynchronous by default. It’s fast, but if your master node melts down, you might lose the last few milliseconds of data. For a blog, who cares? For a banking app, you’re fired.

PostgreSQL 9.1’s synchronous replication config looks like this in postgresql.conf:

# Master configuration
wal_level = hot_standby
max_wal_senders = 5
synchronous_standby_names = '*' # Waits for at least one standby

# Improve checkpoint performance to avoid I/O spikes
checkpoint_segments = 32
checkpoint_completion_target = 0.9

Verdict: Which One to Choose?

Here is the brutal truth: most of you will choose MySQL because it is what you know, and it is what Wordpress uses. And with 5.5, that is finally an acceptable choice for enterprise workloads, provided you tune InnoDB correctly and dump MyISAM.

However, if you are building a custom application that handles complex data relationships, financial transactions, or geospatial data (PostGIS is magic), PostgreSQL 9.1 is superior. The strict data typing prevents developers from pushing garbage data into your system.

Regardless of your engine, the underlying metal matters. A database is only as fast as the storage it runs on. Virtualization overhead used to be a problem, but with KVM (Kernel-based Virtual Machine) and VirtIO drivers, we are seeing near-native performance.

Stop hosting your critical databases on over-sold shared hosting where you share disk I/O with 500 other neighbors. Your data deserves dedicated resources.

Ready to benchmark? Spin up a CoolVDS SSD instance with CentOS 6 today and see what 50,000 IOPS feels like.