This is a list, in no particular order, why I love PostgreSQL. Note that many of these things are far better explained in the excellent documentation of PgSQL, so you can look up things there.
- First, PgSQL takes types seriously. Data is typed from the outset and you need to convert data to match the given type. This means we rule out dates like '2009-02-31' by default. MySQL fails this blatantly for instance. Also, it allows us to import data into a typed language fairly easily as soon as we have a type mapping defined.
- PgSQL has TOAST-storage. This includes several techniques like compression and out-of-line storage. What this means is that you can process data faster since disk I/O is what tend to kill database performance as soon as your data get beyond 'fits-in-memory'.
- PgSQL can do partial indexes. I keep partial indexes on unprocessed data. Hence, even if the table contains billions of rows, I can quickly grab the unprocessed records and process them. I also keep some partial indexes around for performance.
- PgSQL does bitmap index scans. This can combine several indexes into a bitmap telling the system what rows or pages to retrieve. Again, it lowers the amount of disk I/O needed to access data. Wonderful for complex queries.
- PgSQL uses Multi-version concurrency control (MVCC). This means you can do backups of your database without having to worry too much about what happens while you take the backup. It also means you can be sure that your ACID compliance is done correctly without affecting the performance of individual queries.
- You can disable autocommit by default in the psql client. No more accidentially destroying data.
- Data Definitions (the DDL) are transaction safe! You can undo an ALTER TABLE statement if it goes wrong. It makes data manipulation into a blissful operation: Alter the table, CREATE VIEW to get the old view of the data back. COMMIT the transaction. You can do this on a live system, which is wonderful.
- Need to add millions of rows? Use the COPY command (or its psql client \copy cousin). It is a lot faster than INSERT INTO for large amounts of data. My work desktop machine with an old SATA disk imports millions of (rather large) rows in a matter of minutes. This including raising a couple of indexes on the data.
- Need to export data? CREATE TEMPORARY TABLE export ( ... ), fill it with a SELECT statement, fire up COPY. Done. I've built many reports directly in the guts of the psql prompt with a couple of VIEWs and this technique.
- PgSQL is fast, especially for complex queries. This is especially true for recent PostgreSQL releases, where performance was the optimization target. And it paid off. Combine it with a query cache or a dedicated key-value store and you have a very stable, robust and fast platform which can scale to thousands of simultaneous users.
- Built-in full-text-search. I've used this to answer search queries from users on the data with ease.
And I get to combine PgSQL with Ocaml, awk and R. Don't underestimate the data manipulation power of awk, it is extremely fast.
