Skip to content

Bruce Momjian: Momjian on Postgres
Syndicate content
Bruce Momjian: Postgres Blog
Updated: 2 days 11 hours ago

Postgres Coming to the International Space Station

Fri, 07/16/2010 - 16:00

An email list posting this week revealed that Postgres will be used on the International Space Station by the end of the year, specifically "to store the data on orbit and then replicate that db on the ground". Postgres has always been heavily used by NASA and other government agencies, but this is a uniquely "high"-profile use of Postgres.

Categories: Companies, Open Source

PostgreSQL 9.0 Illustrated

Tue, 06/29/2010 - 17:00

Some Postgres users have created an detailed outline of Postgres 9.0 features with practical examples of almost every feature. If you are curious about what is coming in 9.0 or are confused by the 9.0 release notes, this is a must read.

Categories: Companies, Open Source

Another Drexel University Course

Wed, 06/23/2010 - 17:00

I am teaching another Drexel University database class this summer called Database Theory. This class builds on the Database Systems class I taught last summer, and again heavily uses Postgres. The class covers many of the practical challenges of database management, like replication, internals, security, and managing large systems. As you can see from the syllabus, I was able to enlist the help of local open source contributors as guest lecturers. The first class is tomorrow.

Categories: Companies, Open Source

Postgres Pool Party

Wed, 06/16/2010 - 17:00

If you are near Philadelphia, you are invited to attend the 2010 Postgres Pool Party at my home:

  • When: Saturday, July 31, 2010, 2pm to 7pm
  • Where: my home in Newtown Square, Pennsylvania (directions)
  • What: barbecue, swimming, and most importantly, good conversation

All Postgres users, developers, and groupies are invited, including their families. No need to RSVP — just come.

Categories: Companies, Open Source

Updated Presentation: The Magic of Hot Streaming Replication

Wed, 06/16/2010 - 17:00

I have simplified my presentation, The Magic of Hot Streaming Replication, to show a hot/streaming setup without the need to archive WAL files. This greatly simplifies the setup, and is possible thanks to wal_keep_segments, which allows the retention of a sufficient number of WAL files in /pg_xlog for the slave without the requirement of configuring continuous archiving.

When I originally researched hot standby and streaming replication, I focused on it as an extension of continuous archiving, but I now realize that continuous archiving is unnecessary and overly-complicates the setup if all you need is a read-only, standby server, and don't need the point-in-time recovery features of continuous archiving.

Categories: Companies, Open Source

Postgres on Amazon's EC2

Mon, 05/31/2010 - 17:00

This email thread discusses the advantages (ease of management) and disadvantages (uneven performance, poor reliability) of running Postgres on Amazon's Elastic Compute Cloud (EC2) using Elastic Block Stores (EBS). It includes the idea of using multiple EBS volumes in RAID10, and using multiple EC2 database clusters for redundancy. One quote is telling, "It's possible to run a larger database on EC2, but it takes a lot of work, careful planning and a thick skin." A similar thread from a year ago had a similar cost/benefit analysis.

Categories: Companies, Open Source

Financial Disincentive

Wed, 05/12/2010 - 17:00

While conventional economics theorizes that increased rewards always leads to better output, this video suggests that many activities, particularly creative ones, produce worse output with increased rewards. The video shows various cases where this can happen, and even interviews open source contributors as a tangible example of people working without financial reward.

This video answers a question I have been asked many times, "Why do people contribute to open source?", and it answers it better than I ever have. Basically, once people have adequate compensation, people are often motivated more by creative challenges than by money. A corollary of this is that financial reward can distort creative output. This might explain why the Postgres user and developer experience feels different from proprietary software — because there are few financial distortions to interfere with the creative work of designing and using Postgres.

Continue Reading »

Categories: Companies, Open Source

Pg_Migrator Included in Postgres 9.0, Renamed to Pg_Upgrade

Wed, 05/12/2010 - 17:00

After community discussion, pg_migrator has been renamed to "pg_upgrade" and will be an additional supplied module (/contrib) in the main server distribution of Postgres 9.0 (documentation). (It will appear in Postgres 9.0 beta 2.) Pg_Upgrade supports upgrades from Postgres 8.3.X and 8.4.X to Postgres 9.0. As you can see from the documentation, many of the restrictions of upgrading from 8.3 to 8.4 are gone, even for upgrades from Postgres 8.3 directly to 9.0.

This signals a more serious committment from the community to try to provide non-dump/restore major upgrades. The pg_migrator pgFoundry project will continue, but will not be enhanced to support new major versions of Postgres — it will remain only to support upgrades to Postgres 8.4.

View Comments
Categories: Companies, Open Source

New Replication Option

Mon, 04/26/2010 - 17:00

It is rare for me to hear about a new replication option, but I heard about one involving pgpool-II last week at the New York PostgreSQL Meetup Group. pgpool-II is not only a connection pooler but supports load balancing by sending data-modification queries to every host, while it load-balances read-only queries across all hosts.

One major problem with this usage is that non-deterministic functions, e.g. random(), CURRENT_TIMESTAMP, can produce different results when sent to multiple hosts. pgpool-II 2.3 and later handles some of these cases, but not all of them. Fully-correct operation requires either developers or administrators to handle unsupported non-deterministic functions. This new replication option involves using pgpool-II in 'master_slave_mode', which causes all data modification queries to be sent to the master, and the master replicates those changes (without non-deterministic side-effects) to the other slave nodes in the pgpool cluster. This diagram (page 19) illustrates this setup.

Continue Reading »

Categories: Companies, Open Source

Creeping to Beta

Mon, 04/26/2010 - 17:00

With the final Postgres 9.0 commitfest completed two months ago, you might be wondering what is delaying 9.0 beta 1. Along with the normal cleanup of open items, most of the time has been spent finalizing the behavior of our two big 9.0 features, streaming replication and hot standby (which I mentioned previously; presentation available).

Someone asked me to explain what that testing process looks like, and the most graphic description I could think of was a massive automobile accident (photo; those fire trucks are our developers analyzing the failure ). It is as ugly as an automobile accident, e.g.:

  • server doesn't start if I stop it while doing a backup with a slave connected
  • slave cannot connect if configuration options are not set correctly, but poor feedback is given
  • slave shows massive overhead when applying streaming logs
  • configuration variables are unsettable and confusing to a normal user

Continue Reading »

Categories: Companies, Open Source

VM Me

Mon, 04/19/2010 - 17:00

Initially the lure of Virtual Machines (VM) was to allow multiple operating systems to run simultaneously on a laptop. This is particularly useful if you need to run many applications that are only supported certain operating systems, e.g. MS Office. VMs have also always been popular for testing and development. (This video presents a simplified explaination of virtualization.)

However, recently VM usage has grown and are now regularly used on production servers. This recent community email post highlights the advantages of using VMs for production deployments:

  • fewer production servers (e.g. only two servers can host all VMs)
  • simpler deployments (administrators can easily copy VMs to production servers)
  • more reliable deployments (VMs provide a self-contained environment, with fewer potential problems during deployment)
  • better scalability (VMs can be given more server resources as necessary)
  • easier testing (VMs can be copied and tested independently of the deployment server)
  • improved reliability (VMs can be easily started on standby servers once they become production servers)

Continue Reading »

Categories: Companies, Open Source

SSDs For Your Database?

Tue, 04/13/2010 - 17:00

Solid-state drives (SSD) are getting larger and more popular. Initially, SSDs appeared to be ideal for databases because they potentially allow fast I/O with non-volatile storage — unfortunately, neither of these is completely true. Let me explain:

Fast I/O: While SSDs offer random I/O speeds far in excess of traditional hard drives (because there are no moving platters or heads), the sequential I/O speed of SSDs is only marginally better than mechanical drives. Database activity that causes random I/O, like index scans that do not fit in RAM, will benefit from SSDs' superior random I/O speeds, but sequential scans only marginally benefit by using SSDs. (Greg Smith explains the limited use-case for SSDs in this email.) Postgres 9.0 will allow random page cost to be set per tablespace so administrators can indicate that random I/O has the same speed as sequential I/O for SSD-based tablespaces.

Continue Reading »

Categories: Companies, Open Source

Learning from NoSQL

Sat, 04/10/2010 - 17:00

Having just blogged about NoSQL, I think the primary lesson the Postgres community can learn from NoSQL is that not everyone needs all our features, and if it can be easily accomplished, we should allow users to trade features for some of the NoSQL desired behaviors. We already do that in some cases:

  • You can improve speed by delaying or eliminating durability (synchronous_commit, fsync)
  • You can reduce consistency overhead by not defining constraints
  • You can use prepared queries to avoid parser and optimizer overhead
  • Arrays can often be used to avoid join overhead
  • You can store non-structured data with hstore
  • You can serve stale data using multi-master asynchronous replication (Bucardo)

However, there are a few things that will be difficult to allow:

  • Non-SQL access to data
  • Reduced overhead by eliminating atomicity or isolation

Continue Reading »

Categories: Companies, Open Source

Settling for NoSQL

Sat, 04/10/2010 - 17:00

There is lots of buzz about NoSQL databases these days, and thanks to my attendance at Emerging Technologies for the Enterprise Conference, I have learned more about them. I heard a talk about Cassandra, and another about MongoDB.

NoSQL databases differ in several ways from relational databases:

  • crude or no query language (clients do much of the processing normally done by SQL)
  • no joins (data must be joined client-side)
  • sacrifice ACID and transaction properties for speed, fault tolerance, or easy addition and removal of nodes

Continue Reading »

Categories: Companies, Open Source

MVCC For Programming Languages

Thu, 04/08/2010 - 17:00

Multiversion Concurrency Control (MVCC) is the method used by Postgres to provide high read/write concurrency. You might not have heard of it, but if you have heard the Postgres locking behavior described as "readers don't block writers, and writers don't block readers", MVCC is what allows that behavior. Postgres was an early implementor of MVCC, but now most relations systems use it.

I attended the Emerging Technologies for the Enterprise Conference today and heard an interesting talk related to MVCC. It applied the ideas of MVCC (which allows for high concurrency) to the problem of parallel programming, which is becoming more necessary as massively multi-core systems become more common.

Continue Reading »

Categories: Companies, Open Source

Speaking in New York City

Tue, 03/30/2010 - 17:00

I am speaking at New York PostgreSQL Meetup Group on April 20th about Postgres replication solutions. This will be a much larger group than previous New York City Postgres meetings; 13 people have already registered.

View Comments
Categories: Companies, Open Source

PG East: Slides

Tue, 03/30/2010 - 17:00

My slides from PG East are now online, and I will be giving a webcast of that presentation in late April. The presentation covers the new Postgres 9.0 features of hot standby and streaming replication: how they work, how to configure them, and their current limitations. The presentation also includes a live demonstration of setting up these features.

View Comments
Categories: Companies, Open Source

PG East: Hotel

Tue, 03/30/2010 - 17:00

PG East was an unqualified success, as others have already blogged about. This conference might signal a change in the style for dedicated Postgres conferences. Postgres-specific conferences started only a few years ago, mostly at universities. I was one of the early proponents of having conferences at universities after seeing the success of FOSDEM. I felt the low admission cost made possible with inexpensive university facilities was critical in attracting the mostly-volunteer Postgres developers to the conference.

I am ready to rethink that suggestion having attended PG East. The conference cost only USD $125, and many things were definitely easier having the event in a hotel:

  • Informal conversations
  • Attracting business users to the show
  • Group excursions
  • Laptop and bag storage

Continue Reading »

Categories: Companies, Open Source