Postgres data on a shared file system

From: "Peter Koczan" <pjkoczan(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Postgres data on a shared file system
Date: 2007-04-19 22:05:36
Message-ID: 4544e0330704191505pec50feen93138b964462920e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Just for funsies, today I decided to try and serve postgres data out of AFS
(just a tablespace, not an entire database cluster). For those of you not in
the know, AFS is a distributed, networked file system. It's like NFS, but
with differences in structure, permissions, quotas, administration, and many
other things that are beyond the scope of this list (you can find details
yourself if you're so inclined). I wanted to see if there were any potential
benefits, how it compared in terms of performance to a standard local-disk
cluster, and what hacks had to be put into place for things to work.

Does anyone else have experience trying to serve clusters or data from
shared file systems? I'd like to see how your experiences compare to mine.

- AFS uses Kerberos for authentication, and you need to have all the
Kerberos bits and tickets set up for the postgres system user in order to
have proper authentication for AFS. If you don't, you'd have to give
system:anyuser (the AFS equivalent of global access) write access to the
appropriate directories, which is bad for many reasons.
- As a result of the previous point, you can't use pg_ctl to start the
server. pg_ctl doesn't pass Kerberos credentials. I had to start the server
using the postgres binary and precede it with a ticket passing mechanism
(mine is called runauth).
- AFS was about 8% slower in both reading and writing data (both the local
disk and the AFS file server had 80 GB, 7200 RPM disks in RAID 1, and the
logs were on a separate disk on the local file-system, so this is a fairly
scientific metric).
- You can't take advantage of the shared file system because you can't share
tablespaces among clusters or servers. You'd either get an error for trying
to initialize a non-empty directory, or if you hack it to try and fool the
clusters you could wind up with data corruption and race conditions since
the clusters don't play nice.
- You can't take advantage of AFS replication (where you can put read-only
copies of files in one tree and read-write copies in another tree) since
trying to point a cluster at a read-only tablespace requires you to write to
that tree, which you can't do since it's read-only. You're better off using
something else for database replication.

All in all, while the prospect of using existing shared disk space was
intriguing, I couldn't take advantage of any features that would make it
truly worthwhile, there was a non-trivial performance hit, and it required a
few small hacks to get working.

But what about all of you, what are your thoughts and experiences with
trying to serve databases out of shared file systems?

Peter

Browse pgsql-admin by date

  From Date Subject
Next Message Mario Splivalo 2007-04-20 11:52:11 createdb: database postgres does not exists
Previous Message Alvaro Herrera 2007-04-19 21:35:55 Re: Auto vacuum