Data directory on read-only media

Lists: pgsql-hackers
From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Data directory on read-only media
Date: 2005-11-20 20:39:49
Message-ID: Pine.OSF.4.61.0511201815450.399601@kosh.hut.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I've been thinking about running postgres from read-only media. It's
handy for creating demo CDs etc. I hacked together a patch that allows
you to run Postgres without write permissions to the data directory.

Some changes are needed:

1. Force all transactions to be read-only, using the mechanisms alread in
place. This catches all the obvious INSERT/UPDATE/CREATE modifications.

2. Disable VACUUM, CLUSTER, and nextval/setval

3. Disable bgwriter and manual checkpoints.

4. Don't increment xid counter, just use a single xid for all
transactions. Otherwise we might reach the xid wrap-around point, and we
can't vacuum. It seems safe to me, am I missing anything?

5. Don't try to write buffers with commit hint modifications. Just discard
them.

6. Don't write the stats file, opts file, or user/group files.

7. Don't write free space map.

8. Disable two-phase commit.

9. Modify startup and shutdown routines so that they don't write control
file, checkpoint, or attempt recovery.

Tricky parts:

10. Temporary tables. Should they be created in /tmp instead of the
datadir, or just disabled? How about temporary views?

11. Temporary files in sorting and hash joins. Use /tmp?

12. Subtransactions. Even read-only transactions write to the
subtrans-files, don't they? Also, what happens to the subtrans system if
we reuse the same xid over and over again?

13. NOTIFY/LISTEN. I guess most people use NOTIFY/LISTEN for cache
coherency etc, so I don't think many people will miss it in read-only
mode. And there has been plans to refactor it to an in-memory thing
anyway.

14. How to put the system to read-only mode? It needs to be something that
makes it impossible to run another postmaster in read/write mode at the
same time, without requiring write access to the data dir. I'm thinking of
a utility that creates a special postmaster.pid file. When a postmaster
starts up and sees that special postmaster.pid file, it starts in
read-only mode. The system must be in clean shutdowned state.

Does anyone remember anything else that writes to disk?

The patch does most of the above. There's no utility yet to create the
special postmaster.pid file. Use "echo readonly > data/postmaster.pid"
instead.

Here's the patch:
http://users.tkk.fi/~hlinnaka/pgsql/readonly-20051120.diff

- Heikki


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Data directory on read-only media
Date: 2005-11-20 20:53:38
Message-ID: 20106.1132520018@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> 5. Don't try to write buffers with commit hint modifications. Just discard
> them.

The performance costs of that alone are astonishing (ie, repeated
verifications of commit status).

I think what you are doing is a completely wrongheaded way to approach
it... it sounds incredibly invasive, messy, and fragile. A database or
tablespace that has been frozen (per VACUUM FREEZE) could sensibly be
put on read-only media, but I can't see doing the rest of this. Have
you thought about putting the more dynamic stuff onto a RAM disk?

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Data directory on read-only media
Date: 2005-11-21 04:23:30
Message-ID: 200511202023.30606.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki,

> 4. Don't increment xid counter, just use a single xid for all
> transactions. Otherwise we might reach the xid wrap-around point, and we
> can't vacuum. It seems safe to me, am I missing anything?

Well, it eliminates transaction isolation. Completely.

--
Josh Berkus
Aglio Database Solutions
San Francisco


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Data directory on read-only media
Date: 2005-11-21 07:59:38
Message-ID: 200511210859.38912.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Well, it eliminates transaction isolation. Completely.

If the data is read-only, you presumably don't need that.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Data directory on read-only media
Date: 2005-11-21 08:00:58
Message-ID: 200511210900.59214.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> I've been thinking about running postgres from read-only media. It's
> handy for creating demo CDs etc.

I think that a read-only installation of PostgreSQL would be a very poor
demonstration of its capabilities. Better put the data in a RAM disk.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: <pmagnoli(at)systemevolution(dot)it>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Tablespace on ramdisk
Date: 2005-11-21 08:26:51
Message-ID: iqaq4r.bqnpiy@mail.systemevolution.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,
does anyone have experiences about putting a tablespace on ramdisk? Does it
work (and keep working after a restart of the server)?
Thanks in advance for any insight.

Paolo


From: Tino Wildenhain <tino(at)wildenhain(dot)de>
To: pmagnoli(at)systemevolution(dot)it
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Tablespace on ramdisk
Date: 2005-11-21 09:37:53
Message-ID: 43819571.1060108@wildenhain.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

pmagnoli(at)systemevolution(dot)it schrieb:
> Hi,
> does anyone have experiences about putting a tablespace on ramdisk? Does it
> work (and keep working after a restart of the server)?
> Thanks in advance for any insight.
>
Yes it does work as long as you dont restart your server.
Postgres does not appreciate disappearing cluster data.

What are you trying to solve btw?

++Tino


From: <pmagnoli(at)systemevolution(dot)it>
To: Tino Wildenhain <tino(at)wildenhain(dot)de>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tablespace on ramdisk
Date: 2005-11-21 14:07:14
Message-ID: iqb5w2.pareyw@mail.systemevolution.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'd like to mimic MySQL's in-memory tables (engine=memory), which structure
survives a server restart (data lost of course).
I suspected that a server restart would be a problem in this case.
Thank you anyway.

Paolo

Tino Wildenhain <tino(at)wildenhain(dot)de> ha scritto

> pmagnoli(at)systemevolution(dot)it schrieb:
> > Hi,
> > does anyone have experiences about putting a tablespace on ramdisk? Does
it
> > work (and keep working after a restart of the server)?
> > Thanks in advance for any insight.
> >
> Yes it does work as long as you dont restart your server.
> Postgres does not appreciate disappearing cluster data.
>
> What are you trying to solve btw?
>
> ++Tino
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>


From: Tino Wildenhain <tino(at)wildenhain(dot)de>
To: pmagnoli(at)systemevolution(dot)it
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Tablespace on ramdisk
Date: 2005-11-21 14:12:49
Message-ID: 4381D5E1.8080007@wildenhain.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

pmagnoli(at)systemevolution(dot)it schrieb:
> I'd like to mimic MySQL's in-memory tables (engine=memory), which structure
> survives a server restart (data lost of course).
> I suspected that a server restart would be a problem in this case.
> Thank you anyway.

you could use temp tables... but usually it isnt worth the
trouble. Adjust your cache mem and stuff and often used
data will be in memory automatically.

HTH
Tino


From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Data directory on read-only media
Date: 2005-11-21 17:58:15
Message-ID: Pine.OSF.4.61.0511211925170.277354@kosh.hut.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 20 Nov 2005, Tom Lane wrote:

> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>> 5. Don't try to write buffers with commit hint modifications. Just discard
>> them.
>
> The performance costs of that alone are astonishing (ie, repeated
> verifications of commit status).

You'd vacuum first to avoid that, of course.

> I think what you are doing is a completely wrongheaded way to approach
> it... it sounds incredibly invasive, messy, and fragile. A database or
> tablespace that has been frozen (per VACUUM FREEZE) could sensibly be
> put on read-only media, but I can't see doing the rest of this. Have
> you thought about putting the more dynamic stuff onto a RAM disk?

Invasive: Most of the changes are small modifications to startup and
shutdown routines. I wouldn't call it too invasive. Probably less invasive
than enforcing that a tablespace is read-only, which would be a nice
feature of it's own.

Messy: Well, I guess it's in the eye of the beholder. It depends a lot on
the implementation details.

Fragile: It's certainly something you have to keep in mind whenever you
need to write something to disk. There's not that many places that do
that, I enumerated them in the original mail.

Using a RAM disk is harder for the user. You need to set up the RAM disk,
figure out what to copy to RAM disk and what not, and then be careful not
to change anything that's on the frozen tablespace.

What would a script look like that setups a RAM disk and runs postgres
from that? It's probably doable, but hard to do in a portable way.

There's also this TODO:
Allow a warm standby system to also allow read-only queries [pitr]

In fact, I was originally thinking of that. I should've mentioned it.
It has the same issues with transactions and WAL as running from
read-only media, so I decided to start with the seemingly easier case.

- Heikki


From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Data directory on read-only media
Date: 2005-11-21 18:03:34
Message-ID: Pine.OSF.4.61.0511211958270.277354@kosh.hut.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 21 Nov 2005, Peter Eisentraut wrote:

> Heikki Linnakangas wrote:
>> I've been thinking about running postgres from read-only media. It's
>> handy for creating demo CDs etc.
>
> I think that a read-only installation of PostgreSQL would be a very poor
> demonstration of its capabilities. Better put the data in a RAM disk.

RAM space is limited.

I was thinking more of a demonstration CD of some other software that uses
PostgreSQL as kind of an embedded database. Of course, there's other
databases more suited for embedding.

- Heikki


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Data directory on read-only media
Date: 2005-11-21 21:40:40
Message-ID: 200511212240.41356.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> > I think that a read-only installation of PostgreSQL would be a very
> > poor demonstration of its capabilities. Better put the data in a
> > RAM disk.
>
> RAM space is limited.

Nowadays, CD space is often more limited than RAM. You could of course
now talk about a DVD instead. I wonder what kind of performance you'd
get out of that anyway.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Data directory on read-only media
Date: 2005-11-21 21:52:01
Message-ID: 4567.1132609921@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> There's also this TODO:
> Allow a warm standby system to also allow read-only queries [pitr]

> In fact, I was originally thinking of that. I should've mentioned it.
> It has the same issues with transactions and WAL as running from
> read-only media, so I decided to start with the seemingly easier case.

No, it hasn't got anywhere near the same issues, primarily because you
don't have to have a 100% rock solid guarantee of no disk writes. Thus
for example there's no problem with updating hint bits, nor with
preventing use of temp files.

I concur with Peter's opinion that a PG demo that is hard-core read-only
is not going to be very useful. It certainly won't have a lot to do
with either the behavior or performance of the system in live use, which
would render it not only not very interesting but close to being false
advertising.

regards, tom lane