Re: unlogged tables

From:	Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: unlogged tables
Date:	2010-12-07 22:43:29
Message-ID:	AANLkTi=KpVN_NCRPT8bdYr3uoagf4VpvrmCZ-YikfFnz@mail.gmail.com
Lists:	pgsql-hackers

2010/12/7 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, Dec 7, 2010 at 3:44 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> On Tue, Dec 7, 2010 at 1:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Hm... I thought there had been discussion of a couple of different
>>>> flavors of table volatility. Is it really a good idea to commandeer
>>>> the word "volatile" for this particular one?
>>
>>> So far I've come up with the following possible behaviors we could
>>> theoretically implement:
>>
>>> 1. Any crash or shutdown truncates the table.
>>> 2. Any crash truncates the table, but a clean shutdown does not.
>>> 3. A crash truncates the table only if it's been written since the
>>> last checkpoint; a clean shutdown does not truncate it.
>>
>>> The main argument for doing #1 rather than #2 is that we'd rather not
>>> have to include unlogged table data in checkpoints. Andres Freund
>>> made the argument that we could avoid that anyway, though, by just
>>> doing an fsync() on every unlogged table file in the cluster at
>>> shutdown time. If that's acceptable, then ISTM there's no benefit to
>>> implementing #1 and we should just go with #2. If it's not
>>> acceptable, then we have to think about whether and how to have both
>>> of those behaviors.
>>
>>> #3 seems like a lot of work relative to #1 and #2 for a pretty
>>> marginal increase in durability.
>>
>> OK. I agree that #3 adds a lot of complexity for not much of anything.
>> If you've got data that's static enough that #3 adds a useful amount
>> of safety, then you might as well be keeping it in a regular table.
>>
>> I think a more relevant question is how complicated it'll be to issue
>> those fsyncs --- do you have a concrete implementation in mind?
>
> It can reuse most of the infrastructure we use for re-initializing
> everything after a crash or unclean shutdown. We just iterate over
> every tablespace/dbspace directory and look for files with _init forks.
> If we find any then we open the main fork files and fsync() each one.

It might make sense to document this behavior: a 'simple' restart
might take much longer than before. In that situation (if the
unlogged-volatile tables are large), I would probably run sync(1)
before restarting the server.
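
To make the cost concrete, here is a minimal sketch of the scan Robert
describes, written in Python rather than the server's C. It is not the
actual patch. It assumes the usual on-disk naming, where a relation's
main fork is a file named by its numeric relfilenode (with further
segments as .1, .2, ...) and the init fork carries an _init suffix.
The function name fsync_unlogged_main_forks and the single-directory
scope are made up for illustration.

```python
import os
import re

# Assumed naming: "<relfilenode>_init" marks an unlogged relation,
# "<relfilenode>" and "<relfilenode>.<segno>" are its main-fork files.
INIT_FORK = re.compile(r"^(\d+)_init$")
MAIN_FORK = re.compile(r"^(\d+)(?:\.\d+)?$")


def fsync_unlogged_main_forks(db_dir):
    """fsync every main-fork file in db_dir whose relation has an _init fork.

    Returns the sorted list of file names that were synced.
    """
    names = os.listdir(db_dir)

    # First pass: collect the relfilenodes that have an init fork.
    unlogged = set()
    for name in names:
        match = INIT_FORK.match(name)
        if match:
            unlogged.add(match.group(1))

    # Second pass: open and fsync the main-fork segments of those relations.
    synced = []
    for name in sorted(names):
        match = MAIN_FORK.match(name)
        if match and match.group(1) in unlogged:
            fd = os.open(os.path.join(db_dir, name), os.O_RDWR)
            try:
                os.fsync(fd)  # blocks until the file's dirty pages are on disk
            finally:
                os.close(fd)
            synced.append(name)
    return synced
```

Each os.fsync() call blocks until that file's dirty pages reach disk,
which is why a shutdown with large, recently written unlogged tables
could take noticeably longer. Running sync(1) beforehand would
presumably leave less for these calls to flush.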

--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support

In response to

Re: unlogged tables at 2010-12-07 22:09:18 from Robert Haas
