Re: unlogged tables

Lists: pgsql-hackers
From: Andy Colson <andy(at)squeakycode(dot)net>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: unlogged tables
Date: 2010-11-16 01:56:22
Message-ID: 4CE1E4C6.8050106@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I am attempting to test this

https://commitfest.postgresql.org/action/patch_view?id=424

but I'm not sure which version of PG this should be applied to. (it would be really neat, on here:
https://commitfest.postgresql.org/action/commitfest_view?id=8
if there was a note that said, this test this stuff against git tag X or branch Y or whatever)

I got the git:

git clone git://git.postgresql.org/git/postgresql.git

downloaded the patches, and applied them ok. then did ./configure and make

after much spewage I got:

bufmgr.c: In function 'PrefetchBuffer':
bufmgr.c:126:10: error: 'struct RelationData' has no member named 'rd_istemp'
make[4]: *** [bufmgr.o] Error 1

Just to make sure everything was ok with the original, I reset:

git reset --hard HEAD^
./configure
make
and all was well.

so I tried again:
make clean
make maintainer-clean

patch -p1 < relpersistence-v1.patch
.. ok ..

but then...

$ patch -p1 < unlogged-tables-v1.patch
patching file doc/src/sgml/indexam.sgml
patching file doc/src/sgml/ref/create_table.sgml
patching file doc/src/sgml/ref/create_table_as.sgml
patching file src/backend/access/gin/gininsert.c
patching file src/backend/access/gist/gist.c
patching file src/backend/access/hash/hash.c
patching file src/backend/access/nbtree/nbtree.c
patching file src/backend/access/transam/xlog.c
patching file src/backend/catalog/catalog.c
patching file src/backend/catalog/heap.c
patching file src/backend/catalog/index.c
patching file src/backend/catalog/storage.c
patching file src/backend/parser/gram.y
patching file src/backend/storage/file/Makefile
patching file src/backend/storage/file/copydir.c
patching file src/backend/storage/file/fd.c
The next patch would create the file src/backend/storage/file/reinit.c,
which already exists! Assume -R? [n]

That didnt happen the first time... I'm almost positive.

Not sure what I should do now.

-Andy


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 02:28:00
Message-ID: AANLkTimkstxbygFbGPboYj6G7bPk9VU==RtT-d74b-+1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 15, 2010 at 8:56 PM, Andy Colson <andy(at)squeakycode(dot)net> wrote:
> I am attempting to test this
>
> https://commitfest.postgresql.org/action/patch_view?id=424
>
> but I'm not sure which version of PG this should be applied to.  (it would
> be really neat, on here:
> https://commitfest.postgresql.org/action/commitfest_view?id=8
> if there was a note that said, this test this stuff against git tag X or
> branch Y or whatever)

They're pretty much all against the master branch.

> I got the git:
>
> git clone git://git.postgresql.org/git/postgresql.git
>
> downloaded the patches, and applied them ok.  then did ./configure and make
>
> after much spewage I got:
>
> bufmgr.c: In function 'PrefetchBuffer':
> bufmgr.c:126:10: error: 'struct RelationData' has no member named
> 'rd_istemp'
> make[4]: *** [bufmgr.o] Error 1

Woops. Good catch. I guess USE_PREFETCH isn't defined on my system.
That line needs to be changed to say RelationUsesLocalBuffers(reln)
rather than reln->rd_istemp. Updated patches attached.

> That didnt happen the first time... I'm almost positive.

When you applied the patches the first time, it created that file; but
git reset --hard doesn't remove untracked files.

> Not sure what I should do now.

git clean -dfx
git reset --hard
git pull

Apply attached patches.

configure
make
make install

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
unlogged-tables-v2.patch application/octet-stream 52.7 KB
relpersistence-v2.patch application/octet-stream 65.6 KB
relax-sync-commit-v1.patch application/octet-stream 3.1 KB

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 18:09:47
Message-ID: 4CE2C8EB.20807@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I was able to apply and compile and run ok, creating unlogged tables
seems to work as well.

I patched up pgbench to optionally create unlogged tables, and ran it
both ways. I get ~80tps normally, and ~1,500tps with unlogged. (Thats
from memory, was playing with it last night at home)

I also have a "real world" test I can try (import apache logs and run a
few stats).

What other things would be good to test:
indexes?
analyze/stats/plans?
dump/restore?

Is "create temp unlogged table stuff(...)" an option?

-Andy


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 18:12:20
Message-ID: 2329.1289931140@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andy Colson <andy(at)squeakycode(dot)net> writes:
> Is "create temp unlogged table stuff(...)" an option?

temp tables are unlogged already.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 18:34:55
Message-ID: AANLkTimjK31Fd1g3QGXMd31k=S4Q8h6zv+ApNY=2N0Sg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy(at)squeakycode(dot)net> wrote:
> I was able to apply and compile and run ok, creating unlogged tables seems
> to work as well.
>
> I patched up pgbench to optionally create unlogged tables, and ran it both
> ways.  I get ~80tps normally, and ~1,500tps with unlogged.  (Thats from
> memory, was playing with it last night at home)

What do you get with normal tables but with fsync, full_page_writes,
and synchronous_commits turned off?

What do you get with normal tables but with sychronous_commit (only) off?

Can you detect any performance regression on normal tables with the
patch vs. without the patch?

> I also have a "real world" test I can try (import apache logs and run a few
> stats).

That would be great.

> What other things would be good to test:
> indexes?
> analyze/stats/plans?
> dump/restore?

All of those. I guess there's a question of what pg_dump should emit
for an unlogged table. Clearly, we need to dump a CREATE UNLOGGED
TABLE statement (which we do), and right now we also dump the table
contents - which seems reasonable, but arguably someone could say that
we ought not to dump the contents of anything less than a
full-fledged, permanent table.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables
Date: 2010-11-16 18:58:13
Message-ID: 1289933804-sup-5878@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Robert Haas's message of mar nov 16 15:34:55 -0300 2010:
> On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy(at)squeakycode(dot)net> wrote:

> > dump/restore?
>
> All of those. I guess there's a question of what pg_dump should emit
> for an unlogged table. Clearly, we need to dump a CREATE UNLOGGED
> TABLE statement (which we do), and right now we also dump the table
> contents - which seems reasonable, but arguably someone could say that
> we ought not to dump the contents of anything less than a
> full-fledged, permanent table.

I think if you do a regular backup of the complete database, unlogged
tables should come out empty, but if you specifically request a dump of
it, it shouldn't.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables
Date: 2010-11-16 19:06:00
Message-ID: AANLkTik2NJWW5MXQx5QoFBM+5TRwvOsN7vRg8z-LC1aP@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Robert Haas's message of mar nov 16 15:34:55 -0300 2010:
>> On Tue, Nov 16, 2010 at 1:09 PM, Andy Colson <andy(at)squeakycode(dot)net> wrote:
>
>> > dump/restore?
>>
>> All of those.  I guess there's a question of what pg_dump should emit
>> for an unlogged table.  Clearly, we need to dump a CREATE UNLOGGED
>> TABLE statement (which we do), and right now we also dump the table
>> contents - which seems reasonable, but arguably someone could say that
>> we ought not to dump the contents of anything less than a
>> full-fledged, permanent table.
>
> I think if you do a regular backup of the complete database, unlogged
> tables should come out empty, but if you specifically request a dump of
> it, it shouldn't.

Oh, wow. That seems confusing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables
Date: 2010-11-16 20:50:42
Message-ID: 5521.1289940642@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
>> I think if you do a regular backup of the complete database, unlogged
>> tables should come out empty, but if you specifically request a dump of
>> it, it shouldn't.

> Oh, wow. That seems confusing.

I don't like it either.

I think allowing pg_dump to dump the data in an unlogged table is not
only reasonable, but essential. Imagine that someone determines that
his reliability needs will be adequately served by unlogged tables plus
hourly backups. Now you're going to tell him that that doesn't work
because pg_dump arbitrarily excludes the data in unlogged tables?

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables
Date: 2010-11-16 21:34:18
Message-ID: 4CE2F8DA.3010001@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/16/2010 02:06 PM, Robert Haas wrote:
> On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
>>
>> I think if you do a regular backup of the complete database, unlogged
>> tables should come out empty, but if you specifically request a dump of
>> it, it shouldn't.
> Oh, wow. That seems confusing.

Yeah. And unnecessary. If you want it excluded we already have a switch
for that.

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables
Date: 2010-11-16 21:43:43
Message-ID: AANLkTi=f9wO_cM59p2F1H9=HMt-ftbA=KPQDd+LefCyC@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 3:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Nov 16, 2010 at 1:58 PM, Alvaro Herrera
>>> I think if you do a regular backup of the complete database, unlogged
>>> tables should come out empty, but if you specifically request a dump of
>>> it, it shouldn't.
>
>> Oh, wow.  That seems confusing.
>
> I don't like it either.
>
> I think allowing pg_dump to dump the data in an unlogged table is not
> only reasonable, but essential.  Imagine that someone determines that
> his reliability needs will be adequately served by unlogged tables plus
> hourly backups.  Now you're going to tell him that that doesn't work
> because pg_dump arbitrarily excludes the data in unlogged tables?

Yeah, you'd have to allow a flag to control the behavior. And in that
case I'd rather the flag have a single default rather than different
defaults depending on whether or not individual tables were selected.
Something like --omit-unlogged-data.

Incidentally, unlogged tables plus hourly backups is not dissimilar to
what some NoSQL products are offering for reliability. Except with
PG, you can (or soon will be able to, hopefully) selectively apply
that lowered degree of reliability to a subset of your data for which
you determine it's appropriate, while maintaining full reliability
guarantees for other data. I am not aware of any other product which
offers that level of fine-grained control over durability.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:00:33
Message-ID: 4CE2FF01.8070208@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> Yeah, you'd have to allow a flag to control the behavior. And in that
> case I'd rather the flag have a single default rather than different
> defaults depending on whether or not individual tables were selected.
> Something like --omit-unlogged-data.

Are you sure we don't want to default the other way? It seems to me
that most people using unlogged tables won't want to back them up ...
especially since the share lock for pgdump will add overhead for the
kinds of high-volume updates people want to do with unlogged tables.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: David Fetter <david(at)fetter(dot)org>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:07:35
Message-ID: 20101116220735.GJ2596@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 02:00:33PM -0800, Josh Berkus wrote:
> > Yeah, you'd have to allow a flag to control the behavior. And in
> > that case I'd rather the flag have a single default rather than
> > different defaults depending on whether or not individual tables
> > were selected. Something like --omit-unlogged-data.
>
> Are you sure we don't want to default the other way?

+1 for defaulting the other way.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:08:00
Message-ID: 1289945280.31200.67.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
> It seems to me
> that most people using unlogged tables won't want to back them up ...
> especially since the share lock for pgdump will add overhead for the
> kinds of high-volume updates people want to do with unlogged tables.

Or perhaps most people will want them backed up, because them being
unlogged the backup is the only way to get them back in case of a crash?


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:12:10
Message-ID: 4CE301BA.4030801@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/16/10 2:08 PM, Peter Eisentraut wrote:
> On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
>> It seems to me
>> that most people using unlogged tables won't want to back them up ...
>> especially since the share lock for pgdump will add overhead for the
>> kinds of high-volume updates people want to do with unlogged tables.
>
> Or perhaps most people will want them backed up, because them being
> unlogged the backup is the only way to get them back in case of a crash?

Yeah, hard to tell, really. Which default is less likely to become a
foot-gun?

Maybe it's time for a survey on -general.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:15:47
Message-ID: 1289945747.10258.2955.camel@jd-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
> > Yeah, you'd have to allow a flag to control the behavior. And in that
> > case I'd rather the flag have a single default rather than different
> > defaults depending on whether or not individual tables were selected.
> > Something like --omit-unlogged-data.
>
> Are you sure we don't want to default the other way? It seems to me
> that most people using unlogged tables won't want to back them up ...

+1

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:22:02
Message-ID: 4CE3040A.7070608@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/16/2010 05:12 PM, Josh Berkus wrote:
> On 11/16/10 2:08 PM, Peter Eisentraut wrote:
>> On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
>>> It seems to me
>>> that most people using unlogged tables won't want to back them up ...
>>> especially since the share lock for pgdump will add overhead for the
>>> kinds of high-volume updates people want to do with unlogged tables.
>> Or perhaps most people will want them backed up, because them being
>> unlogged the backup is the only way to get them back in case of a crash?
> Yeah, hard to tell, really. Which default is less likely to become a
> foot-gun?
>
> Maybe it's time for a survey on -general.
>

I would argue pretty strongly that backing something up is much less
likely to be a foot-gun than not backing it up, and treating unlogged
tables the same as logged tables for this purpose is also much less
likely to be a foot-gun. As I pointed out upthread, we already have a
mechanism for not backing up selected objects. I'd much rather have a
rule that says "everything gets backed up by default" than one that says
"everything gets backed up by default except unlogged tables".

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables
Date: 2010-11-16 22:22:35
Message-ID: 7439.1289946155@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Nov 16, 2010 at 3:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I think allowing pg_dump to dump the data in an unlogged table is not
>> only reasonable, but essential.

> Yeah, you'd have to allow a flag to control the behavior. And in that
> case I'd rather the flag have a single default rather than different
> defaults depending on whether or not individual tables were selected.
> Something like --omit-unlogged-data.

As long as the default is to include the data, I wouldn't object to
having such a flag. A default that drops data seems way too
foot-gun-like.

regards, tom lane


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:23:35
Message-ID: 1289946215.10258.2973.camel@jd-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2010-11-17 at 00:08 +0200, Peter Eisentraut wrote:
> On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
> > It seems to me
> > that most people using unlogged tables won't want to back them up ...
> > especially since the share lock for pgdump will add overhead for the
> > kinds of high-volume updates people want to do with unlogged tables.
>
> Or perhaps most people will want them backed up, because them being
> unlogged the backup is the only way to get them back in case of a crash?

To me, the use of unlogged tables is going to be for dynamic, volatile
data that can be rebuilt from an integrity set on a crash. Session
tables, metadata tables, dynamic updates that are batched to logged
tables every 10 minutes, that type of thing.

I think Berkus has a good idea on asking general.

JD

>
>
>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: unlogged tables
Date: 2010-11-16 22:30:29
Message-ID: 201011162330.29993.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 16 November 2010 23:12:10 Josh Berkus wrote:
> On 11/16/10 2:08 PM, Peter Eisentraut wrote:
> > On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
> >> It seems to me
> >> that most people using unlogged tables won't want to back them up ...
> >> especially since the share lock for pgdump will add overhead for the
> >> kinds of high-volume updates people want to do with unlogged tables.
> >
> > Or perhaps most people will want them backed up, because them being
> > unlogged the backup is the only way to get them back in case of a crash?
>
> Yeah, hard to tell, really. Which default is less likely to become a
> foot-gun?
Well. Maybe both possibilities are just propable(which I think is unlikely),
but the different impact is pretty clear.

One way your backup runs too long and too much data changes, the other way
round you loose the data which you assumed safely backuped.

Isn't that a *really* easy decision?

Andres


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Andres Freund" <andres(at)anarazel(dot)de>, <pgsql-hackers(at)postgresql(dot)org>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>
Subject: Re: unlogged tables
Date: 2010-11-16 22:36:12
Message-ID: 4CE2B2FC020000250003785D@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> wrote:

> One way your backup runs too long and too much data changes, the
> other way round you loose the data which you assumed safely
> backuped.
>
> Isn't that a *really* easy decision?

Yeah. Count me in the camp which wants the default behavior to be
that pg_dump backs up all permanent tables, even those which aren't
WAL-logged (and therefore aren't kept up in PITR backups, hot/warm
standbys, or streaming replication).

-Kevin


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: unlogged tables
Date: 2010-11-16 22:39:35
Message-ID: 201011162339.36138.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 16 November 2010 23:30:29 Andres Freund wrote:
> On Tuesday 16 November 2010 23:12:10 Josh Berkus wrote:
> > On 11/16/10 2:08 PM, Peter Eisentraut wrote:
> > > On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
> > >> It seems to me
> > >> that most people using unlogged tables won't want to back them up ...
> > >> especially since the share lock for pgdump will add overhead for the
> > >> kinds of high-volume updates people want to do with unlogged tables.
> > >
> > > Or perhaps most people will want them backed up, because them being
> > > unlogged the backup is the only way to get them back in case of a
> > > crash?
> >
> > Yeah, hard to tell, really. Which default is less likely to become a
> > foot-gun?
>
> Well. Maybe both possibilities are just propable(which I think is
> unlikely), but the different impact is pretty clear.
>
> One way your backup runs too long and too much data changes, the other way
> round you loose the data which you assumed safely backuped.
>
> Isn't that a *really* easy decision?
Oh, and another argument:
Which are you more likely to discover: a backup that runs consistenly running
for a short time or a backup thats getting slower and larger...

Andres


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:41:59
Message-ID: 7946.1289947319@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Yeah, you'd have to allow a flag to control the behavior. And in that
>> case I'd rather the flag have a single default rather than different
>> defaults depending on whether or not individual tables were selected.
>> Something like --omit-unlogged-data.

> Are you sure we don't want to default the other way? It seems to me
> that most people using unlogged tables won't want to back them up ...

That's a very debatable assumption. You got any evidence for it?
Personally, I don't think pg_dump should ever default to omitting
data.

> especially since the share lock for pgdump will add overhead for the
> kinds of high-volume updates people want to do with unlogged tables.

Say what? pg_dump just takes AccessShareLock. That doesn't add any
overhead.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:47:22
Message-ID: 4CE309FA.3090900@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> That's a very debatable assumption. You got any evidence for it?
> Personally, I don't think pg_dump should ever default to omitting
> data.

Survey launched, although it may become a moot point, given how this
discussion is going.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: unlogged tables
Date: 2010-11-16 22:52:11
Message-ID: AANLkTi=i3ab=QwmKJSj3TpzNmrsNY1ddaCKnn3Q6337c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 5:30 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On Tuesday 16 November 2010 23:12:10 Josh Berkus wrote:
>> On 11/16/10 2:08 PM, Peter Eisentraut wrote:
>> > On tis, 2010-11-16 at 14:00 -0800, Josh Berkus wrote:
>> >> It seems to me
>> >> that most people using unlogged tables won't want to back them up ...
>> >> especially since the share lock for pgdump will add overhead for the
>> >> kinds of high-volume updates people want to do with unlogged tables.
>> >
>> > Or perhaps most people will want them backed up, because them being
>> > unlogged the backup is the only way to get them back in case of a crash?
>>
>> Yeah, hard to tell, really.   Which default is less likely to become a
>> foot-gun?
> Well. Maybe both possibilities are just propable(which I think is unlikely),
> but the different impact is pretty clear.
>
> One way your backup runs too long and too much data changes, the other way
> round you loose the data which you assumed safely backuped.
>
> Isn't that a *really* easy decision?

Yeah, it seems pretty clear to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: marcin mank <marcin(dot)mank(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-16 22:57:49
Message-ID: AANLkTi=vHnxy-_nU2_KZENDsrRDUs+W1KYXL4XowPEDp@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Can (should ?) unlogged tables' contents survive graceful (non-crash) shutdown?

Greetings
Marcin Mańk


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: marcin mank <marcin(dot)mank(at)gmail(dot)com>
Cc: Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 00:40:57
Message-ID: AANLkTi=xuuALCgPbL_G9S9CB+8+yHH-SZ5oyCR944SaP@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 5:57 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
> Can (should ?) unlogged tables' contents survive graceful (non-crash) shutdown?

I don't think so. To make that work, you'd need to keep track of
every backing file that might contain pages not fsync()'d to disk, and
at shutdown time you'd need to fsync() them all before shutting down.
Doing that would require an awful lot of bookkeeping for a pretty
marginal gain. Maybe it would be useful to have:

ALTER TABLE .. READ [ONLY|WRITE];

...and preserve unlogged tables that are also read-only. Or perhaps
something specific to unlogged tables:

ALTER TABLE .. QUIESCE;

...which would take an AccessExclusiveLock, make the table read-only,
fsync() it, and tag it for restart-survival.

But I'm happy to leave all of this until we gain some field experience
with this feature, and have a better idea what features people would
most like to see.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 00:46:43
Message-ID: 10729.1289954803@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Nov 16, 2010 at 5:57 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
>> Can (should ?) unlogged tables' contents survive graceful (non-crash) shutdown?

> I don't think so. To make that work, you'd need to keep track of
> every backing file that might contain pages not fsync()'d to disk, and
> at shutdown time you'd need to fsync() them all before shutting down.

This is presuming that we want to guarantee the same level of safety for
unlogged tables as for regular. Which, it seems to me, is exactly what
people *aren't* asking for. Why not just write the data and shut down?
If you're unlucky enough to have a system crash immediately after that,
well, you might have corrupt data in the unlogged tables ... but that
doesn't seem real probable.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 01:15:47
Message-ID: 4CE32CC3.7080705@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/16/10 4:40 PM, Robert Haas wrote:
> But I'm happy to leave all of this until we gain some field experience
> with this feature, and have a better idea what features people would
> most like to see.

+1. Let's not complicate this.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 01:56:20
Message-ID: AANLkTinkJSbe0=3539B3e=+kdkz_3uqAWn92g67iH-kr@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 7:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Nov 16, 2010 at 5:57 PM, marcin mank <marcin(dot)mank(at)gmail(dot)com> wrote:
>>> Can (should ?) unlogged tables' contents survive graceful (non-crash) shutdown?
>
>> I don't think so.  To make that work, you'd need to keep track of
>> every backing file that might contain pages not fsync()'d to disk, and
>> at shutdown time you'd need to fsync() them all before shutting down.
>
> This is presuming that we want to guarantee the same level of safety for
> unlogged tables as for regular.  Which, it seems to me, is exactly what
> people *aren't* asking for.  Why not just write the data and shut down?
> If you're unlucky enough to have a system crash immediately after that,
> well, you might have corrupt data in the unlogged tables ... but that
> doesn't seem real probable.

I have a hard time getting excited about a system that is designed to
ensure that we probably don't have data corruption. The whole point
of this feature is to relax the usual data integrity guarantees in a
controlled way. A small but uncertain risk of corruption is not an
improvement over a simple, predictable behavior.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: David Fetter <david(at)fetter(dot)org>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 02:02:08
Message-ID: 20101117020208.GB15516@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 02:07:35PM -0800, David Fetter wrote:
> On Tue, Nov 16, 2010 at 02:00:33PM -0800, Josh Berkus wrote:
> > > Yeah, you'd have to allow a flag to control the behavior. And in
> > > that case I'd rather the flag have a single default rather than
> > > different defaults depending on whether or not individual tables
> > > were selected. Something like --omit-unlogged-data.
> >
> > Are you sure we don't want to default the other way?
>
> +1 for defaulting the other way.

Upon further reflection, I'm switching to the "default to backing up
unlogged tables" side.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 07:30:45
Message-ID: 4CE384A5.4080906@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17.11.2010 03:56, Robert Haas wrote:
> On Tue, Nov 16, 2010 at 7:46 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas<robertmhaas(at)gmail(dot)com> writes:
>>> On Tue, Nov 16, 2010 at 5:57 PM, marcin mank<marcin(dot)mank(at)gmail(dot)com> wrote:
>>>> Can (should ?) unlogged tables' contents survive graceful (non-crash) shutdown?
>>
>>> I don't think so. To make that work, you'd need to keep track of
>>> every backing file that might contain pages not fsync()'d to disk, and
>>> at shutdown time you'd need to fsync() them all before shutting down.
>>
>> This is presuming that we want to guarantee the same level of safety for
>> unlogged tables as for regular. Which, it seems to me, is exactly what
>> people *aren't* asking for. Why not just write the data and shut down?
>> If you're unlucky enough to have a system crash immediately after that,
>> well, you might have corrupt data in the unlogged tables ... but that
>> doesn't seem real probable.
>
> I have a hard time getting excited about a system that is designed to
> ensure that we probably don't have data corruption. The whole point
> of this feature is to relax the usual data integrity guarantees in a
> controlled way. A small but uncertain risk of corruption is not an
> improvement over a simple, predictable behavior.

I agree with Robert, the point of unlogged tables is that the system
knows to zap them away if there's any risk of having corruption in them.
A corrupt page can lead to all kinds of errors. We try to handle
corruption gracefully, but I wouldn't be surprised if you managed to
even get a segfault caused by a torn page if you're unlucky.

fsync()ing the file at shutdown doesn't seem too bad to me from
performance point of view, we tolerate that for all other tables. And
you can always truncate the table yourself before shutdown.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 15:11:51
Message-ID: 26245.1290006711@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> fsync()ing the file at shutdown doesn't seem too bad to me from
> performance point of view, we tolerate that for all other tables. And
> you can always truncate the table yourself before shutdown.

The objection to that was not about performance. It was about how
to find out what needs to be fsync'd.

regards, tom lane


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 16:00:46
Message-ID: AANLkTinq_pj3aL04oLTe-nx-jEw0mUqkeLOtCZv7creL@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 3:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> fsync()ing the file at shutdown doesn't seem too bad to me from
>> performance point of view, we tolerate that for all other tables. And
>> you can always truncate the table yourself before shutdown.
>
> The objection to that was not about performance.  It was about how
> to find out what needs to be fsync'd.
>

Just a crazy brainstorming thought, but....

If this is a clean shutdown then all the non-unlogged tables have been
checkpointed so they should have no dirty pages in them anyways. So we
could just fsync everything.

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <gsstark(at)mit(dot)edu>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "marcin mank" <marcin(dot)mank(at)gmail(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Andy Colson" <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 16:48:36
Message-ID: 4CE3B30402000025000378BA@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> wrote:

> If this is a clean shutdown then all the non-unlogged tables have
> been checkpointed so they should have no dirty pages in them
> anyways. So we could just fsync everything.

Or just all the unlogged tables.

-Kevin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 17:14:47
Message-ID: AANLkTinWYVA2SGJx4amB6PoOLJn=Y6aSYAi3bxzsntim@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 11:00 AM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> On Wed, Nov 17, 2010 at 3:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>>> fsync()ing the file at shutdown doesn't seem too bad to me from
>>> performance point of view, we tolerate that for all other tables. And
>>> you can always truncate the table yourself before shutdown.
>>
>> The objection to that was not about performance.  It was about how
>> to find out what needs to be fsync'd.
>
> Just a crazy brainstorming thought, but....
>
> If this is a clean shutdown then all the non-unlogged tables have been
> checkpointed so they should have no dirty pages in them anyways. So we
> could just fsync everything.

Hmm, that reminds me: checkpoints should really skip writing buffers
belonging to unlogged relations altogether; and any fsync against an
unlogged relation should be skipped. I need to go take a look at
what's required to make that happen, either as part of this patch or
as a follow-on commit.

It might be interesting to have a kind of semi-unlogged table where we
write a special xlog record for the first access after each checkpoint
but otherwise don't xlog. On redo, we truncate the tables mentioned,
but not any others, since they're presumably OK. But that's not what
I'm trying to design here. I'm trying optimize it for the case where
you DON'T care about durability and you just want it to be as fast as
possible.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 18:11:23
Message-ID: 4CE41ACB.5040605@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert, All:

I hope you're following the thread on -general about this feature.
We're getting a lot of feedback.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 18:32:27
Message-ID: 4CE41FBB.6000006@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17.11.2010 17:11, Tom Lane wrote:
> Heikki Linnakangas<heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> fsync()ing the file at shutdown doesn't seem too bad to me from
>> performance point of view, we tolerate that for all other tables. And
>> you can always truncate the table yourself before shutdown.
>
> The objection to that was not about performance. It was about how
> to find out what needs to be fsync'd.

I must be missing something: we handle that just fine with normal
tables, why is it a problem for unlogged tables?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 18:46:46
Message-ID: 16060.1290019606@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> On 17.11.2010 17:11, Tom Lane wrote:
>> The objection to that was not about performance. It was about how
>> to find out what needs to be fsync'd.

> I must be missing something: we handle that just fine with normal
> tables, why is it a problem for unlogged tables?

Hmm ... that's a good point. If we simply treat unlogged tables the
same as regular for checkpointing purposes, don't we end up having
flushed them all correctly during a shutdown checkpoint? I was thinking
that WAL-logging had some influence on that logic, but it doesn't.

Robert is probably going to object that he wanted to prevent any
fsyncing for unlogged tables, but the discussion over in pgsql-general
is crystal clear that people do NOT want to lose unlogged data over
a clean shutdown and restart. If all it takes to do that is to refrain
from lobotomizing the checkpoint logic for unlogged tables, I say we
should refrain.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 18:48:56
Message-ID: AANLkTi=2JvqR-2gKiosk6oPWCyR_j8EWGfkCb4+E0GcX@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 1:11 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Robert, All:
>
> I hope you're following the thread on -general about this feature.
> We're getting a lot of feedback.

I haven't been; I'm not subscribed to general; it'd be useful to CC me
next time.

Reading through the thread in the archives, it seems like people are
mostly confused. Some are confused about the current behavior of the
patch (no, it really does always truncate your tables, I swear);
others are confused about how WAL logging works (of course a backend
crash doesn't truncate an ordinary table - that's because it's WAL
LOGGED); and still others are maybe not exactly confused but hoping
that unlogged table = MyISAM (try not to corrupt your data, but don't
get too bent out of shape about the possibility that it may get
corrupted anyway).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 18:55:32
Message-ID: AANLkTinewD4wJwA2yh=MLXvE_3c5fkbVx6qrBR+EjTNf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> On 17.11.2010 17:11, Tom Lane wrote:
>>> The objection to that was not about performance.  It was about how
>>> to find out what needs to be fsync'd.
>
>> I must be missing something: we handle that just fine with normal
>> tables, why is it a problem for unlogged tables?
>
> Hmm ... that's a good point.  If we simply treat unlogged tables the
> same as regular for checkpointing purposes, don't we end up having
> flushed them all correctly during a shutdown checkpoint?  I was thinking
> that WAL-logging had some influence on that logic, but it doesn't.
>
> Robert is probably going to object that he wanted to prevent any
> fsyncing for unlogged tables, but the discussion over in pgsql-general
> is crystal clear that people do NOT want to lose unlogged data over
> a clean shutdown and restart.  If all it takes to do that is to refrain
> from lobotomizing the checkpoint logic for unlogged tables, I say we
> should refrain.

I think that's absolutely a bad idea. I seriously do not want to have
a conversation with someone about why their unlogged tables are
exacerbating their checkpoint I/O spikes. I'd be happy to have two
modes, though. We should probably revisit the syntax, though. One,
it seems that CREATE UNLOGGED TABLE is not as clear as I thought it
was. Two, when (not if) we add more durability levels, we don't want
to create keywords for all of them.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 19:16:06
Message-ID: 17317.1290021366@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert is probably going to object that he wanted to prevent any
>> fsyncing for unlogged tables, but the discussion over in pgsql-general
>> is crystal clear that people do NOT want to lose unlogged data over
>> a clean shutdown and restart. If all it takes to do that is to refrain
>> from lobotomizing the checkpoint logic for unlogged tables, I say we
>> should refrain.

> I think that's absolutely a bad idea.

The customer is always right, and I think we are hearing loud and clear
what the customers want. Please let's not go out of our way to create
a feature that isn't what they want.

regards, tom lane


From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 19:22:04
Message-ID: 20101117192204.GX14016@aart.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 02:16:06PM -0500, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Robert is probably going to object that he wanted to prevent any
> >> fsyncing for unlogged tables, but the discussion over in pgsql-general
> >> is crystal clear that people do NOT want to lose unlogged data over
> >> a clean shutdown and restart. If all it takes to do that is to refrain
> >> from lobotomizing the checkpoint logic for unlogged tables, I say we
> >> should refrain.
>
> > I think that's absolutely a bad idea.
>
> The customer is always right, and I think we are hearing loud and clear
> what the customers want. Please let's not go out of our way to create
> a feature that isn't what they want.
>
> regards, tom lane
>

I would be fine with only having a safe shutdown with unlogged tables
and skip the checkpoint I/O all other times.

Cheers,
Ken


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 19:31:26
Message-ID: 4CE42D8E.7090006@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/17/2010 02:22 PM, Kenneth Marshall wrote:
> On Wed, Nov 17, 2010 at 02:16:06PM -0500, Tom Lane wrote:
>> Robert Haas<robertmhaas(at)gmail(dot)com> writes:
>>> On Wed, Nov 17, 2010 at 1:46 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Robert is probably going to object that he wanted to prevent any
>>>> fsyncing for unlogged tables, but the discussion over in pgsql-general
>>>> is crystal clear that people do NOT want to lose unlogged data over
>>>> a clean shutdown and restart. �If all it takes to do that is to refrain
>>>> from lobotomizing the checkpoint logic for unlogged tables, I say we
>>>> should refrain.
>>> I think that's absolutely a bad idea.
>> The customer is always right, and I think we are hearing loud and clear
>> what the customers want. Please let's not go out of our way to create
>> a feature that isn't what they want.
> I would be fine with only having a safe shutdown with unlogged tables
> and skip the checkpoint I/O all other times.

Yeah, I was just thinking something like that would be good, and should
overcome Robert's objection to the whole idea.

I also agree with Tom's sentiment above.

To answer another point I see Tom made on the -general list: while
individual backends may crash from time to time, crashes of the whole
Postgres server are very rare in my experience in production
environments. It's really pretty robust, unless you're doing crazy
stuff. So that makes it all the more important that we can restart a
server cleanly (say, to change a config setting) without losing the
unlogged tables. If we don't allow that we'll make a laughing stock of
ourselves. Honestly.

cheers

andrew


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables
Date: 2010-11-17 19:32:22
Message-ID: 1290021671-sup-7956@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Robert Haas's message of mié nov 17 15:48:56 -0300 2010:
> On Wed, Nov 17, 2010 at 1:11 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> > Robert, All:
> >
> > I hope you're following the thread on -general about this feature.
> > We're getting a lot of feedback.
>
> I haven't been; I'm not subscribed to general; it'd be useful to CC me
> next time.

FWIW I've figured that being subscribed to the lists is good even if I
have my mail client configured to hide these emails by default. It's a
lot easier for searching stuff that someone else references.

(I made the mistake of having it hide all pg-general email even though I
was CC'ed, though, which is the trivial way to implement this. I don't
recommend repeating this mistake.)

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Kenneth Marshall <ktm(at)rice(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 19:37:30
Message-ID: 18486.1290022650@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 11/17/2010 02:22 PM, Kenneth Marshall wrote:
>> I would be fine with only having a safe shutdown with unlogged tables
>> and skip the checkpoint I/O all other times.

> Yeah, I was just thinking something like that would be good, and should
> overcome Robert's objection to the whole idea.

I don't think you can fsync only in the shutdown checkpoint and assume
your data is safe, if you didn't fsync a write a few moments earlier.

Now, a few minutes ago Robert was muttering about supporting more than
one kind of degraded-reliability table. I could see inventing
"unlogged" tables, which means exactly that (no xlog support, but we
still checkpoint/fsync as usual), and "unsynced" tables which
also/instead suppress fsync activity. The former type could be assumed
to survive a clean shutdown/restart, while the latter wouldn't. This
would let people pick their poison.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Kenneth Marshall <ktm(at)rice(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 19:42:43
Message-ID: 4CE43033.2080008@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> Now, a few minutes ago Robert was muttering about supporting more than
> one kind of degraded-reliability table. I could see inventing
> "unlogged" tables, which means exactly that (no xlog support, but we
> still checkpoint/fsync as usual), and "unsynced" tables which
> also/instead suppress fsync activity. The former type could be assumed
> to survive a clean shutdown/restart, while the latter wouldn't. This
> would let people pick their poison.

We're assuming here that the checkpoint activity for the unlogged table
causes significant load on a production system. Maybe we should do some
testing before we try to make this overly complex? I wouldn't be
surprised to find that on most filesystems the extra checkpointing of
the unlogged tables adds only small minority overhead.

Shouldn't be hard to build out pgbench into something which will test
this ... if only I had a suitable test machine available.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Kenneth Marshall <ktm(at)rice(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 19:44:27
Message-ID: 18633.1290023067@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

[ forgot to comment on this part ]

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> To answer another point I see Tom made on the -general list: while
> individual backends may crash from time to time, crashes of the whole
> Postgres server are very rare in my experience in production
> environments.

Well, if you mean the postmaster darn near never goes down, that's true,
because we go out of our way to ensure it does as little as possible.
But that has got zip to do with this discussion, because a backend crash
has to be assumed to have corrupted unlogged tables. There are some
folk over in -general who are wishfully thinking that only a postmaster
crash would lose their unlogged data, but that's simply wrong. Backend
crashes *will* truncate those tables; there is no way around that. The
comment I made was that my experience as to how often backends crash
might not square with production experience --- but you do have to draw
the distinction between a backend crash and a postmaster crash.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kenneth Marshall <ktm(at)rice(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 19:53:10
Message-ID: 4CE432A6.9020508@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/17/2010 02:44 PM, Tom Lane wrote:
> [ forgot to comment on this part ]
>
> Andrew Dunstan<andrew(at)dunslane(dot)net> writes:
>> To answer another point I see Tom made on the -general list: while
>> individual backends may crash from time to time, crashes of the whole
>> Postgres server are very rare in my experience in production
>> environments.
> Well, if you mean the postmaster darn near never goes down, that's true,
> because we go out of our way to ensure it does as little as possible.
> But that has got zip to do with this discussion, because a backend crash
> has to be assumed to have corrupted unlogged tables. There are some
> folk over in -general who are wishfully thinking that only a postmaster
> crash would lose their unlogged data, but that's simply wrong. Backend
> crashes *will* truncate those tables; there is no way around that. The
> comment I made was that my experience as to how often backends crash
> might not square with production experience --- but you do have to draw
> the distinction between a backend crash and a postmaster crash.

OK. I'd missed that. Thanks for clarifying.

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Kenneth Marshall <ktm(at)rice(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 19:54:14
Message-ID: AANLkTi=tdk9AN0GmYBp7wURB-nNMzuQCq_C8p0_Q6TpZ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 2:31 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> The customer is always right, and I think we are hearing loud and clear
>>> what the customers want.  Please let's not go out of our way to create
>>> a feature that isn't what they want.
>>
>> I would be fine with only having a safe shutdown with unlogged tables
>> and skip the checkpoint I/O all other times.
>
> Yeah, I was just thinking something like that would be good, and should
> overcome Robert's objection to the whole idea.

Could we slow down here a bit and talk through the ideas here in a
logical fashion?

The customer is always right, but the informed customer makes better
decisions than the uninformed customer. This idea, as proposed, does
not work. If you only include dirty buffers at the final checkpoint
before shutting down, you have no guarantee that any buffers that you
either didn't write or didn't fsync previously are actually on disk.
Therefore, you have no guarantee that the table data is not corrupted.
So you really have to decide between including the unlogged-table
buffers in EVERY checkpoint and not ever including them at all. Which
one is right depends on your use case.

For example, consider the poster who said that, when this feature is
available, they plan to try ripping out their memcached instance and
replacing it with PostgreSQL running unlogged tables. Suppose this
poster (or someone else in a similar situation) has a 64 GB and is
currently running a 60 GB memcached instance on it, which is not an
unrealistic scenario for memcached. Suppose further that he dirties
25% of that data each hour. memcached is currently doing no writes to
disk. When he switches to PostgreSQL and sets checkpoints_segments to
a gazillion and checkpoint_timeout to the maximum, he's going to start
writing 15 GB of data to disk every hour - data which he clearly
doesn't care about losing, or preserving across restarts, because he's
currently storing it in memcached. In fact, with memcached, he'll not
only lose data at shutdown - he'll lose data on a regular basis when
everything is running normally. We can try to convince ourselves that
someone in this situation will not care about needing to get 15GB of
disposable data per hour from memory to disk in order to have a
feature that he doesn't need, but I think it's going to be pretty hard
to make that credible.

Now, second use case. Consider someone who is currently running
PostgreSQL in a non-durable configuration, with fsync=off,
full_page_writes=off, and synchronous_commit=off. This person - who
is based on someone I spoke with at PG West - is doing a large amount
of data processing using PostGIS. Their typical workflow is to load a
bunch of data, run a simulation, and then throw away the entire
database. They don't want to pay the cost of durability because if
they crash in mid-simulation they will simply rerun it. Being fast is
more important. Whether or not this person will be happy with the
proposed behavior is a bit harder to say. If it kills performance,
they will definitely hate it. But if the performance penalty is only
modest, they may enjoy the convenience of being able to shut down the
database and start it up again later without losing data.

Third use case. Someone on pgsql-general mentioned that they want to
write logs to PG, and can abide losing them if a crash happens, but
not on a clean shutdown and restart. This person clearly shuts down
their production database a lot more often than I do, but that is OK.
By explicit stipulation, they want the survive-a-clean-shutdown
behavior. I have no problem supporting that use case, providing they
are willing to take the associated performance penalty at checkpoint
time, which we don't know because we haven't asked, but I'm fine with
assuming it's useful even though I probably wouldn't use it much
myself.

> I also agree with Tom's sentiment above.
>
> To answer another point I see Tom made on the -general list: while
> individual backends may crash from time to time, crashes of the whole
> Postgres server are very rare in my experience in production environments.
> It's really pretty robust, unless you're doing crazy stuff. So that makes it
> all the more important that we can restart a server cleanly (say, to change
> a config setting) without losing the unlogged tables. If we don't allow that
> we'll make a laughing stock of ourselves. Honestly.

Let's please not assume that there is only one reasonable option here,
or that I have not thought about some of these issues.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Kenneth Marshall <ktm(at)rice(dot)edu>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 20:09:04
Message-ID: AANLkTinW+aQm46MVtaQPz5WQ90G0EE2FJE9oRk-bLMJV@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 2:42 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>
>> Now, a few minutes ago Robert was muttering about supporting more than
>> one kind of degraded-reliability table.  I could see inventing
>> "unlogged" tables, which means exactly that (no xlog support, but we
>> still checkpoint/fsync as usual), and "unsynced" tables which
>> also/instead suppress fsync activity.  The former type could be assumed
>> to survive a clean shutdown/restart, while the latter wouldn't.  This
>> would let people pick their poison.
>
> We're assuming here that the checkpoint activity for the unlogged table
> causes significant load on a production system.  Maybe we should do some
> testing before we try to make this overly complex?  I wouldn't be
> surprised to find that on most filesystems the extra checkpointing of
> the unlogged tables adds only small minority overhead.
>
> Shouldn't be hard to build out pgbench into something which will test
> this ... if only I had a suitable test machine available.

I guess the point I'd make here is that checkpoint I/O will be a
problem for unlogged tables in exactly the same situations in which it
is a problem for regular tables. There is some amount of I/O that
your system can handle before the additional I/O caused by checkpoints
starts to become a problem. If unlogged tables (or one particular
variant of unlogged tables) don't need to participate in checkpoints,
then you will be able to use unlogged tables, in situations where they
are appropriate to the workload, to control your I/O load and
hopefully keep it below the level where it causes a problem. Of
course, there will also be workloads where your system has plenty of
spare capacity (in which case it won't matter) or where your system is
going to be overwhelmed anyway (in which case it doesn't really matter
either). But if you are somewhere between those two extremes, this
has to matter.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Kenneth Marshall <ktm(at)rice(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 20:35:05
Message-ID: 201011172135.06272.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday 17 November 2010 20:54:14 Robert Haas wrote:
> On Wed, Nov 17, 2010 at 2:31 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >>> The customer is always right, and I think we are hearing loud and clear
> >>> what the customers want. Please let's not go out of our way to create
> >>> a feature that isn't what they want.
> >>
> >> I would be fine with only having a safe shutdown with unlogged tables
> >> and skip the checkpoint I/O all other times.
> >
> > Yeah, I was just thinking something like that would be good, and should
> > overcome Robert's objection to the whole idea.
>
> Could we slow down here a bit and talk through the ideas here in a
> logical fashion?
>
> The customer is always right, but the informed customer makes better
> decisions than the uninformed customer. This idea, as proposed, does
> not work. If you only include dirty buffers at the final checkpoint
> before shutting down, you have no guarantee that any buffers that you
> either didn't write or didn't fsync previously are actually on disk.
> Therefore, you have no guarantee that the table data is not corrupted.
> So you really have to decide between including the unlogged-table
> buffers in EVERY checkpoint and not ever including them at all. Which
> one is right depends on your use case.
How can you get a buffer which was no written out *at all*? Do you want to
force all such pages to stay in shared_buffers? That sounds quite a bit more
complicated than what you proposed...

> For example, consider the poster who said that, when this feature is
> available, they plan to try ripping out their memcached instance and
> replacing it with PostgreSQL running unlogged tables. Suppose this
> poster (or someone else in a similar situation) has a 64 GB and is
> currently running a 60 GB memcached instance on it, which is not an
> unrealistic scenario for memcached. Suppose further that he dirties
> 25% of that data each hour. memcached is currently doing no writes to
> disk. When he switches to PostgreSQL and sets checkpoints_segments to
> a gazillion and checkpoint_timeout to the maximum, he's going to start
> writing 15 GB of data to disk every hour - data which he clearly
> doesn't care about losing, or preserving across restarts, because he's
> currently storing it in memcached. In fact, with memcached, he'll not
> only lose data at shutdown - he'll lose data on a regular basis when
> everything is running normally. We can try to convince ourselves that
> someone in this situation will not care about needing to get 15GB of
> disposable data per hour from memory to disk in order to have a
> feature that he doesn't need, but I think it's going to be pretty hard
> to make that credible.
To really support that use case we would first need to make shared_buffers
properly scale to 64GB - which unfortunatley, in my experience, is not yet the
case.
Also, see the issues in the former paragraph - I have severe doubts you can
support such a memcached scenario by pg. Either you spill to disk if your
buffers overflow (fine with me) or you need to throw away data memcached alike. I
doubt there is a sensible implementation in pg for the latter.

So you will have to write to disk at some point...

> Third use case. Someone on pgsql-general mentioned that they want to
> write logs to PG, and can abide losing them if a crash happens, but
> not on a clean shutdown and restart. This person clearly shuts down
> their production database a lot more often than I do, but that is OK.
> By explicit stipulation, they want the survive-a-clean-shutdown
> behavior. I have no problem supporting that use case, providing they
> are willing to take the associated performance penalty at checkpoint
> time, which we don't know because we haven't asked, but I'm fine with
> assuming it's useful even though I probably wouldn't use it much
> myself.
Maybe I am missing something - but why does this imply we have to write data
at checkpoints?
Just fsyncing every file belonging to an persistently-unlogged (or whatever
sensible name anyone can come up) table is not prohibively expensive - in fact
doing that on a local $PGDATA with approx 300GB and loads of tables doing so
takes less than 15s on a system with hot inode/dentry cache and no dirty files.
(just `find $PGDATA -print0|xargs -0 fsync_many_files` with fsync_many_files
beeing a tiny c program doing posix_fadvise(POSIX_FADV_DONTNEED) on all files
and then fsyncs every one).
The assumption of a hot inode cache is realistic I think.

Andres


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Kenneth Marshall <ktm(at)rice(dot)edu>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 20:37:24
Message-ID: AANLkTimYSB8nGpw0zkYZzuh5RSBaOWH0BMeU8p_g71KF@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 2:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 11/17/2010 02:22 PM, Kenneth Marshall wrote:
>>> I would be fine with only having a safe shutdown with unlogged tables
>>> and skip the checkpoint I/O all other times.
>
>> Yeah, I was just thinking something like that would be good, and should
>> overcome Robert's objection to the whole idea.
>
> I don't think you can fsync only in the shutdown checkpoint and assume
> your data is safe, if you didn't fsync a write a few moments earlier.
>
> Now, a few minutes ago Robert was muttering about supporting more than
> one kind of degraded-reliability table.  I could see inventing
> "unlogged" tables, which means exactly that (no xlog support, but we
> still checkpoint/fsync as usual), and "unsynced" tables which
> also/instead suppress fsync activity.  The former type could be assumed
> to survive a clean shutdown/restart, while the latter wouldn't.  This
> would let people pick their poison.

OK, so we're proposing a hierarchy like this.

1. PERMANENT (already exists). Permanent tables are WAL-logged,
participate in checkpoints, and are fsync'd. They survive crashes and
clean restarts, and are replicated.

2. UNLOGGED (what this patch currently implements). Unlogged tables
are not WAL-logged, but they do participate in checkpoints and they
are fsync'd on request. They survive clean restarts, but on a crash
they are truncated. They are not replicated.

3. UNSYNCED (future work). Unsynced tables are not WAL-logged, do not
participate in checkpoints, and are never fsync'd. After any sort of
crash or shutdown, clean or otherwise, they are truncated. They are
not replicated.

4. GLOBAL TEMPORARY (future work). Global temporary tables are not
WAL-logged, do not participate in checkpoints, and are never fsync'd.
The contents of each global temporary table are private to that
session, so that they can use the local buffer manager rather than
shared buffers. Multiple sessions can use a global temporary table at
the same time, and each sees separate contents. At session exit, any
contents inserted by the owning backend are lost; since all sessions
exit on crash or shutdown, all contents are also lost at that time.

5. LOCAL TEMPORARY (our current temp tables). Local temporary tables
are not WAL-logged, do not participate in checkpoints, and are never
fsync'd. The table definition and all of its contents are private to
the session, so that they are dropped at session exit (or at
transaction end if ON COMMIT DROP is used). Since all sessions exit
on crash or shutdown, all table definitions and all table contents are
lost at that time.

It's possible to imagine a few more stops on this hierarchy. For
example, you could have an ASYNCHRONOUS table between (1) and (2) that
always acts as if synchronous_commit=off, but is otherwise replicated
and durable over crashes; or a MINIMALLY LOGGED table that is XLOG'd
as if wal_level=minimal even when the actual value of wal_level is
otherwise, and is therefore crash-safe but not replication-safe; or a
level that is similar to unlogged but we XLOG the first event that
dirties a page after each checkpoint, and therefore even on a crash we
need only remove the tables for which such an XLOG record has been
written. All of those are a bit speculative perhaps but we could jam
them in there if there's demand, I suppose.

I don't particularly care for the name UNSYNCED, and I'm starting not
to like UNLOGGED much either, although at least that one is an actual
word. PERMANENT and the flavors of TEMPORARY are a reasonably
comprehensible as a description of user-visible behavior, but UNLOGGED
and UNSYNCED sounds a lot like they're discussing internal details
that the user might not actually understand or care about. I don't
have a better idea right off the top of my head, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kenneth Marshall <ktm(at)rice(dot)edu>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 20:48:52
Message-ID: 4CE43FB4.9070701@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/17/2010 03:37 PM, Robert Haas wrote:
> I don't particularly care for the name UNSYNCED, and I'm starting not
> to like UNLOGGED much either, although at least that one is an actual
> word. PERMANENT and the flavors of TEMPORARY are a reasonably
> comprehensible as a description of user-visible behavior, but UNLOGGED
> and UNSYNCED sounds a lot like they're discussing internal details
> that the user might not actually understand or care about. I don't
> have a better idea right off the top of my head, though.

Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED.

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, Kenneth Marshall <ktm(at)rice(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 20:51:37
Message-ID: AANLkTi=iRNbnJ1ZaKkN4b_KY7Oh-eq-oEaV4-hSGJ_MN@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 3:35 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> The customer is always right, but the informed customer makes better
>> decisions than the uninformed customer.  This idea, as proposed, does
>> not work.  If you only include dirty buffers at the final checkpoint
>> before shutting down, you have no guarantee that any buffers that you
>> either didn't write or didn't fsync previously are actually on disk.
>> Therefore, you have no guarantee that the table data is not corrupted.
>>  So you really have to decide between including the unlogged-table
>> buffers in EVERY checkpoint and not ever including them at all.  Which
>> one is right depends on your use case.
> How can you get a buffer which was no written out *at all*? Do you want to
> force all such pages to stay in shared_buffers? That sounds quite a bit more
> complicated than what you proposed...

Oh, you're right. We always have to write buffers before kicking them
out of shared_buffers, but if we don't fsync them we have no guarantee
they're actually on disk.

>> For example, consider the poster who said that, when this feature is
>> available, they plan to try ripping out their memcached instance and
>> replacing it with PostgreSQL running unlogged tables.  Suppose this
>> poster (or someone else in a similar situation) has a 64 GB and is
>> currently running a 60 GB memcached instance on it, which is not an
>> unrealistic scenario for memcached.  Suppose further that he dirties
>> 25% of that data each hour.  memcached is currently doing no writes to
>> disk.  When he switches to PostgreSQL and sets checkpoints_segments to
>> a gazillion and checkpoint_timeout to the maximum, he's going to start
>> writing 15 GB of data to disk every hour - data which he clearly
>> doesn't care about losing, or preserving across restarts, because he's
>> currently storing it in memcached.  In fact, with memcached, he'll not
>> only lose data at shutdown - he'll lose data on a regular basis when
>> everything is running normally.  We can try to convince ourselves that
>> someone in this situation will not care about needing to get 15GB of
>> disposable data per hour from memory to disk in order to have a
>> feature that he doesn't need, but I think it's going to be pretty hard
>> to make that credible.
> To really support that use case we would first need to make shared_buffers
> properly scale to 64GB - which unfortunatley, in my experience, is not yet the
> case.

Well, that's something to aspire to. :-)

> Also, see the issues in the former paragraph - I have severe doubts you can
> support such a memcached scenario by pg. Either you spill to disk if your
> buffers overflow (fine with me) or you need to throw away data memcached alike. I
> doubt there is a sensible implementation in pg for the latter.
>
> So you will have to write to disk at some point...

I agree that there are difficulties, but again, doing checkpoint I/O
for data that the user was willing to throw away is going in the wrong
direction.

>> Third use case.  Someone on pgsql-general mentioned that they want to
>> write logs to PG, and can abide losing them if a crash happens, but
>> not on a clean shutdown and restart.  This person clearly shuts down
>> their production database a lot more often than I do, but that is OK.
>> By explicit stipulation, they want the survive-a-clean-shutdown
>> behavior.  I have no problem supporting that use case, providing they
>> are willing to take the associated performance penalty at checkpoint
>> time, which we don't know because we haven't asked, but I'm fine with
>> assuming it's useful even though I probably wouldn't use it much
>> myself.
> Maybe I am missing something - but why does this imply we have to write data
> at checkpoints?
> Just fsyncing every file belonging to an persistently-unlogged (or whatever
> sensible name anyone can come up) table is not prohibively expensive - in fact
> doing that on a local $PGDATA with approx 300GB and loads of tables doing so
> takes less than 15s on a system with hot inode/dentry cache and no dirty files.
> (just `find $PGDATA -print0|xargs -0 fsync_many_files` with fsync_many_files
> beeing a tiny c program doing posix_fadvise(POSIX_FADV_DONTNEED) on all files
> and then fsyncs every one).
> The assumption of a hot inode cache is realistic I think.

Hmm. I don't really want to try to do it in this patch because it's
complicated enough already, but if people don't mind the shutdown
sequence potentially being slowed down a bit, that might allow us to
have the best of both worlds without needing to invent multiple
durability levels. I was sort of assuming that people wouldn't want
to slow down the shutdown sequence to avoid losing data they've
already declared isn't that valuable, but evidently I underestimated
the demand for kinda-durable tables. If the overhead of doing this
isn't too severe, it might be the way to go.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: David Fetter <david(at)fetter(dot)org>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kenneth Marshall <ktm(at)rice(dot)edu>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 20:58:46
Message-ID: 20101117205846.GA13723@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 03:48:52PM -0500, Andrew Dunstan wrote:
> On 11/17/2010 03:37 PM, Robert Haas wrote:
> >I don't particularly care for the name UNSYNCED, and I'm starting
> >not to like UNLOGGED much either, although at least that one is an
> >actual word. PERMANENT and the flavors of TEMPORARY are a
> >reasonably comprehensible as a description of user-visible
> >behavior, but UNLOGGED and UNSYNCED sounds a lot like they're
> >discussing internal details that the user might not actually
> >understand or care about. I don't have a better idea right off the
> >top of my head, though.
>
> Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED.

+1 for describing the end-user-visible behavior.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kenneth Marshall <ktm(at)rice(dot)edu>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 20:59:18
Message-ID: 4CE44226.8060100@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/17/2010 12:48 PM, Andrew Dunstan wrote:
>
> Maybe VOLATILE for UNSYNCED? Not sure about UNLOGGED.
>
UNSAFE and EXTREMELY_UNSAFE?? :)

Cheers,
Steve


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "marcin mank" <marcin(dot)mank(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>, "Kenneth Marshall" <ktm(at)rice(dot)edu>, "Andy Colson" <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 21:00:12
Message-ID: 4CE3EDFC0200002500037927@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> OK, so we're proposing a hierarchy like this.
>
> 1. PERMANENT (already exists).

> 2. UNLOGGED (what this patch currently implements).

> 3. UNSYNCED (future work).

> 4. GLOBAL TEMPORARY (future work).

> 5. LOCAL TEMPORARY (our current temp tables).

All of the above would have real uses in our shop.

> It's possible to imagine a few more stops on this hierarchy.

Some of these might be slightly preferred over the above in certain
circumstances, but that's getting down to fine tuning. I think the
five listed above are more important than the "speculative ones
mentioned.

> I don't particularly care for the name UNSYNCED

EVANESCENT?

> I'm starting not to like UNLOGGED much either

EPHEMERAL?

Actually, the UNSYNCED and UNLOGGED seem fairly clear....

-Kevin


From: "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
To: pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: unlogged tables
Date: 2010-11-17 21:02:03
Message-ID: 2AC33E77-3A2C-475E-8B26-237B436E903A@themactionfaction.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Nov 17, 2010, at 4:00 PM, Kevin Grittner wrote:

> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> OK, so we're proposing a hierarchy like this.
>>
>> 1. PERMANENT (already exists).
>
>> 2. UNLOGGED (what this patch currently implements).
>
>> 3. UNSYNCED (future work).
>
>> 4. GLOBAL TEMPORARY (future work).
>
>> 5. LOCAL TEMPORARY (our current temp tables).
>
> All of the above would have real uses in our shop.
>
>> It's possible to imagine a few more stops on this hierarchy.
>
> Some of these might be slightly preferred over the above in certain
> circumstances, but that's getting down to fine tuning. I think the
> five listed above are more important than the "speculative ones
> mentioned.
>
>> I don't particularly care for the name UNSYNCED
>
> EVANESCENT?
>
>> I'm starting not to like UNLOGGED much either
>
> EPHEMERAL?
>
> Actually, the UNSYNCED and UNLOGGED seem fairly clear....

Unless one thinks that the types could be combined- perhaps a table declaration could use both UNLOGGED and UNSYNCED?

Cheers,
M


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Kenneth Marshall <ktm(at)rice(dot)edu>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 21:05:56
Message-ID: AANLkTikj0ykf=EJEGCsOxvyt014LmMMpHxc205BDfuNU@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 4:00 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> OK, so we're proposing a hierarchy like this.
>>
>> 1. PERMANENT (already exists).
>
>> 2. UNLOGGED (what this patch currently implements).
>
>> 3. UNSYNCED (future work).
>
>> 4. GLOBAL TEMPORARY (future work).
>
>> 5. LOCAL TEMPORARY (our current temp tables).
>
> All of the above would have real uses in our shop.
>
>> It's possible to imagine a few more stops on this hierarchy.
>
> Some of these might be slightly preferred over the above in certain
> circumstances, but that's getting down to fine tuning.  I think the
> five listed above are more important than the "speculative ones
> mentioned.
>
>> I don't particularly care for the name UNSYNCED
>
> EVANESCENT?
>
>> I'm starting not to like UNLOGGED much either
>
> EPHEMERAL?
>
> Actually, the UNSYNCED and UNLOGGED seem fairly clear....

I think Andrew's suggestion of VOLATILE is pretty good. It's hard to
come up with multiple words that express gradations of "we might
decide to chuck your data if things go South", though. Then again if
we go with Andres's suggestion maybe we can get by with one level.

Or if we still end up with multiple levels, maybe it's best to use
VOLATILE for everything >1 and <4, and then have a subordinate clause
to specify gradations.

CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS
THAT EXPLAIN THE DETAILS GO HERE;

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Kenneth Marshall <ktm(at)rice(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 21:06:23
Message-ID: 1290027668-sup-1551@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Robert Haas's message of mié nov 17 17:51:37 -0300 2010:
> On Wed, Nov 17, 2010 at 3:35 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:

> > How can you get a buffer which was no written out *at all*? Do you want to
> > force all such pages to stay in shared_buffers? That sounds quite a bit more
> > complicated than what you proposed...
>
> Oh, you're right. We always have to write buffers before kicking them
> out of shared_buffers, but if we don't fsync them we have no guarantee
> they're actually on disk.

You could just open all the segments and fsync them.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Kenneth Marshall <ktm(at)rice(dot)edu>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 21:06:34
Message-ID: 4CE443DA.90604@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/17/2010 04:00 PM, Kevin Grittner wrote:
> Actually, the UNSYNCED and UNLOGGED seem fairly clear....

I think Robert's right. These names won't convey much to someone not
steeped in our technology.

cheers

andrew


From: David Fetter <david(at)fetter(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Kenneth Marshall <ktm(at)rice(dot)edu>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 21:12:15
Message-ID: 20101117211215.GB13723@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 04:05:56PM -0500, Robert Haas wrote:
> On Wed, Nov 17, 2010 at 4:00 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> > Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> >> OK, so we're proposing a hierarchy like this.
> >>
> >> 1. PERMANENT (already exists).
> >
> >> 2. UNLOGGED (what this patch currently implements).
> >
> >> 3. UNSYNCED (future work).
> >
> >> 4. GLOBAL TEMPORARY (future work).
> >
> >> 5. LOCAL TEMPORARY (our current temp tables).
> >
> > All of the above would have real uses in our shop.
> >
> >> It's possible to imagine a few more stops on this hierarchy.
> >
> > Some of these might be slightly preferred over the above in certain
> > circumstances, but that's getting down to fine tuning.  I think the
> > five listed above are more important than the "speculative ones
> > mentioned.
> >
> >> I don't particularly care for the name UNSYNCED
> >
> > EVANESCENT?
> >
> >> I'm starting not to like UNLOGGED much either
> >
> > EPHEMERAL?
> >
> > Actually, the UNSYNCED and UNLOGGED seem fairly clear....
>
> I think Andrew's suggestion of VOLATILE is pretty good. It's hard to
> come up with multiple words that express gradations of "we might
> decide to chuck your data if things go South", though. Then again if
> we go with Andres's suggestion maybe we can get by with one level.
>
> Or if we still end up with multiple levels, maybe it's best to use
> VOLATILE for everything >1 and <4, and then have a subordinate clause
> to specify gradations.
>
> CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS
> THAT EXPLAIN THE DETAILS GO HERE;

How about something like:

OPTIONS (SYNC=no, LOG=no, ... )

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <kevin(dot)grittner(at)wicourts(dot)gov>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kenneth Marshall <ktm(at)rice(dot)edu>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 21:17:15
Message-ID: 1290028622-sup-6924@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Robert Haas's message of mié nov 17 18:05:56 -0300 2010:

> CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS
> THAT EXPLAIN THE DETAILS GO HERE;

What about some reloptions?

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Kenneth Marshall <ktm(at)rice(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, Andy Colson <andy(at)squeakycode(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-17 21:19:10
Message-ID: 4CE446CE.90609@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/17/2010 11:44 AM, Tom Lane wrote:
> ...because a backend crash has to be assumed to have corrupted
> unlogged tables...
>
So in a typical use-case, say storing session data on a web-site, one
crashed backend could wreck sessions for some or all of the site? Is
there a mechanism in the proposal that would allow a client to determine
the state of a table (good, truncated, wrecked, etc.)?

Cheers,
Steve


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Kenneth Marshall <ktm(at)rice(dot)edu>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-17 21:43:32
Message-ID: 1290030212.10258.7150.camel@jd-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> > >> I don't particularly care for the name UNSYNCED
> > >
> > > EVANESCENT?
> > >
> > >> I'm starting not to like UNLOGGED much either
> > >
> > > EPHEMERAL?
> > >
> > > Actually, the UNSYNCED and UNLOGGED seem fairly clear....

Uhhh yeah. Let's not break out the thesaurus for this.

JD
--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kenneth Marshall <ktm(at)rice(dot)edu>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-18 08:07:19
Message-ID: E43D0EA4-6514-4B3A-B71E-A9BD5F60756C@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS
> THAT EXPLAIN THE DETAILS GO HERE;

[ TRUNCATE ON RESTART ]

Your patch implements this option, right?

Regards,
>
--
dim


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, marcin mank <marcin(dot)mank(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Kenneth Marshall <ktm(at)rice(dot)edu>, Andy Colson <andy(at)squeakycode(dot)net>
Subject: Re: unlogged tables
Date: 2010-11-18 15:24:44
Message-ID: AANLkTikq4b-K+O1ZR2q2KpMPhP78BHQkmm_sDJigzJVH@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 18, 2010 at 3:07 AM, Dimitri Fontaine
<dfontaine(at)hi-media(dot)com> wrote:
>
>> CREATE VOLATILE TABLE blow_me_away (k text, v text) SOME OTHER WORDS
>> THAT EXPLAIN THE DETAILS GO HERE;
>
> [ TRUNCATE ON RESTART ]
>
> Your patch implements this option, right?

Yeah.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andy Colson <andy(at)squeakycode(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-22 04:07:12
Message-ID: 4CE9EC70.90703@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I have done a bunch of benchmarking. It was not easy to find consistent numbers, so I picked a job and ran the same thing over and over.

I'm running Slackware 13.1 on a desktop computer.

Linux storm 2.6.35.7-smp #1 SMP Sun Oct 10 21:43:07 CDT 2010 i686 AMD Athlon(tm) 7850 Dual-Core Processor AuthenticAMD GNU/Linux

Database on:
/dev/sda2 on /pub type ext4 (rw,noatime)

I started with stock, unpatched, pg 9.1, and ran pg_bench. I used several scale's and always set the # connections at half the scale. (so scale 20 used 10 connections). I ran all tests for 180 seconds. autovacuum was always off, and I ran "vacuum -z" between each pg_bench.

each block of numbers has these columns: scale, test 1, test 2, test 3, avg
So the first line below: 6, 96, 105, 102, 101
means:
pg_becnh -i -s 6
pg_bench -c 3 -T 180
vacuum -z
pg_bench -c 3 -T 180
vacuum -z
pg_bench -c 3 -T 180

result times for the three runs 96, 105 and 102 seconds, with average 101 seconds.

The LOGS test is importing 61+ million rows of apache logs. Its a perl script, uses COPY over many many files. Each file is commit separate.

checkpoint_segments = 7
shared_buffers = 512MB
effective_cache_size = 1024MB
autovacuum off

fsync on
synchronous_commit on
full_page_writes on
bgwriter_lru_maxpages 100
180 second tests

scale, test 1, test 2, test 3, avg
6, 96, 105, 102, 101
20, 120, 82, 76, 93
40, 73, 42, 43, 53
80, 50, 29, 35, 38

synchronous_commit off
6, 239, 676, 614, 510
20, 78, 47, 56, 60
40, 59, 35, 41, 45
80, 53, 30, 35, 39

LOGS: ~ 3,900 ins/sec (I didnt record this well, its sort of a guess)

synchronous_commit off
full_page_writes off
6, 1273, 1344, 1287, 1301
20, 1323, 1307, 1313, 1314
40, 1051, 872, 702, 875
80, 551, 206, 245, 334

LOGS (got impatient and killed it)
Total rows: 20,719,095
Total Seconds: 5,279.74
Total ins/sec: 3,924.25

fsync off
synchronous_commit off
full_page_writes off
bgwriter_lru_maxpages 0
6, 3622, 2940, 2879, 3147
20, 2860, 2952, 2939, 2917
40, 2204, 2143, 2349, 2232
80, 1394, 1043, 1085, 1174

LOG (this is a full import)
Total rows: 61,467,489
Total Seconds: 1,240.93
Total ins/sec: 49,533.37

------- Apply unlogged patches and recompile, re-initdb ---
I patched pg_bench to run with either normal or unlogged tables

fsync on
synchronous_commit on
full_page_writes on
bgwriter_lru_maxpages 100
180 second tests

normal tables
6, 101, 102, 108, 103
20, 110, 71, 90, 90
40, 83, 45, 49, 59
80, 50, 34, 30, 38

LOGS (partial import)
Total rows: 24,754,871
Total Seconds: 6,058.03
Total ins/sec: 4,086.28

unlogged tables
6, 2966, 3047, 3007, 3006
20, 2767, 2515, 2708, 2663
40, 1933, 1311, 1464, 1569
80, 837, 552, 579, 656

LOGS (full import)
Total rows: 61,467,489
Total Seconds: 1,126.75
Total ins/sec: 54,552.60

After all this... there are too many numbers for me. I have no idea what this means.

-Andy


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: unlogged tables
Date: 2010-11-24 01:32:38
Message-ID: AANLkTikRx0XYoW=3-qT8Czw=xgCWcuxDT1nC55OBs7qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 21, 2010 at 11:07 PM, Andy Colson <andy(at)squeakycode(dot)net> wrote:
> After all this... there are too many numbers for me.  I have no idea what
> this means.

I think what it means that is that, for you, unlogged tables were
almost as fast as shutting off all of synchronous_commit,
full_page_writes, and fsync, and further setting
bgwriter_lru_maxpages=0. Now, that seems a little strange, because
you'd think if anything it would be faster. I'm not sure what
accounts for the difference, although I wonder if checkpoints are part
of it. With the current code, which doesn't exclude unlogged table
pages from checkpoints, a checkpoint will still be faster with
fsync=off than with unlogged tables. It seems like we're agreed that
this is a problem to be fixed in phase two, though, either by fsyncing
every unlogged table we can find at shutdown time, or else by
providing two durability options, one that works as the current code
does (but survives clean shutdowns) and another that excludes dirty
pages from checkpoints (and does not survive clean shutdowns).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company