global temporary tables

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: global temporary tables
Date: 2010-04-24 02:52:41
Message-ID: u2o603c8f071004231952i36642ae6u9d6a7eae6eb6ff32@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

A couple of recent threads made got me thinking again about the idea
of global temporary tables. There seem to be two principal issues:

1. What is a global temporary table?

2. How could we implement that?

Despite rereading the "idea: global temp tables" thread from April
2009 in some detail, I was not able to get a clear understanding of
(1). What I *think* it is supposed to mean is that the table is a
permanent object which is "globally" visible - that is, it's part of
some non-temp schema like public or $user and it's column definitions
etc. are visible to all backends - and it's not automatically removed
on commit, backend exit, etc. - but the *contents* of the table are
temporary and backend-local, so that each new backend initially sees
it as empty and can then insert, update, and delete data independently
of what any other backend does.

As to (2), my thought is that perhaps we could implement this by
instantiating a separate relfilenode for the relation for each backend
which accesses it. relfilenode would be 0 in pg_class, as it is for
"mapped" relations, but every time a backend touched the rel, we'd
allocate a relfilenode and associated the oid of the temp table to it
using some kind of backend-local storage - actually similar to what
the relmapper code does, except without the complexity of ever
actually having to persist the value; and perhaps using a hash table
rather than an array, since the number of mapped rels that a backend
can need to deal with is rather more limited than the number of temp
tables it might want to use.

Thoughts?

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 03:11:22
Message-ID: 24194.1272078682@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> A couple of recent threads made got me thinking again about the idea
> of global temporary tables. There seem to be two principal issues:

> 1. What is a global temporary table?

> 2. How could we implement that?

> Despite rereading the "idea: global temp tables" thread from April
> 2009 in some detail, I was not able to get a clear understanding of
> (1).

I believe that the spec's distinction between global and local temp
tables has to do with whether they are visible across module
boundaries. Since we haven't implemented modules, that distinction
is meaningless to us. In the spec, *both* types of temp tables have
the property that the definition (schema) of the table is global
across all sessions, and only the content of the table is session-local.

This arrangement clearly is useful for some applications, but so is our
current definition wherein different sessions can have different schemas
for the same temp table name. So eventually it'd be good to support
both. But the GLOBAL/LOCAL TEMP TABLE distinction is something entirely
different. PG's behavior does not correspond to either of those.

Your idea of using the relmapper layer to instantiate copies of temp
tables is an interesting one. It's only a small piece of the puzzle
though. In particular, what you described would result in the table
having the same OID in all sessions, even though the relfilenodes are
different --- amd since locking is done on the basis of OID, that's
probably *not* what we want. It would be much better for performance
if the different sessions' versions of the table were independently
lockable.

I also kind of wonder what is supposed to happen if someone DROPs or
ALTERs the temp table definition ...

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 03:18:29
Message-ID: z2r603c8f071004232018k9891b0f8p5fcb1708455f462d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Apr 23, 2010 at 11:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> A couple of recent threads made got me thinking again about the idea
>> of global temporary tables.  There seem to be two principal issues:
>
>> 1. What is a global temporary table?
>
>> 2. How could we implement that?
>
>> Despite rereading the "idea: global temp tables" thread from April
>> 2009 in some detail, I was not able to get a clear understanding of
>> (1).
>
> I believe that the spec's distinction between global and local temp
> tables has to do with whether they are visible across module
> boundaries.  Since we haven't implemented modules, that distinction
> is meaningless to us.  In the spec, *both* types of temp tables have
> the property that the definition (schema) of the table is global
> across all sessions, and only the content of the table is session-local.
>
> This arrangement clearly is useful for some applications, but so is our
> current definition wherein different sessions can have different schemas
> for the same temp table name.  So eventually it'd be good to support
> both.  But the GLOBAL/LOCAL TEMP TABLE distinction is something entirely
> different.  PG's behavior does not correspond to either of those.

I don't really care what we call it, although I find the GLOBAL name
convenient and descriptive.

> Your idea of using the relmapper layer to instantiate copies of temp
> tables is an interesting one.  It's only a small piece of the puzzle
> though.  In particular, what you described would result in the table
> having the same OID in all sessions, even though the relfilenodes are
> different --- amd since locking is done on the basis of OID, that's
> probably *not* what we want.  It would be much better for performance
> if the different sessions' versions of the table were independently
> lockable.

Well, it depends on what operation we're talking about. For
operations that involve only the table contents, yeah, we'd like to
lock the versions independently. But for this sort of thing:

> I also kind of wonder what is supposed to happen if someone DROPs or
> ALTERs the temp table definition ...

...not so much. Here you REALLY want a DROP attempt to acquire an
AccessExclusiveLock that will conflict with any outstanding
AccessShareLocks. Similarly, you're only going to be able to modify
the schema for the relation if it's not otherwise in use.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 03:28:48
Message-ID: 24898.1272079728@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Apr 23, 2010 at 11:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I also kind of wonder what is supposed to happen if someone DROPs or
>> ALTERs the temp table definition ...

> ...not so much. Here you REALLY want a DROP attempt to acquire an
> AccessExclusiveLock that will conflict with any outstanding
> AccessShareLocks. Similarly, you're only going to be able to modify
> the schema for the relation if it's not otherwise in use.

I think you're presuming the answer to the question. We could also view
the desired behavior as being that each session clones the temp table
definition at some instant (eg, first use). The approach that you're
assuming seems fraught with large downsides: in particular, implementing
ALTER TABLE would be a mess. The would-be alterer would need access to
the physical copies of all sessions, which throws out not only the
assumption that the relmapper entries can be private data, but all of
the access optimizations we currently have in the local buffer manager.
Not to mention the coding mess of having to repeat the ALTER operation
for each of N copies, some of which might disappear while we're trying
to do it (or if they don't, we're blocking backends from exiting).
I don't even know how you'd do the ALTER over again N times if you
only have one set of catalog entries describing the N copies.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 03:46:40
Message-ID: q2q603c8f071004232046s1b77edeeib8289ecc7556c762@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Apr 23, 2010 at 11:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Apr 23, 2010 at 11:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I also kind of wonder what is supposed to happen if someone DROPs or
>>> ALTERs the temp table definition ...
>
>> ...not so much.  Here you REALLY want a DROP attempt to acquire an
>> AccessExclusiveLock that will conflict with any outstanding
>> AccessShareLocks.  Similarly, you're only going to be able to modify
>> the schema for the relation if it's not otherwise in use.
>
> I think you're presuming the answer to the question.  We could also view
> the desired behavior as being that each session clones the temp table
> definition at some instant (eg, first use).  The approach that you're
> assuming seems fraught with large downsides: in particular, implementing
> ALTER TABLE would be a mess.  The would-be alterer would need access to
> the physical copies of all sessions, which throws out not only the
> assumption that the relmapper entries can be private data, but all of
> the access optimizations we currently have in the local buffer manager.

I agree, that would be pretty unfortunate, althogh maybe it's the only
way to make it work. It's not what I had in mind. I was thinking
that the would-be ALTERER could just take an AccessExclusiveLock, but
now that I think about it that doesn't work, since a backend could
have the table unlocked between transactions but still have private
contents in it. :-(

> Not to mention the coding mess of having to repeat the ALTER operation
> for each of N copies, some of which might disappear while we're trying
> to do it (or if they don't, we're blocking backends from exiting).
> I don't even know how you'd do the ALTER over again N times if you
> only have one set of catalog entries describing the N copies.

Well, if you clone the table, that just pushes the problem around.
When I run ALTER TABLE on one of these thingamabobs, does it modify my
clone? The original? Both? If it modifies my clone, how do we
modify the original? If it modifies the original, won't I be rather
surprised to find my clone unaffected? If it modifies both, how do we
avoid complete havoc if the original has since been modified (perhaps
incompatibly, perhaps not) by some other backend doing its own ALTER
TABLE?

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 06:42:42
Message-ID: w2i162867791004232342p183892bbw87eccb7267079f03@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/4/24 Robert Haas <robertmhaas(at)gmail(dot)com>:
> A couple of recent threads made got me thinking again about the idea
> of global temporary tables.  There seem to be two principal issues:
>
> 1. What is a global temporary table?
>
> 2. How could we implement that?
>
> Despite rereading the "idea: global temp tables" thread from April
> 2009 in some detail, I was not able to get a clear understanding of
> (1).  What I *think* it is supposed to mean is that the table is a
> permanent object which is "globally" visible - that is, it's part of
> some non-temp schema like public or $user and it's column definitions
> etc. are visible to all backends - and it's not automatically removed
> on commit, backend exit, etc. - but the *contents* of the table are
> temporary and backend-local, so that each new backend initially sees
> it as empty and can then insert, update, and delete data independently
> of what any other backend does.
>
> As to (2), my thought is that perhaps we could implement this by
> instantiating a separate relfilenode for the relation for each backend
> which accesses it.  relfilenode would be 0 in pg_class, as it is for
> "mapped" relations, but every time a backend touched the rel, we'd
> allocate a relfilenode and associated the oid of the temp table to it
> using some kind of backend-local storage - actually similar to what
> the relmapper code does, except without the complexity of ever
> actually having to persist the value; and perhaps using a hash table
> rather than an array, since the number of mapped rels that a backend
> can need to deal with is rather more limited than the number of temp
> tables it might want to use.

it is good idea.

I missing some ideas about statistics, about indexes.

Regards
Pavel Stehule
>
> Thoughts?
>
> ...Robert
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 16:02:33
Message-ID: fca910820578d251cc1d5647793c9a67@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

>...
> surprised to find my clone unaffected? If it modifies both, how do we
> avoid complete havoc if the original has since been modified (perhaps
> incompatibly, perhaps not) by some other backend doing its own ALTER
> TABLE?

Since this is such a thorny problem, and this is a temporary table, why
not just disallow ALTER completely for the first pass?

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
PGP Key: 0x14964AC8 201004241201
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkvTFfsACgkQvJuQZxSWSsjrVACePmmNglGi6KoZgYZ7zjUm4gPm
o2wAoNYYuiZl1HZXsgiAOQkJzNUmaORm
=IijV
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 16:11:33
Message-ID: 7796.1272125493@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Greg Sabino Mullane" <greg(at)turnstep(dot)com> writes:
>> surprised to find my clone unaffected? If it modifies both, how do we
>> avoid complete havoc if the original has since been modified (perhaps
>> incompatibly, perhaps not) by some other backend doing its own ALTER
>> TABLE?

> Since this is such a thorny problem, and this is a temporary table, why
> not just disallow ALTER completely for the first pass?

Usually the way we approach these kinds of problems is that we want
to see some plausible outline for how they might be fixed before we
move forward with the base feature. IOW, I wouldn't object to not
having ALTER in the first release, but if we have no idea how to do
ALTER at all I'd be too worried that we were painting ourselves into
a corner.

Or maybe you can make a case that there's no need to allow ALTER at
all, ever. But surely DROP needs to be possible, and that seems to
already introduce some of the same issues.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 17:16:20
Message-ID: k2y603c8f071004241016w48b9d39an2dbf6d9df2bad495@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 12:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Greg Sabino Mullane" <greg(at)turnstep(dot)com> writes:
>>> surprised to find my clone unaffected?  If it modifies both, how do we
>>> avoid complete havoc if the original has since been modified (perhaps
>>> incompatibly, perhaps not) by some other backend doing its own ALTER
>>> TABLE?
>
>> Since this is such a thorny problem, and this is a temporary table, why
>> not just disallow ALTER completely for the first pass?
>
> Usually the way we approach these kinds of problems is that we want
> to see some plausible outline for how they might be fixed before we
> move forward with the base feature.  IOW, I wouldn't object to not
> having ALTER in the first release, but if we have no idea how to do
> ALTER at all I'd be too worried that we were painting ourselves into
> a corner.
>
> Or maybe you can make a case that there's no need to allow ALTER at
> all, ever.  But surely DROP needs to be possible, and that seems to
> already introduce some of the same issues.

I had the same thought as GSM this morning. More specifically, it
seems to me that the problematic cases are precisely those in which
you might feel an urge to touch somebody else's local buffers, so I
think we should disallow, approximately, those ALTER TABLE cases which
require a full-table rewrite. I don't see the problem with DROP.
Under the proposed design, it's approximately equivalent to dropping a
table that someone else has truncated. You just wait for the
necessary lock and then do it.

At least AIUI, the use case for this feature is that you want to avoid
creating "the same" temporary table over and over again. The schema
is fixed and doesn't change much, but you're creating it lots and lots
of times in lots and lots of different backends, leading to both
management and performance difficulties. If you want to be able to
change the schema frequently or in a backend-local way, use the
existing temporary table feature.

Now, there is ONE problem with DROP, which is that you might orphan
some heaps. Of course, that can also happen due to a backend crash.
Currently, autovacuum arranges to drop any orphaned temp tables that
have passed the wraparound threshold, but even if we were happy with
waiting 2 billion transactions to get things cleaned up, the mechanism
can't work here because it relies on being able to examine the
pg_class row and determine which backend owns it, and where the
storage is located.

We could possibly set things up so that a running backend will notice
if a global temporary table for which it's created a private
relfilenode gets dropped, and blow away the backing file. But that
doesn't protect against crashes, so I think we're going to need some
other garbage collection mechanism, either instead of in addition to
asking backends to clean up after themselves. I'm not quite sure what
the design of that should look like, though.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 17:31:23
Message-ID: 9319.1272130283@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> At least AIUI, the use case for this feature is that you want to avoid
> creating "the same" temporary table over and over again.

The context that I've seen it come up in is that people don't want to
clutter their functions with create-it-if-it-doesn't-exist logic,
which you have to have given the current behavior of temp tables.
Any performance gain from reduced catalog churn would be gravy.

Aside from the DROP problem, I think this implementation proposal
has one other big shortcoming: what are you going to do about
table statistics? In many cases, you really *have* to do an ANALYZE
once you've populated a temp table, if you want to get decent plans
for it. Where will you put those stats?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 17:38:46
Message-ID: 9452.1272130726@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

[ forgot to respond to this part ]

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> ... I don't see the problem with DROP.
> Under the proposed design, it's approximately equivalent to dropping a
> table that someone else has truncated. You just wait for the
> necessary lock and then do it.

And do *what*? You can remove the catalog entries, but how are you
going to make the physical storage of other backends' versions go away?
(To say nothing of making them flush their local buffers for it.)
If you do remove the catalog entries, won't you be cutting the knees
out from under whatever end-of-session cleanup processing might exist
in those other backends?

The idea of the global table as a template that individual sessions
clone working tables from would avoid most of these problems. You
rejected it on the grounds that ALTER would be too hard; but if you're
blowing off ALTER anyway, that argument seems pretty unimpressive.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 18:32:20
Message-ID: v2v603c8f071004241132w6d5d873t76a1f002dde21900@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 1:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> At least AIUI, the use case for this feature is that you want to avoid
>> creating "the same" temporary table over and over again.
>
> The context that I've seen it come up in is that people don't want to
> clutter their functions with create-it-if-it-doesn't-exist logic,
> which you have to have given the current behavior of temp tables.
> Any performance gain from reduced catalog churn would be gravy.

I think there's a significant contingent on this mailing list who feel
that that gravy would be rather tasty and would like very much to
enjoy some of it along with their temporary table tetrazzini.

> Aside from the DROP problem, I think this implementation proposal
> has one other big shortcoming: what are you going to do about
> table statistics?  In many cases, you really *have* to do an ANALYZE
> once you've populated a temp table, if you want to get decent plans
> for it.  Where will you put those stats?

For a first cut, I had thought about ignoring the problem. Now, that
may sound stupid, because now if two different backends have very
different distributions of data in the table and both do an ANALYZE,
one set of statistics will clobber the other set of statistics. On
the flip side, for some usage patterns, it might be actually work out
to a win. Maybe the data I'm putting in here today is a great deal
like the data I put in here yesterday, and planning it with
yesterday's statistics doesn't cost enough to be worth a re-ANALYZE.

If we don't want to do that, I suppose one option is to create a
pg_statistic-like table in the backend's temporary tablespace and put
them there; or we could put them into a backend-local hash table. The
current setup of pg_statistic is actually somewhat weak for a number
of things we might want to do: for example, it might be interesting to
gather statistics for the subset of a table for which a particular
partial index is predOK. When such an index is available for a
particular query, we could use the statistics for that subset of the
table instead of the overall statistics for the table, and get better
estimates. Or we could even let the user specify predicates which
will cause the table to have a different statistical distribution than
the table as a whole, and gather statistics for the subset that
matches the predicate. One approach would be to make the starelid
column able to reference something other than a relation OID, although
I don't think that actually helps with the global temp table problem
because if we use the real pg_statistic to store the data then we have
to arrange to clean it up.

...Robert


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 18:41:11
Message-ID: k2v162867791004241141peb800d6cmb81561a30a8ed5d3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> For a first cut, I had thought about ignoring the problem.  Now, that
> may sound stupid, because now if two different backends have very
> different distributions of data in the table and both do an ANALYZE,
> one set of statistics will clobber the other set of statistics.  On
> the flip side, for some usage patterns, it might be actually work out
> to a win.  Maybe the data I'm putting in here today is a great deal
> like the data I put in here yesterday, and planning it with
> yesterday's statistics doesn't cost enough to be worth a re-ANALYZE.
>

Both variant can be. First time - statistic can be taken from some
"original" (can be empty). After ANALYZE the statistic can be
individual.

Regards
Pavel


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 18:51:34
Message-ID: j2n603c8f071004241151xb8ac9c44k566e8c52c40574a3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 1:38 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> [ forgot to respond to this part ]
>
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ...  I don't see the problem with DROP.
>> Under the proposed design, it's approximately equivalent to dropping a
>> table that someone else has truncated.  You just wait for the
>> necessary lock and then do it.
>
> And do *what*?  You can remove the catalog entries, but how are you
> going to make the physical storage of other backends' versions go away?
> (To say nothing of making them flush their local buffers for it.)
> If you do remove the catalog entries, won't you be cutting the knees
> out from under whatever end-of-session cleanup processing might exist
> in those other backends?

Well, if I knew for sure what the best way was to solve all of these
problems, I'd be posting a finished patch rather than a request for
comment on a design. It's not obvious to me that there's a terribly
thorny problem in the area you're concerned about, but your concern is
making me worry that I'm missing something. Why would the
end-of-session processing need the catalog entries? It seems like
whatever backend-local data structure we're using to record the
relfilenode mappings would be sufficent to nuke the backend storage,
and what else needs doing?

> The idea of the global table as a template that individual sessions
> clone working tables from would avoid most of these problems.  You
> rejected it on the grounds that ALTER would be too hard; but if you're
> blowing off ALTER anyway, that argument seems pretty unimpressive.

I don't think that avoiding the catalog churn is something to be
dismissed lightly, but I also think that cloning the table is likely
to be significantly more difficult from an implementation point of
view. Under the implementation I'm proposing, we don't need much that
is fundamentally all that new. Global temporary tables can be treated
like our existing temp tables for purposes of XLog and bufmgr, but
they don't get forced into a temporary namespace. The relation
mapping infrastructure provides a pretty good start for using a
relfilenode that isn't stored in pg_class. I've already gone through
the exercise of finding all the places where we check rd_istemp and
changing them to use macros instead (RelationNeedsWAL, IsBackendLocal,
etc.) and it's not bad.

There's a related project which I think can also leverage much of this
same infrastructure: unlogged tables. We've talked about this before,
but in short the idea is that an unlogged table behaves like a regular
table in all respects except that we never write WAL for it; and we
truncate it at shutdown and at startup. Therefore, it doesn't show up
on standby servers, and its contents are not persistent across
restarts, but performance is improved. It's suitable for things like
"the table of currently logged in users", where you don't mind forcing
everyone to log in again if the database crashes. (It might even be
possible to allow writes to unlogged tables on standby servers, though
I'm not feeling that ambitious ATM.) So:

- local temp tables exist in a temp namespace, use local buffers, and skip WAL
- global temp tables exist in a non-temp namespace, use local buffers,
and skip WAL
- unlogged tables exist in a non-temp namespace, use shared buffers,
and skip WAL
- normal tables exist in a non-temp namespace, use shared buffers, and write WAL

Thoughts?

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 19:51:01
Message-ID: 11887.1272138661@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> ... Why would the
> end-of-session processing need the catalog entries? It seems like
> whatever backend-local data structure we're using to record the
> relfilenode mappings would be sufficent to nuke the backend storage,
> and what else needs doing?

Well, if you're intending to have a separate data structure and code
path for cleaning up this type of temp table, then maybe you don't need
to touch any catalog entries. I'm concerned though about how far the
effects will propagate --- things like TRUNCATE, VACUUM FULL, REINDEX
will probably have issues with this. Right now they think in terms
of writing a new pg_class entry in order to reassociate tables with
new relfilenodes.

Have you thought much about the previously proposed design, ie keeping
catalog entries for temp tables in backend-local temporary catalogs?
That would certainly be a lot of work, but I think in the end it might
fit in better. This design feels like it's being driven by "hey,
we can abuse the relmapper to sort of do what we want", and not by
what we really want.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 22:29:02
Message-ID: 1272148142.4161.1598.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 2010-04-23 at 22:52 -0400, Robert Haas wrote:

> Thoughts?

Only a requirement: that we design this in a way that will allow temp
tables to be used during Hot Standby. I make not other comment.

--
Simon Riggs www.2ndQuadrant.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 22:53:08
Message-ID: w2j603c8f071004241553z53a04604g5c6855e79d48c17b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 3:51 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ...  Why would the
>> end-of-session processing need the catalog entries?  It seems like
>> whatever backend-local data structure we're using to record the
>> relfilenode mappings would be sufficent to nuke the backend storage,
>> and what else needs doing?
>
> Well, if you're intending to have a separate data structure and code
> path for cleaning up this type of temp table, then maybe you don't need
> to touch any catalog entries.  I'm concerned though about how far the
> effects will propagate --- things like TRUNCATE, VACUUM FULL, REINDEX
> will probably have issues with this.  Right now they think in terms
> of writing a new pg_class entry in order to reassociate tables with
> new relfilenodes.
>
> Have you thought much about the previously proposed design, ie keeping
> catalog entries for temp tables in backend-local temporary catalogs?
> That would certainly be a lot of work, but I think in the end it might
> fit in better.  This design feels like it's being driven by "hey,
> we can abuse the relmapper to sort of do what we want", and not by
> what we really want.

Well, yes and no. I think there are definitely some good things that
can happen if we can see our way to taking a hammer to pg_class and
pg_attribute. If we create, e.g. pg_shared_class and
pg_shared_attribute, then we can un-nail the catalogs you just nailed
to make the authentication process able to work without selecting a
database. We can also enable (without guilt) clustering both those
catalogs and the database-specific versions of pg_class, since we no
longer have to worry about having multiple copies of the row that can
get out of sync with each other. And if we further break off
pg_temp_class and pg_temp_attribute, then we can also have our
existing flavor of temporary tables without worrying about catalog
bloat, which would be great. There may be other applications as well.

Having said all that, it doesn't actually allow us to implement global
temporary tables, because obviously the catalog entries for a global
temporary table have to be permanent. Of course, if we didn't have to
worry about catalog bloat, the "clone" approach you're proposing would
be somewhat more attractive, but I actually think that the synergy is
in the other direction: the perfect place to store the catalog entries
and statistics for local temporary tables is - in a global temporary
table! Note that while a local temporary table can never inherit from
a permanent table, it's entirely sensible to let global temporary
tables inherit from permanent tables. Different backends will have
different views of the overall contents of the parent table, but
that's OK, even desirable.

...Robert


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-24 23:01:05
Message-ID: g2x603c8f071004241601qbca333edmd384f9a0df596417@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 6:29 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Fri, 2010-04-23 at 22:52 -0400, Robert Haas wrote:
>> Thoughts?
>
> Only a requirement: that we design this in a way that will allow temp
> tables to be used during Hot Standby. I make not other comment.

For so long as local temporary tables put their catalog entries in
pg_class, we're not going to be able to use them during Hot Standby.
See the email I just sent elsewhere on this thread for a long term
roadmap to getting out of that pickle. At least under the
implementation I'm proposing here, making global temporary tables
usable would be an easier nut to crack, because the catalog entries
are a non-issue. There is one major problem, though: assigning a
scratch relfilenode to the temporary table requires generating an OID,
which we currently have no way to allow on the standby. Upthread I
also proposed an implementation for unlogged tables (that is, contents
don't survive a server bounce) which wouldn't have that problem
either, although I haven't fully thought it through and there may very
well be other issues.

So in short: I don't think anything we're talking about it would make
HS use harder, and some of it might make it easier. But probably some
additional engineering effort dedicated specifically to solving the
problems unique to HS would still be needed.

...Robert


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 00:47:47
Message-ID: FD5C55A5-5003-4C4E-8DC6-E124A13E0B68@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Apr 24, 2010, at 12:31 PM, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> At least AIUI, the use case for this feature is that you want to avoid
>> creating "the same" temporary table over and over again.
>
> The context that I've seen it come up in is that people don't want to
> clutter their functions with create-it-if-it-doesn't-exist logic,
> which you have to have given the current behavior of temp tables.
> Any performance gain from reduced catalog churn would be gravy.
>
> Aside from the DROP problem, I think this implementation proposal
> has one other big shortcoming: what are you going to do about
> table statistics? In many cases, you really *have* to do an ANALYZE
> once you've populated a temp table, if you want to get decent plans
> for it. Where will you put those stats?

One possibility: rename the existing pg_stats to pg_stats_permanent. Create a global temporary table called pg_stats_temporary. pg_stats becomes a union of the two. I know the backend wouldn't be able to use the view, but hopefully access to statistics goes through a limited set of functions so that teaching them about the two different tables isn't hard.

As for cleanup and storage questions; what about having temp objects live in pgsql_tmp? I'm thinking create a directory under pgsql_tmp for a backend PID the first time it creates a temp object (global or local) and create the files in there. That also means that we don't have to come up with different relfilenodes for each backend. On the other hand, some layer (presumably smgr) would need to understand whether a relation was temporary or not. If we do that, cleanup is easy: you can remove any directories that no longer have a running PID. For forensics you probably only want to do that automatically when a backend starts and discovers it already has a directory, though we should also provide an administrator function that will clobber all directories that no longer have backends.
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 01:14:44
Message-ID: y2u603c8f071004241814x10e29bc8pb46cad5edc0e2a5b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 8:47 PM, Jim Nasby <decibel(at)decibel(dot)org> wrote:
> On Apr 24, 2010, at 12:31 PM, Tom Lane wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> At least AIUI, the use case for this feature is that you want to avoid
>>> creating "the same" temporary table over and over again.
>>
>> The context that I've seen it come up in is that people don't want to
>> clutter their functions with create-it-if-it-doesn't-exist logic,
>> which you have to have given the current behavior of temp tables.
>> Any performance gain from reduced catalog churn would be gravy.
>>
>> Aside from the DROP problem, I think this implementation proposal
>> has one other big shortcoming: what are you going to do about
>> table statistics?  In many cases, you really *have* to do an ANALYZE
>> once you've populated a temp table, if you want to get decent plans
>> for it.  Where will you put those stats?
>
> One possibility: rename the existing pg_stats to pg_stats_permanent. Create a global temporary table called pg_stats_temporary. pg_stats becomes a union of the two. I know the backend wouldn't be able to use the view, but hopefully access to statistics goes through a limited set of functions so that teaching them about the two different tables isn't hard.

Yeah, I don't think that would be too horrible. Part of me feels like
you'd want to have the ability to store stats for a global temp table
in either one of those tables depending on use-case, but I'm also
reluctant to invent a lot of new syntax for a very limited use case.

> As for cleanup and storage questions; what about having temp objects live in pgsql_tmp? I'm thinking create a directory under pgsql_tmp for a backend PID the first time it creates a temp object (global or local) and create the files in there. That also means that we don't have to come up with different relfilenodes for each backend.

That would impose a couple of implementation restrictions that don't
seem necessary. One, it would imply ignoring reltablespace. Two, it
would prohibit (or at least complicate) allowing a backend to CLUSTER
or REINDEX its own private copy of the rel.

> On the other hand, some layer (presumably smgr) would need to understand whether a relation was temporary or not. If we do that, cleanup is easy: you can remove any directories that no longer have a running PID. For forensics you probably only want to do that automatically when a backend starts and discovers it already has a directory, though we should also provide an administrator function that will clobber all directories that no longer have backends.

Unfortunately, I don't see much alternative to making smgr know
something about the temp-ness of the relation, though I'm hoping to
keep the smgr surgery to an absolute minimum. Maybe what we could do
is incorporate the backend ID or PID into the file name when the
relation is temp. Then we could scan for and nuke such files pretty
easily. Otherwise I can't really think how to make it work.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <decibel(at)decibel(dot)org>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 01:37:31
Message-ID: 18126.1272159451@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Unfortunately, I don't see much alternative to making smgr know
> something about the temp-ness of the relation, though I'm hoping to
> keep the smgr surgery to an absolute minimum. Maybe what we could do
> is incorporate the backend ID or PID into the file name when the
> relation is temp. Then we could scan for and nuke such files pretty
> easily. Otherwise I can't really think how to make it work.

I think that could be a really good idea, mainly because it makes
post-crash cleanup MUCH safer: you can tell with certainty from the
filename that it's a leftover temp table. The notion of zapping files
just because we don't see them listed in pg_class has always scared the
heck out of me.

We already know temp-ness at pretty low levels, like bufmgr vs localbuf.
Pushing it all the way down to smgr doesn't seem like a leap; in fact
I think it would eliminate a separate isTemp parameter in a lot of places.

regards, tom lane


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 01:51:26
Message-ID: 83E69122-65AE-403E-9B9E-E3303E5FBFDC@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Apr 24, 2010, at 8:14 PM, Robert Haas wrote:
>> One possibility: rename the existing pg_stats to pg_stats_permanent. Create a global temporary table called pg_stats_temporary. pg_stats becomes a union of the two. I know the backend wouldn't be able to use the view, but hopefully access to statistics goes through a limited set of functions so that teaching them about the two different tables isn't hard.
>
> Yeah, I don't think that would be too horrible. Part of me feels like
> you'd want to have the ability to store stats for a global temp table
> in either one of those tables depending on use-case, but I'm also
> reluctant to invent a lot of new syntax for a very limited use case.

Yeah, I'm thinking that's very probably overkill. And if we were going to go to that level, I think it would be far more useful to provide an interface to allow manual control over statistics first, so that you can give the optimizer custom information.

>> As for cleanup and storage questions; what about having temp objects live in pgsql_tmp? I'm thinking create a directory under pgsql_tmp for a backend PID the first time it creates a temp object (global or local) and create the files in there. That also means that we don't have to come up with different relfilenodes for each backend.
>
> That would impose a couple of implementation restrictions that don't
> seem necessary. One, it would imply ignoring reltablespace. Two, it
> would prohibit (or at least complicate) allowing a backend to CLUSTER
> or REINDEX its own private copy of the rel.

Well, the same structure could be imposed underneath a temptablespace. I don't think it matters where you ultimately put it, the goal is just to make sure you can definitively tell that a file was a: temporary and b: what PID it belonged to. That allows for safe cleanup.
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Nasby <decibel(at)decibel(dot)org>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 01:59:41
Message-ID: u2y603c8f071004241859j8317f9cai665cf3d75317930c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 9:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Unfortunately, I don't see much alternative to making smgr know
>> something about the temp-ness of the relation, though I'm hoping to
>> keep the smgr surgery to an absolute minimum.  Maybe what we could do
>> is incorporate the backend ID or PID into the file name when the
>> relation is temp.  Then we could scan for and nuke such files pretty
>> easily.  Otherwise I can't really think how to make it work.
>
> I think that could be a really good idea, mainly because it makes
> post-crash cleanup MUCH safer: you can tell with certainty from the
> filename that it's a leftover temp table.  The notion of zapping files
> just because we don't see them listed in pg_class has always scared the
> heck out of me.
>
> We already know temp-ness at pretty low levels, like bufmgr vs localbuf.
> Pushing it all the way down to smgr doesn't seem like a leap; in fact
> I think it would eliminate a separate isTemp parameter in a lot of places.

Eh? I don't see how it's going to do that.

...Robert


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Nasby <decibel(at)decibel(dot)org>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 02:22:58
Message-ID: s2i603c8f071004241922ze1498adah728dcb8d88d59980@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 9:59 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sat, Apr 24, 2010 at 9:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Unfortunately, I don't see much alternative to making smgr know
>>> something about the temp-ness of the relation, though I'm hoping to
>>> keep the smgr surgery to an absolute minimum.  Maybe what we could do
>>> is incorporate the backend ID or PID into the file name when the
>>> relation is temp.  Then we could scan for and nuke such files pretty
>>> easily.  Otherwise I can't really think how to make it work.
>>
>> I think that could be a really good idea, mainly because it makes
>> post-crash cleanup MUCH safer: you can tell with certainty from the
>> filename that it's a leftover temp table.  The notion of zapping files
>> just because we don't see them listed in pg_class has always scared the
>> heck out of me.
>>
>> We already know temp-ness at pretty low levels, like bufmgr vs localbuf.
>> Pushing it all the way down to smgr doesn't seem like a leap; in fact
>> I think it would eliminate a separate isTemp parameter in a lot of places.
>
> Eh?  I don't see how it's going to do that.

Oh, maybe I do see. If we pass it to smgropen() and stash it in the
SMgrRelation, we don't have to keep supplying it later on, maybe?

Will investigate further.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <decibel(at)decibel(dot)org>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 02:28:00
Message-ID: 19198.1272162480@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Oh, maybe I do see. If we pass it to smgropen() and stash it in the
> SMgrRelation, we don't have to keep supplying it later on, maybe?

Right. I'm unsure whether we should push it into the RelFileNode
struct itself, but even having it in SMgrRelation ought to cut out
a few places where it now has to be passed separately.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Nasby <decibel(at)decibel(dot)org>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 02:57:01
Message-ID: u2h603c8f071004241957xb93a989ct5c78ff8e0f15c47b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 10:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Oh, maybe I do see.  If we pass it to smgropen() and stash it in the
>> SMgrRelation, we don't have to keep supplying it later on, maybe?
>
> Right.  I'm unsure whether we should push it into the RelFileNode
> struct itself, but even having it in SMgrRelation ought to cut out
> a few places where it now has to be passed separately.

Pushing it into the RelFileNode has some advantages in terms of being
able to get at the information from everywhere, but one thing that
makes me think that's probably not a good decision is that we somtimes
WAL-log relfilenodes. And WAL-logging the value of the isTemp flag is
a waste, because if we're WAL-logging, it's zero.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <decibel(at)decibel(dot)org>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 03:02:42
Message-ID: 19874.1272164562@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Pushing it into the RelFileNode has some advantages in terms of being
> able to get at the information from everywhere, but one thing that
> makes me think that's probably not a good decision is that we somtimes
> WAL-log relfilenodes. And WAL-logging the value of the isTemp flag is
> a waste, because if we're WAL-logging, it's zero.

Yeah. I think we also use RelFileNode as a hash tag in places, and
so adding a bool to it would be problematic for a couple of reasons:
possibly uninitialized pad bytes, and uselessly incorporating more bytes
into the hash calculation.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 07:49:55
Message-ID: 1272181795.4161.1604.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2010-04-24 at 19:01 -0400, Robert Haas wrote:

> There is one major problem, though: assigning a
> scratch relfilenode to the temporary table requires generating an OID,
> which we currently have no way to allow on the standby.

Why not have an unlogged counter, which resets each system start, using
same datatype as an oid. There's no necessity for the relfilenode to be
an actual oid is there?

That way we could use it on standbys also.

--
Simon Riggs www.2ndQuadrant.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 10:50:23
Message-ID: i2z603c8f071004250350p35754194pd7821ce29c4461b3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Apr 25, 2010 at 3:49 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Sat, 2010-04-24 at 19:01 -0400, Robert Haas wrote:
>
>> There is one major problem, though: assigning a
>> scratch relfilenode to the temporary table requires generating an OID,
>> which we currently have no way to allow on the standby.
>
> Why not have an unlogged counter, which resets each system start, using
> same datatype as an oid. There's no necessity for the relfilenode to be
> an actual oid is there?
>
> That way we could use it on standbys also.

I don't think that quite works, because the standby might assign a
relfilenode number for a global temp table and then the master might
subsequently assign the same relfilenode number to a regular table.
We might be able to make that not matter, but it's far from obvious to
me that there are no gotchas there...

...Robert


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Nasby <decibel(at)decibel(dot)org>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 12:46:17
Message-ID: x2s603c8f071004250546k53ca5d58w5377d3d53430b602@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Apr 24, 2010 at 11:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Pushing it into the RelFileNode has some advantages in terms of being
>> able to get at the information from everywhere, but one thing that
>> makes me think that's probably not a good decision is that we somtimes
>> WAL-log relfilenodes.  And WAL-logging the value of the isTemp flag is
>> a waste, because if we're WAL-logging, it's zero.
>
> Yeah.  I think we also use RelFileNode as a hash tag in places, and
> so adding a bool to it would be problematic for a couple of reasons:
> possibly uninitialized pad bytes, and uselessly incorporating more bytes
> into the hash calculation.

Right. I was thinking about the padding issue, too. I took a crack
at adding an isTemp argument to smgropen() and removing it everywhere
else, but it turns out this isn't as straightforward as it ought to be
because nbtsort.c, rewriteheap.c, and tablecmds.c feel entitled to
violate the abstraction layer by passing isTemp = true to smgrwrite()
and/or smgrextend() even when the real value is false, under the
assumption that the only thing isTemp is doing in those functions is
controlling fsync behavior. I think we'll have to replace the isTemp
argument to those functions with a boolean whose explicit charter is
to do what they're using it for.

Removing the isTemp from smgrtruncate() and smgrdounlink() argument
looks easy, though. WIP patch that takes it that far is attached.

[davidfetter: patch description for PWN is "WIP patch to push isTemp
down into the smgr layer"]

...Robert

Attachment Content-Type Size
smgropen_istemp_wip.patch text/x-diff 14.9 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 12:57:21
Message-ID: 1272200241.4161.1670.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2010-04-25 at 06:50 -0400, Robert Haas wrote:
> On Sun, Apr 25, 2010 at 3:49 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On Sat, 2010-04-24 at 19:01 -0400, Robert Haas wrote:
> >
> >> There is one major problem, though: assigning a
> >> scratch relfilenode to the temporary table requires generating an OID,
> >> which we currently have no way to allow on the standby.
> >
> > Why not have an unlogged counter, which resets each system start, using
> > same datatype as an oid. There's no necessity for the relfilenode to be
> > an actual oid is there?
> >
> > That way we could use it on standbys also.
>
> I don't think that quite works, because the standby might assign a
> relfilenode number for a global temp table and then the master might
> subsequently assign the same relfilenode number to a regular table.
> We might be able to make that not matter, but it's far from obvious to
> me that there are no gotchas there...

That sounds fairly simple to solve.

All I am saying is please include "working on the standby" as part of
your requirement.

--
Simon Riggs www.2ndQuadrant.com


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 14:26:37
Message-ID: 14A49FEC-5671-4961-88CB-0A7A6A0C58A4@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Apr 24, 2010, at 10:02 PM, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Pushing it into the RelFileNode has some advantages in terms of being
>> able to get at the information from everywhere, but one thing that
>> makes me think that's probably not a good decision is that we somtimes
>> WAL-log relfilenodes. And WAL-logging the value of the isTemp flag is
>> a waste, because if we're WAL-logging, it's zero.
>
> Yeah. I think we also use RelFileNode as a hash tag in places, and
> so adding a bool to it would be problematic for a couple of reasons:
> possibly uninitialized pad bytes, and uselessly incorporating more bytes
> into the hash calculation.

Do we need to hash it that frequently? Do we insert tons of them into WAL? Worrying about those cases smells like premature optimization, but admittedly I don't have enough knowledge here...
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 15:35:18
Message-ID: n2p603c8f071004250835x138a5924t47008cc91da09ddf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Apr 25, 2010 at 8:57 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> I don't think that quite works, because the standby might assign a
>> relfilenode number for a global temp table and then the master might
>> subsequently assign the same relfilenode number to a regular table.
>> We might be able to make that not matter, but it's far from obvious to
>> me that there are no gotchas there...
>
> That sounds fairly simple to solve.
>
> All I am saying is please include "working on the standby" as part of
> your requirement.

Well, I think I already basically stated my position on this, but let
me try to be more clear. I'm not promising to work on any portion of
this project AT ALL or to have it done at any particular time. I am
specifically not promising to do the extra work required to make it
work with Hot Standby, although I am also not saying that I won't.
Nobody has offered to fund any portion of this work, so there are no
guarantees, full stop. Even if you could successfully convince a
critical mass of people on this list that the work should not be
committed without adding support for temp tables in Hot Standby mode,
the most likely result of that would be that I would give up and not
pursue this at all, rather than that I would agree to do that in
addition to solving all the problems already discussed. And I don't
think you can even get that far, because I don't think too many people
here are going to say that we shouldn't add global temporary tables
unless we can also make them work with Hot Standby.

In all honesty, I would think that you would be happy about my
possibly implementing a flavor of temporary tables that would be
substantially more feasible to make work with Hot Standby than the
kind we have now, rather than (as you seem to be) complaining that I'm
not solving the entire problem.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 19:02:23
Message-ID: 7079.1272222143@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim Nasby <decibel(at)decibel(dot)org> writes:
> On Apr 24, 2010, at 10:02 PM, Tom Lane wrote:
>> Yeah. I think we also use RelFileNode as a hash tag in places, and
>> so adding a bool to it would be problematic for a couple of reasons:
>> possibly uninitialized pad bytes, and uselessly incorporating more bytes
>> into the hash calculation.

> Do we need to hash it that frequently?

Every buffer access, for instance.

> Do we insert tons of them into WAL?

Yes; the vast majority of WAL records contain one.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 21:29:47
Message-ID: 4BD4B44B.6030406@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

> (1). What I *think* it is supposed to mean is that the table is a
> permanent object which is "globally" visible - that is, it's part of
> some non-temp schema like public or $user and it's column definitions
> etc. are visible to all backends - and it's not automatically removed
> on commit, backend exit, etc. - but the *contents* of the table are
> temporary and backend-local, so that each new backend initially sees
> it as empty and can then insert, update, and delete data independently
> of what any other backend does.

While closer to the standard, the above definition is a lot less useful
than what I believe a lot of people want, which is a table which is
globally visible, but has no durability; that is, it does not get
WAL-logged or recovered on restart. Certainly this latter definition
would be far more useful to support materialized views.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 21:51:28
Message-ID: 4BD4B960.5010204@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Robert,
>
>> (1). What I *think* it is supposed to mean is that the table is a
>> permanent object which is "globally" visible - that is, it's part of
>> some non-temp schema like public or $user and it's column definitions
>> etc. are visible to all backends - and it's not automatically removed
>> on commit, backend exit, etc. - but the *contents* of the table are
>> temporary and backend-local, so that each new backend initially sees
>> it as empty and can then insert, update, and delete data independently
>> of what any other backend does.
>
> While closer to the standard, the above definition is a lot less
> useful than what I believe a lot of people want, which is a table
> which is globally visible, but has no durability; that is, it does not
> get WAL-logged or recovered on restart. Certainly this latter
> definition would be far more useful to support materialized views.

These are not mutually exclusive features. What you're asking for has
value, certainly, but it's not a temp table in the standard's terms
(which is a feature that also has value, I believe).

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-25 21:54:44
Message-ID: q2u603c8f071004251454of94da533n61da0ed02fb0f074@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Apr 25, 2010 at 5:29 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Robert,
>
>> (1).  What I *think* it is supposed to mean is that the table is a
>> permanent object which is "globally" visible - that is, it's part of
>> some non-temp schema like public or $user and it's column definitions
>> etc. are visible to all backends - and it's not automatically removed
>> on commit, backend exit, etc. - but the *contents* of the table are
>> temporary and backend-local, so that each new backend initially sees
>> it as empty and can then insert, update, and delete data independently
>> of what any other backend does.
>
> While closer to the standard, the above definition is a lot less useful than
> what I believe a lot of people want, which is a table which is globally
> visible, but has no durability; that is, it does not get WAL-logged or
> recovered on restart.  Certainly this latter definition would be far more
> useful to support materialized views.

I think it's arguable which one is more useful, but I think a good
deal of the infrastructure can be made to serve both purposes, as I
further expounded upon here.

http://archives.postgresql.org/pgsql-hackers/2010-04/msg01123.php

...Robert


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-26 19:30:47
Message-ID: 1272310247.4161.5081.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2010-04-25 at 11:35 -0400, Robert Haas wrote:

> And I don't
> think you can even get that far, because I don't think too many people
> here are going to say that we shouldn't add global temporary tables
> unless we can also make them work with Hot Standby.

The policy round here for some time has been that when we implement
things we make them work fully and seamlessly. I don't see why Hot
Standby would be singled out any more than any other feature, say
Windows support or tablespaces should be occasionally ignored.

People need to get used to the new feature set, just as we had to with
HOT, subtransactions, prepared transactions, Gist etc.. That may require
a thwack from various people, but the responsibility lies with the new
feature implementor, not the person supporting existing code.

I fully understand your wish to implement a partial feature with caveats
because I have argued that many times myself. But I've come to realise
that the best way is to build things so they work cleanly across the
board. Other developers can plan projects in the knowledge that they can
build directly on firm foundations, not fill in the cracks. In the end
this comes down to a choice as developers, do we help each other by
doing a full job, or do we leave unexploded bombs for each other through
short-termism? Now I understand this better myself, I act differently
and accept objections if people think a fuller, more complete design is
what is needed. Recent demonstrations of that available, both objecting
and accepting.

Don't see this as an extra task, just see it as one of the many aspects
that will need to be considered when developing it. If you do that it
need not be additional work.

--
Simon Riggs www.2ndQuadrant.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-26 19:40:27
Message-ID: 12163.1272310827@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Sun, 2010-04-25 at 11:35 -0400, Robert Haas wrote:
>> And I don't
>> think you can even get that far, because I don't think too many people
>> here are going to say that we shouldn't add global temporary tables
>> unless we can also make them work with Hot Standby.

> The policy round here for some time has been that when we implement
> things we make them work fully and seamlessly. I don't see why Hot
> Standby would be singled out any more than any other feature, say
> Windows support or tablespaces should be occasionally ignored.

The current definition of Hot Standby is that it's a *read only*
behavior. Not read mostly. What you are proposing is a rather
fundamental change in the behavior of HS, and it doesn't seem to me
that it should be on the head of anybody else to make it work.

IOW: I agree with Robert that this is not an essential part of global
temp tables. If it happens to fall out that it works like that, great,
but it isn't a requirement.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-26 19:50:53
Message-ID: 1272311453.4161.5143.camel@ebony
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2010-04-26 at 15:40 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On Sun, 2010-04-25 at 11:35 -0400, Robert Haas wrote:
> >> And I don't
> >> think you can even get that far, because I don't think too many people
> >> here are going to say that we shouldn't add global temporary tables
> >> unless we can also make them work with Hot Standby.
>
> > The policy round here for some time has been that when we implement
> > things we make them work fully and seamlessly. I don't see why Hot
> > Standby would be singled out any more than any other feature, say
> > Windows support or tablespaces should be occasionally ignored.
>
> The current definition of Hot Standby is that it's a *read only*
> behavior. Not read mostly. What you are proposing is a rather
> fundamental change in the behavior of HS, and it doesn't seem to me
> that it should be on the head of anybody else to make it work.

That's a dangerous precedent you just set.

--
Simon Riggs www.2ndQuadrant.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-26 20:02:53
Message-ID: q2v603c8f071004261302waf6a2eaeyf7956586a0d4e443@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 26, 2010 at 3:30 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Sun, 2010-04-25 at 11:35 -0400, Robert Haas wrote:
>
>> And I don't
>> think you can even get that far, because I don't think too many people
>> here are going to say that we shouldn't add global temporary tables
>> unless we can also make them work with Hot Standby.
>
> The policy round here for some time has been that when we implement
> things we make them work fully and seamlessly. I don't see why Hot
> Standby would be singled out any more than any other feature, say
> Windows support or tablespaces should be occasionally ignored.
>
> People need to get used to the new feature set, just as we had to with
> HOT, subtransactions, prepared transactions, Gist etc.. That may require
> a thwack from various people, but the responsibility lies with the new
> feature implementor, not the person supporting existing code.
>
> I fully understand your wish to implement a partial feature with caveats
> because I have argued that many times myself. But I've come to realise
> that the best way is to build things so they work cleanly across the
> board. Other developers can plan projects in the knowledge that they can
> build directly on firm foundations, not fill in the cracks. In the end
> this comes down to a choice as developers, do we help each other by
> doing a full job, or do we leave unexploded bombs for each other through
> short-termism? Now I understand this better myself, I act differently
> and accept objections if people think a fuller, more complete design is
> what is needed. Recent demonstrations of that available, both objecting
> and accepting.
>
> Don't see this as an extra task, just see it as one of the many aspects
> that will need to be considered when developing it. If you do that it
> need not be additional work.

I think you're looking at this the wrong way. If temporary tables
have to work with Hot Standby in order for it to be committable, then
we should never have committed Hot Standby in the first place because
our current flavor of temporary tables doesn't. Was that an oversight
on your part, or a recognition that you can't solve every problem in
one commit?

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-26 20:06:44
Message-ID: 12880.1272312404@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Mon, 2010-04-26 at 15:40 -0400, Tom Lane wrote:
>> The current definition of Hot Standby is that it's a *read only*
>> behavior. Not read mostly. What you are proposing is a rather
>> fundamental change in the behavior of HS, and it doesn't seem to me
>> that it should be on the head of anybody else to make it work.

> That's a dangerous precedent you just set.

[ shrug... ] If you have near-term solutions for all the *other*
problems that would be involved (like what XID to put into rows you
insert in the temp tables) then I might think that what you're asking
Robert to do is reasonable. Personally I think non-read-only HS is
entirely pie in the sky, and therefore it's not reasonable to saddle
unrelated development tasks with an expectation that they should work
with a behavior that probably won't ever happen.

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Sabino Mullane <greg(at)turnstep(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: global temporary tables
Date: 2010-04-26 20:18:47
Message-ID: 20100426201847.GF3963@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane escribió:
> [ forgot to respond to this part ]
>
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > ... I don't see the problem with DROP.
> > Under the proposed design, it's approximately equivalent to dropping a
> > table that someone else has truncated. You just wait for the
> > necessary lock and then do it.
>
> And do *what*? You can remove the catalog entries, but how are you
> going to make the physical storage of other backends' versions go away?
> (To say nothing of making them flush their local buffers for it.)

Maybe we could add a sinval message to that effect.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support