Re: idea: global temp tables

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: idea: global temp tables
Date: 2009-04-27 20:44:18
Message-ID: 162867790904271344s1ec96d90j6cde295fdcc7806f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I am thinking about global temp tables. One possible solution is
creating global temporary table like normal table and in planner stage
check using this table. When some global temporary table is detected,
then real temporary table is created and used in execution plan. It's
like:

CREATE GLOBAL TEMP TABLE foo(a varchar); -- create global empty table foo
SELECT * FROM foo;
a) is relevant temp table for foo, use it
a) when not, then CREATE TEMP TABLE pg_temp_1.foo(LIKE foo INCLUDING
DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES);
b) transform origin query to SELECT * FROM pg_temp_1.foo;

Ideas? Notes? Objections?

regards
Pavel Stehule


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-27 21:06:36
Message-ID: 49F5D80C.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:

> I am thinking about global temp tables.

These would have some value to us.

In case anyone doesn't know, this is a feature in the SQL standard.
You have a permanent definition of the schema, but the table is
materialized as a temporary table on reference by any connection.

I can't speak to the practicality of the proposed implementation
techniques.

-Kevin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-27 21:22:14
Message-ID: 49F62206.6030705@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin Grittner wrote:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
>
>> I am thinking about global temp tables.
>>
>
> These would have some value to us.
>
> In case anyone doesn't know, this is a feature in the SQL standard.
> You have a permanent definition of the schema, but the table is
> materialized as a temporary table on reference by any connection.
>
> I can't speak to the practicality of the proposed implementation
> techniques.
>
>
>

Using a global table to achieve schema-persistent temp tables seems like
a horrid hack - what would you do if the table used a type other than a
standard built-in type?

Or perhaps Pavel doesn't really mean "global" as the term is used in
Postgres (c.f. the pg_database table)?

cheers

andrew


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-27 21:27:36
Message-ID: 49F5DCF8.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> Or perhaps Pavel doesn't really mean "global" as the term is used
> in Postgres (c.f. the pg_database table)?

I'd bet that he doesn't. He's taking terminology from the standard,
where it means "not limited to one SQL-client module". It just means
it is available as long as you are using the connection.

-Kevin


From: "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-27 21:32:50
Message-ID: 1B144E02-2143-4AAA-BA3C-956D9A854BCC@themactionfaction.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- -----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Apr 27, 2009, at 4:44 PM, Pavel Stehule wrote:

> Hello
>
> I am thinking about global temp tables. One possible solution is
> creating global temporary table like normal table and in planner stage
> check using this table. When some global temporary table is detected,
> then real temporary table is created and used in execution plan. It's
> like:
>
> CREATE GLOBAL TEMP TABLE foo(a varchar); -- create global empty
> table foo
> SELECT * FROM foo;
> a) is relevant temp table for foo, use it
> a) when not, then CREATE TEMP TABLE pg_temp_1.foo(LIKE foo INCLUDING
> DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES);
> b) transform origin query to SELECT * FROM pg_temp_1.foo;
>
> Ideas? Notes? Objections?

When will postgresql offer "global" temporary tables with data which
are shared among sessions? Such tables are great for transient data
such as web session data where writing to the WAL is a waste. (On DB
startup, the tables would simply be empty.) We're currently stuck with
the memcached plugin which makes it impossible to use database
constructs such as foreign keys against the temporary data.

Cheers,
M
- -----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (Darwin)

iEYEARECAAYFAkn2JHQACgkQqVAj6JpR7t4YRgCdGj8JPJY61PPaK79jnPFXu8c7
vjIAn2F1lA0Nr/2EHVPcYQohWqGjWElK
=3zYu
- -----END PGP SIGNATURE-----
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (Darwin)

iEYEARECAAYFAkn2JIIACgkQqVAj6JpR7t6IOgCdE0le+MAlcwCYNqEt+w9jt/Y3
Z/sAni8Jm3ndYZSI1pIQLBVtKnBnJ8Ee
=VXWF
-----END PGP SIGNATURE-----


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: A(dot)M(dot) <agentm(at)themactionfaction(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-27 21:39:22
Message-ID: B2CF45DE-97D6-4ED2-A240-8492932E855B@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Le 27 avr. 09 à 23:32, A.M. a écrit :
> When will postgresql offer "global" temporary tables with data which
> are shared among sessions? Such tables are great for transient data
> such as web session data where writing to the WAL is a waste. (On DB
> startup, the tables would simply be empty.) We're currently stuck
> with the memcached plugin which makes it impossible to use database
> constructs such as foreign keys against the temporary data.

If using 8.3 you can SET LOCAL synchronous_commit TO off; for web
session management transactions, it'll skip the WAL fsync'ing, which
is already a good start.

HTH,
--
dim


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
Subject: Re: idea: global temp tables
Date: 2009-04-27 21:46:13
Message-ID: 49F5E155.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"A.M." <agentm(at)themactionfaction(dot)com> wrote:

> When will postgresql offer "global" temporary tables with data
> which are shared among sessions?

Well, that would certainly be far different from what the standard
calls a temporary table of any flavor. In the standard all temporary
tables are restricted to a single connection, and the scope is:

GLOBAL: Schema always present. Once materialized, present for as
long as the connection exists.

CREATED LOCAL: Schema always present. Once materialized, visible
only within a particular module.

DECLARED LOCAL: No permanent schema. Materialized when declared in a
compound statement (standard BEGIN/END; not related to transaction
boundaries), and automatically dropped on exit from the compound
statement.

Current PostgreSQL temporary tables are sort of a hybrid between
GLOBAL and DECLARED LOCAL temporary tables from the standard.

-Kevin


From: "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-27 21:51:16
Message-ID: 2E8BFBEB-9B84-4178-9D67-64A7B11CBE26@themactionfaction.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Apr 27, 2009, at 5:39 PM, Dimitri Fontaine wrote:

> Hi,
>
> Le 27 avr. 09 à 23:32, A.M. a écrit :
>> When will postgresql offer "global" temporary tables with data
>> which are shared among sessions? Such tables are great for
>> transient data such as web session data where writing to the WAL is
>> a waste. (On DB startup, the tables would simply be empty.) We're
>> currently stuck with the memcached plugin which makes it impossible
>> to use database constructs such as foreign keys against the
>> temporary data.
>
>
> If using 8.3 you can SET LOCAL synchronous_commit TO off; for web
> session management transactions, it'll skip the WAL fsync'ing, which
> is already a good start.

That's pretty close, but it's not table specific and wouldn't let us
to reliably mix transient data changes with real data changes.

Cheers,
M
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.10 (Darwin)

iEYEARECAAYFAkn2KNcACgkQqVAj6JpR7t4OrQCgpU9K3FzG2LWWyM245vUaop1G
ZMIAn379RDewxKUmCsZsWLo8KdWAYGIs
=kHl5
-----END PGP SIGNATURE-----


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
Subject: Re: idea: global temp tables
Date: 2009-04-27 22:01:40
Message-ID: 49F5E4F4.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"A.M." <agentm(at)themactionfaction(dot)com> wrote:
> On Apr 27, 2009, at 5:39 PM, Dimitri Fontaine wrote:
>> Le 27 avr. 09 à 23:32, A.M. a écrit :
>>> When will postgresql offer "global" temporary tables with data
>>> which are shared among sessions? Such tables are great for
>>> transient data such as web session data where writing to the WAL is

>>> a waste. (On DB startup, the tables would simply be empty.) We're

>>> currently stuck with the memcached plugin which makes it impossible

>>> to use database constructs such as foreign keys against the
>>> temporary data.
>>
>>
>> If using 8.3 you can SET LOCAL synchronous_commit TO off; for web
>> session management transactions, it'll skip the WAL fsync'ing, which

>> is already a good start.
>
> That's pretty close, but it's not table specific and wouldn't let us

> to reliably mix transient data changes with real data changes.

Yeah, we have a dozen or so tables we use with the pattern you
describe; so the feature you describe would also have some value for
us. To avoid confusion, we don't refer to these as "temporary
tables", but rather as "permanent work tables". Again, I can't
comment on practical issues regarding implementation; but it would be
a "nice feature" to add some day. The tricky bit would be to figure
out how to ensure that it got cleaned up properly, especially if the
PostgreSQL went down or client processes wend down before tidying up.

-Kevin


From: "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 01:39:48
Message-ID: 202117E0-6D79-4329-9B69-B47A0546EC84@themactionfaction.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Apr 27, 2009, at 6:01 PM, Kevin Grittner wrote:

> "A.M." <agentm(at)themactionfaction(dot)com> wrote:
>> On Apr 27, 2009, at 5:39 PM, Dimitri Fontaine wrote:
>>> Le 27 avr. 09 à 23:32, A.M. a écrit :
>>>> When will postgresql offer "global" temporary tables with data
>>>> which are shared among sessions? Such tables are great for
>>>> transient data such as web session data where writing to the WAL is
>
>>>> a waste. (On DB startup, the tables would simply be empty.) We're
>
>>>> currently stuck with the memcached plugin which makes it impossible
>
>>>> to use database constructs such as foreign keys against the
>>>> temporary data.
>>>
>>>
>>> If using 8.3 you can SET LOCAL synchronous_commit TO off; for web
>>> session management transactions, it'll skip the WAL fsync'ing, which
>
>>> is already a good start.
>>
>> That's pretty close, but it's not table specific and wouldn't let us
>
>> to reliably mix transient data changes with real data changes.
>
> Yeah, we have a dozen or so tables we use with the pattern you
> describe; so the feature you describe would also have some value for
> us. To avoid confusion, we don't refer to these as "temporary
> tables", but rather as "permanent work tables". Again, I can't
> comment on practical issues regarding implementation; but it would be
> a "nice feature" to add some day. The tricky bit would be to figure
> out how to ensure that it got cleaned up properly, especially if the
> PostgreSQL went down or client processes wend down before tidying up.

Actually, for our usage, that's the easiest part- truncate all the
"permanent work tables" whenever the db starts. That's really the
only sane thing to do anyway. That's what I mean by "transient" data-
if it's there, that's great, if not, I can re-generate it (cache) or
I don't care because, if the database goes down, then the data is
useless on restart anyway.

Cheers,
M


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 03:44:01
Message-ID: 162867790904272044i9396e92m3a0dad047ebaec36@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/4/27 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
>
> Kevin Grittner wrote:
>>
>> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>
>>> I am thinking about global temp tables.
>>>
>>
>>  These would have some value to us.
>>  In case anyone doesn't know, this is a feature in the SQL standard. You
>> have a permanent definition of the schema, but the table is
>> materialized as a temporary table on reference by any connection.
>>  I can't speak to the practicality of the proposed implementation
>> techniques.
>>
>>
>
> Using a global table to achieve schema-persistent temp tables seems like a
> horrid hack - what would you do if the table used a type other than a
> standard built-in type?

Where is a problem? - there is normal dependency between types and
relation. Of course, ALTER TABLE have to be little bit different - a)
should be done, when no table is used, b) should be done only on all
temporary tables. But this technique do minimal changes in pg internal
structure. The core of problem is structure of pg_class table, that
contains possibly shared and not shared fields between global tables.
So implementation needs a) significant change of pg_class table OR b)
using some transparent table overloading

One year ago I though about some memory tables for it. But it is too
different and now, when VACUUM should be effective I thing, it is
needless.

>
> Or perhaps Pavel doesn't really mean "global" as the term is used in
> Postgres (c.f. the pg_database table)?
>

no, I though global tables in sense of SQL standard. What do you thing
are shared tables (in pg terminology)

regards
Pavel Stehule

> cheers
>
> andrew
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, "A(dot)M(dot)" <agentm(at)themactionfaction(dot)com>
Subject: Re: idea: global temp tables
Date: 2009-04-28 03:47:25
Message-ID: 162867790904272047s331a25a6x40f787941577da49@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/4/28 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>:
> "A.M." <agentm(at)themactionfaction(dot)com> wrote:
>> On Apr 27, 2009, at 5:39 PM, Dimitri Fontaine wrote:
>>> Le 27 avr. 09 à 23:32, A.M. a écrit :
>>>> When will postgresql offer "global" temporary tables with data
>>>> which are shared among sessions? Such tables are great for
>>>> transient data such as web session data where writing to the WAL is
>
>>>> a waste. (On DB startup, the tables would simply be empty.) We're
>
>>>> currently stuck with the memcached plugin which makes it impossible
>
>>>> to use database constructs such as foreign keys against the
>>>> temporary data.
>>>
>>>
>>> If using 8.3 you can SET LOCAL synchronous_commit TO off; for web
>>> session management transactions, it'll skip the WAL fsync'ing, which
>
>>> is already a good start.
>>
>> That's pretty close, but it's not table specific and wouldn't let us
>
>> to reliably mix transient data changes with real data changes.
>
> Yeah, we have a dozen or so tables we use with the pattern you
> describe; so the feature you describe would also have some value for
> us.  To avoid confusion, we don't refer to these as "temporary
> tables", but rather as "permanent work tables".  Again, I can't
> comment on practical issues regarding implementation; but it would be
> a "nice feature" to add some day.  The tricky bit would be to figure
> out how to ensure that it got cleaned up properly, especially if the
> PostgreSQL went down or client processes wend down before tidying up.

For me, GLOBAL TEMP TABLES should significant to increase comfort for
developers. That is main reason.

reagards
Pavel Stehule

>
> -Kevin
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 14:59:26
Message-ID: 20090428145926.GK10358@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule escribió:
> Hello
>
> I am thinking about global temp tables. One possible solution is
> creating global temporary table like normal table and in planner stage
> check using this table. When some global temporary table is detected,
> then real temporary table is created and used in execution plan. It's
> like:
>
> CREATE GLOBAL TEMP TABLE foo(a varchar); -- create global empty table foo
> SELECT * FROM foo;
> a) is relevant temp table for foo, use it
> a) when not, then CREATE TEMP TABLE pg_temp_1.foo(LIKE foo INCLUDING
> DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES);
> b) transform origin query to SELECT * FROM pg_temp_1.foo;
>
> Ideas? Notes? Objections?

Maybe we could make this work by fiddling with a different smgr -- on
it, smgr_sync would be a noop, as would smgr_immedsync, and we could
kludge something up to truncate relations during recovery.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 15:18:08
Message-ID: 3617.1240931888@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Maybe we could make this work by fiddling with a different smgr -- on
> it, smgr_sync would be a noop, as would smgr_immedsync, and we could
> kludge something up to truncate relations during recovery.

Interesting thought but I think it falls down on pg_statistic.

One comment I've got is that we have already concluded that the spec's
GLOBAL/LOCAL TEMP TABLE distinction is not related to cross-session
persistence of the table definitions, but rather to module visibility
which is a concept we have not got (yet). Ergo, we should not use the
phrase "global temp table" for these things. Not sure what to suggest
instead. Perhaps call them "session tables" instead of "temp tables"?

regards, tom lane


From: "vacuum" <vacuum(at)quantentunnel(dot)de>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'Alvaro Herrera'" <alvherre(at)commandprompt(dot)com>
Cc: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>, "'PostgreSQL Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 15:31:19
Message-ID: 6C5D89CA3E334E93B83579BBE3884195@zero
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Perhaps call them "session tables" instead of "temp tables"?

regards, tom lane

Or "transient table" ...

Maybe we can define when such table lose data

But in real - there is no need in this feature - databases are made to hold
data, not to lose.

If an application requires mechanism to store transient session-data, it
should create its own session-objects.


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 15:46:30
Message-ID: 49F6DE86.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> we have already concluded that the spec's
> GLOBAL/LOCAL TEMP TABLE distinction is not related
> to cross-session persistence of the table definitions

How do you reconcile that conclusion with the following,
from ISO/IEC 9075-2:2003 (E), 4.14 Tables:

"The definition of a global temporary table or a created local
temporary table appears in a schema. In SQL language, the name and the
scope of the name of a global temporary table or a created local
temporary table are indistinguishable from those of a persistent base
table. However, because global temporary table contents are distinct
within SQL-sessions, and created local temporary tables are distinct
within SQL-client modules within SQL-sessions, the effective <schema
name> of the schema in which the global temporary table or the created
local temporary table is instantiated is an implementation-dependent
<schema name> that may be thought of as having been effectively
derived from the <schema name> of the schema in which the global
temporary table or created local temporary table is defined and the
implementation-dependent SQL- session identifier associated with the
SQL-session."

There is a distinction between the definition, which "appears in a
schema" and for which "the name and the scope ... are
indistinguishable from those of a persistent base table", versus the
effective schema in which an instance is materialized, which is
session and/or module dependent.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 15:58:31
Message-ID: 4562.1240934311@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> we have already concluded that the spec's
>> GLOBAL/LOCAL TEMP TABLE distinction is not related
>> to cross-session persistence of the table definitions

> How do you reconcile that conclusion with the following,
> from ISO/IEC 9075-2:2003 (E), 4.14 Tables:

The point is that what we call "temp tables" are not either global or
local temp tables by the spec's definition. If we invent something that
behaves as Pavel suggests, then it could be considered either a global
or a local temp table per spec (without any module support you can't
really say which it is). We're stuck in a terminological problem
anyway, but it will get a whole lot worse if we fail to acknowledge that
there's more than one property involved here.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 15:59:37
Message-ID: 49F6E199.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> GLOBAL/LOCAL TEMP TABLE distinction is not related to cross-session
> persistence of the table definitions

On a re-read, I think I see your point -- it is the DECLARE LOCAL TEMP
TABLE versus CREATE { GLOBAL | LOCAL } TEMP TABLE which determines
whether the table definition is persisted. Both forms of CREATE TEMP
TABLE should persist the definition if you go by the standard, so you
don't want to muddy the waters by complying on one and not the other?

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 16:05:43
Message-ID: 49F6E307.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> it could be considered either a global or a local temp table per
> spec (without any module support you can't really say which it is).

That seems bogus -- without modules it is clearly not LOCAL. What
Pavel is requesting exactly matches the spec's definition of a global
temporary table, but it does make me uneasy that after accepting the
standard syntax, and behaving differently from it (including making no
distinction between GLOBAL and LOCAL declarations) we would suddenly
go to compliance on GLOBAL declarations but leave LOCAL as is.

Maybe too messy to try to improve.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 16:12:54
Message-ID: 4889.1240935174@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> ... Both forms of CREATE TEMP
> TABLE should persist the definition if you go by the standard, so you
> don't want to muddy the waters by complying on one and not the other?

Right. This goes back to our old principle of trying not to use
spec-defined syntax for not-per-spec behavior. We are already behind
the eight ball as far as temp tables go, but let's not make it worse by
blindly picking some spec-defined syntax without a plan for where we go
from here. (I'm assuming that it's reasonably likely that we will want
a spec-compatible module feature someday. We'll really have painted
ourselves into a corner if we don't think about the issue now.)

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 16:38:25
Message-ID: 5422.1240936705@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> it could be considered either a global or a local temp table per
>> spec (without any module support you can't really say which it is).

> That seems bogus -- without modules it is clearly not LOCAL.

You could just as easily say it's not GLOBAL.

> What
> Pavel is requesting exactly matches the spec's definition of a global
> temporary table, but it does make me uneasy that after accepting the
> standard syntax, and behaving differently from it (including making no
> distinction between GLOBAL and LOCAL declarations) we would suddenly
> go to compliance on GLOBAL declarations but leave LOCAL as is.

Right. What I'm suggesting is that before we mess with this we should
have a road map on whether we are going to try to get to spec
compliance in this area, and if so how.

One thing I just noticed is that the spec does not consider GLOBAL/LOCAL
to be optional --- per spec you *must* write one or the other in front
of TEMPORARY. So we could adopt the view that omitting this keyword
implies our current non-spec behavior (which is far too useful to give
up, spec compliance or no) while writing one or the other selects the
spec behavior. However, if we're going to do that then we should start
throwing warnings for use of the keywords, preferably before the release
in which they actually start doing something different.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 16:51:47
Message-ID: 49F6EDD3.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> One thing I just noticed is that the spec does not consider
> GLOBAL/LOCAL to be optional --- per spec you *must* write one or the
> other in front of TEMPORARY.

Agreed.

> So we could adopt the view that omitting this keyword
> implies our current non-spec behavior (which is far too useful to
> give up, spec compliance or no) while writing one or the other
> selects the spec behavior.

+1 (+1)

> However, if we're going to do that then we should start
> throwing warnings for use of the keywords, preferably before the
> release in which they actually start doing something different.

We might actually want to have a major release which rejects the
standard syntax before the release where we implement standard
behavior for it. (After, of course, a major release which issues the
warning.) When we get to the point of breaking existing code (which
is inevitable if we move to compliance here), it's better to break in
a clear and predictable way....

Of course, that would mean that implementation would be three releases
away (warn, disable syntax, reenable syntax with standard semantics).

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 17:00:55
Message-ID: 5943.1240938055@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> However, if we're going to do that then we should start
>> throwing warnings for use of the keywords, preferably before the
>> release in which they actually start doing something different.

> We might actually want to have a major release which rejects the
> standard syntax before the release where we implement standard
> behavior for it. (After, of course, a major release which issues the
> warning.) When we get to the point of breaking existing code (which
> is inevitable if we move to compliance here), it's better to break in
> a clear and predictable way....

I was thinking in terms of throwing a warning in 8.4 and implementing
new behavior in 8.5. An extra release only helps if you assume everyone
adopts that release at some point. The number of questions we see about
multi-version jumps should disabuse people of the notion that everyone
does it that way ...

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-28 17:18:38
Message-ID: 49F6F41E.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I was thinking in terms of throwing a warning in 8.4 and
> implementing new behavior in 8.5.

If we're on that time frame with it, I guess it's not too early to
suggest what we would implement in 8.5.

I would suggest that it is clear that what Pavel is requesting is
fully compliant with the spec's definition of global temporary tables.
Since the essence of "global" in this context is that they are not
constrained by modules, the absence module support doesn't seem to
create any possible compatibility issue when and if modules are added.

I would suggest (with somewhat less confidence) that both created and
declared local temporary tables might make sense in the context of
whatever procedural languages are supported. In PL/pgSQL, for
example, the declaration for a declared local tempoary table would be
allowed inside the PL's BEGIN block, in the area where local variables
are allowed. Such a table would be visible only within the context of
the block (meaning we would probably need to munge the name somehow to
support recursion or other functions with a duplicate table name).
The temporary table would be materialized at the point where it is
declared, and dropped at the END of the block.

I not clear on whether a created local temporary table should retain
its contents from one invocation of a function to the next. I'm
inclined to think it shouldn't -- that the scope for a materialized
instance is the same as a declared local tempoarary table; the CREATE
just ensures a consistent definition wherever used.

Or perhaps it's just a bad idea to attempt to use the LOCAL syntax
outside of a proper module at all. The GLOBAL option seems clear;
LOCAL seems a bit muddy to me.

-Kevin


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 09:49:31
Message-ID: 49F822AB.6090908@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dne 28.04.09 16:59, Alvaro Herrera napsal(a):
> Pavel Stehule escribió:

>
> Maybe we could make this work by fiddling with a different smgr -- on
> it, smgr_sync would be a noop, as would smgr_immedsync, and we could
> kludge something up to truncate relations during recovery.

Maybe set path like <db oid>/pg_temp/<session id>/<table oid> do the
work for this kind of table.

Zdenek


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: idea: global temp tables
Date: 2009-04-29 10:33:33
Message-ID: 200904291333.34205.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 28 April 2009 19:38:25 Tom Lane wrote:
> One thing I just noticed is that the spec does not consider GLOBAL/LOCAL
> to be optional --- per spec you *must* write one or the other in front
> of TEMPORARY. So we could adopt the view that omitting this keyword
> implies our current non-spec behavior (which is far too useful to give
> up, spec compliance or no) while writing one or the other selects the
> spec behavior. However, if we're going to do that then we should start
> throwing warnings for use of the keywords, preferably before the release
> in which they actually start doing something different.

There are actually two orthogonal properties at work here: How the table is
visible with respect to modules (LOCAL/GLOBAL) and whether the table
disappears at the end of the session (currently yes, proposed new behavior
optionally no). We should have two separate settings for these.

On the matter of LOCAL/GLOBAL, I think the correct thing to do is to reject
LOCAL and accept GLOBAL as equivalent to the default.

And then invent a separate setting, say EPHEMERAL/PERSISTENT that controls the
behavior that Pavel requested.


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: idea: global temp tables
Date: 2009-04-29 14:13:25
Message-ID: 49F81A35.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> There are actually two orthogonal properties at work here: How the
> table is visible with respect to modules (LOCAL/GLOBAL) and whether
> the table disappears at the end of the session (currently yes,
> proposed new behavior optionally no).

Pavel's request and the standard always have the temporary table
disappearing at the end of the session; the second question for these
is whether the *definition* of the table disappears at the end of the
session, or remains in the schema to generate an instance for another
session on demand.

There was a digression about a possible feature other than what Pavel
requested, which is not defined in the standard, which I referred to
as "permanent work tables", which would be distinguished from normal
tables only by the fact that no effort would be made to log the data
for recovery, and perhaps the table should be truncated on PostgreSQL
startup. If we want to pursue that discussion, it should probably be
on a different thread.

> On the matter of LOCAL/GLOBAL, I think the correct thing to do is to
> reject LOCAL and accept GLOBAL as equivalent to the default.

+1

> And then invent a separate setting, say EPHEMERAL/PERSISTENT that
> controls the behavior that Pavel requested.

While I probably wouldn't have chosen the syntax that the standard
did, the SQL spec does specify a syntax to do what Pavel requested.
I don't think it's so bad that we should ignore it and invent our own
alternative.

-Kevin


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: idea: global temp tables
Date: 2009-04-29 14:21:54
Message-ID: 49F81C32.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> On the matter of LOCAL/GLOBAL, I think the correct thing to do is to
> reject LOCAL and accept GLOBAL as equivalent to the default.

Oops. Read that too quickly before my first reply. I think that we
should, as Tom said, warn on *both* for 8.4, and treat GLOBAL per the
standard in 8.5. I don't know whether we can do anything useful with
LOCAL for 8.5, or whether it should be rejected at that point.

Sorry for my confusion about what you were saying.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 15:12:03
Message-ID: 4136ffa0904290812rd018bffl16a8336a88c14b1e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Apr 28, 2009 at 6:18 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> I was thinking in terms of throwing a warning in 8.4 and
>> implementing new behavior in 8.5.
>
> If we're on that time frame with it, I guess it's not too early to
> suggest what we would implement in 8.5.
>
> I would suggest that it is clear that what Pavel is requesting is
> fully compliant with the spec's definition of global temporary tables.

I think there are implicit assumptions that the spec is making about
the performance implications of using these temporary tables. It's
offering a tool that can reasonably be used in place of views and CTEs
in otherwise pure DML.

I didn't follow precisely what Pavel was describing but IMHO anything
which does any DDL, even implicitly, would be make the feature
impractical in many cases where it really ought to work. Anything
which causes pg_class to bloat or require special vacuum strategies is
just not going to scale.

The whole point of having the schema declared in advance and then
having each procedure execution have access to a private (or
non-private) data store following that predefined schema is to avoid
having to execute any catalog changes with all the locking and catalog
i/o that DDL requires.

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: idea: global temp tables
Date: 2009-04-29 15:24:42
Message-ID: 49F82AEA.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:

> The whole point of having the schema declared in advance and then
> having each procedure execution have access to a private (or
> non-private) data store following that predefined schema is to avoid
> having to execute any catalog changes with all the locking and
> catalog i/o that DDL requires.

Global temporary tables are not shared between connections. Look back
to Pavel's original post; he's looking for a convenience -- a way to
have a temporary table materialized for a connection on reference,
"INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES". It is
not meant to be better in performance than creating a temporary table
explicitly on the fly and adding all these things, it's meant to
provide a consistent definition which can be materialized and used on
demand. Nothing more; nothing less. I understand not everyone has a
need for such a thing, but isn't that true of most features in
PostgreSQL?

If you're saying we can implement the standard's global temporary
tables in a way that performs better than current temporary tables,
that's cool. That would be a nice "bonus" in addition to the
application programmer convenience and having another tick-mark on the
standards compliance charts. Do you think that's feasible? If not,
the feature would be useful to some with the same performance that
temporary tables currently provide.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: idea: global temp tables
Date: 2009-04-29 17:28:08
Message-ID: 4136ffa0904291028q8dd226bu6b47ed2dc6f4b3b5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Apr 29, 2009 at 4:24 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> I understand not everyone has a
> need for such a thing, but isn't that true of most features in
> PostgreSQL?

Well I think implementing a feature which only works if it's used at
low transaction rates as a convenience wrapper for a single command
seems like a lot of work for little gain.

> If you're saying we can implement the standard's global temporary
> tables in a way that performs better than current temporary tables,
> that's cool.  That would be a nice "bonus" in addition to the
> application programmer convenience and having another tick-mark on the
> standards compliance charts.

Well I claim it's not just a nice bonus but is the difference between
implementing something which falls technically within the standard's
rules but fails to actually be useful for the standard's intended
purpose. I claim there's an implied expectation that by predefining
these schema definitions you eliminate the overhead of DDL creating
and dropping tables on the fly. That you can basically explicitly code
up algorithms which might be too complex or detailed for an SQL query
which get executed as high transaction rate DML using temporary
storage just as our SQL engine uses it in materialize nodes and sort
nodes.

Just to give a real-world example, think of web pages that do paging
of moderately complex query results. You often want to calculate the
total number of matches and then also return a subset of those
matches. Currently the only practical way to do it is to execute the
query twice.Creating a temporary table for this purpose would
transform your simple read-only DML into a complex DDL operation you
can't expose to the masses without a lot of precautions and extra
consequences.

> Do you think that's feasible?  If not,
> the feature would be useful to some with the same performance that
> temporary tables currently provide.

I've been thinking about Alvaro's idea of a separate smgr. If you had
a single pg_class entry for all sessions but the smgr knew to store
the actual data for it in a session-local file, either in a
session-specific tablespace or using the same mechanism the temporary
files use to direct data then the backend would basically never know
it wasn't a regular table.

It could still use local buffers but it could use the global relcache,
invalidation, locks, etc. I think we would hav eto take a
session-level access lock as soon as we put any data in our local
store. And each DDL operation would have to be visited to see whether
it needs special behaviour for locally stored tables. I suspect most
of them will only be able to be handled if there are no active
sessions using the table so they'll basically be no-ops except for the
catalog changes.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 17:42:49
Message-ID: 10859.1241026969@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> writes:
> Well I claim it's not just a nice bonus but is the difference between
> implementing something which falls technically within the standard's
> rules but fails to actually be useful for the standard's intended
> purpose.

I agree with Kevin's objection that you are unfairly raising the bar
for this feature by demanding a performance improvement to go along
with a functionality change. The use-case for this feature is to
simplify application logic by allowing apps to assume that a temp
table exists without having to create it at the start of a session.
That's particularly handy in connection-pooling scenarios, for instance.
Currently, you have to have some sort of "if exists" check, and you
pay just as much in catalog thrashing as you would if the feature
was present without any catalog optimization.

It would be great to find a way to avoid the catalog thrashing,
but I completely disagree with a point of view that says we can't
have this without solving that first. It's an improvement on the
current state of affairs anyway.

> I've been thinking about Alvaro's idea of a separate smgr. If you had
> a single pg_class entry for all sessions but the smgr knew to store
> the actual data for it in a session-local file, either in a
> session-specific tablespace or using the same mechanism the temporary
> files use to direct data then the backend would basically never know
> it wasn't a regular table.

1. pg_statistic.

2. How you going to have transaction-safe behavior for things like
TRUNCATE, if you don't have an updatable private catalog entry to keep
the current relfilenode in?

> It could still use local buffers but it could use the global relcache,
> invalidation, locks, etc.

Locks would be another big problem: if only smgr knows that different
instances of the table are different, then different backends' locks
would conflict, which would be Bad. This might not matter for simple
read/update, but again TRUNCATE is a counterexample of something that
is likely to be needed and should not cause cross-backend conflicts.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: idea: global temp tables
Date: 2009-04-29 17:44:31
Message-ID: 49F84BAF.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:

> I've been thinking about Alvaro's idea of a separate smgr. If you
> had a single pg_class entry for all sessions but the smgr knew to
> store the actual data for it in a session-local file, either in a
> session-specific tablespace or using the same mechanism the
> temporary files use to direct data then the backend would basically
> never know it wasn't a regular table.
>
> It could still use local buffers but it could use the global
> relcache, invalidation, locks, etc. I think we would have to take a
> session-level access lock as soon as we put any data in our local
> store. And each DDL operation would have to be visited to see
> whether it needs special behaviour for locally stored tables. I
> suspect most of them will only be able to be handled if there are no
> active sessions using the table so they'll basically be no-ops
> except for the catalog changes.

Any chance that some of these improvements could be applied to temp
tables created with the PostgreSQL-specific syntax while we're at it?
The need for several tables to be created on disk to materialize a
single temp table currently causes performance problems in some
contexts. I don't think the updates to the system tables have the
same magnitude of performance hit as creating these tables, especially
if write barriers are on.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Greg Stark" <stark(at)enterprisedb(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 17:52:10
Message-ID: 11074.1241027530@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Any chance that some of these improvements could be applied to temp
> tables created with the PostgreSQL-specific syntax while we're at it?

You mean the Postgres-specific behavior, no?

Trying to support a table without *any* pre-existing catalog entries
seems even harder than doing it without changing the pre-existing
catalog entries. It's something we've thought about before and failed
to find a decent solution for. I think there was some discussion of
allowing pg_class and other relevant tables to implicitly have
session-local child tables that would hold session-local catalog entries
(and be stored in the local bufmgr); but AFAIR we never worked all the
bugs out of the concept.

> The need for several tables to be created on disk to materialize a
> single temp table currently causes performance problems in some
> contexts. I don't think the updates to the system tables have the
> same magnitude of performance hit as creating these tables, especially
> if write barriers are on.

True; it's pure supposition that avoiding the catalog thrashing is
actually important in context of everything else that has to happen.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Greg Stark" <stark(at)enterprisedb(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 18:04:24
Message-ID: 49F85057.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> Any chance that some of these improvements could be applied to temp
>> tables created with the PostgreSQL-specific syntax while we're at
>> it?
>
> You mean the Postgres-specific behavior, no?
>
> Trying to support a table without *any* pre-existing catalog entries
> seems even harder than doing it without changing the pre-existing
> catalog entries.

I can't say that I followed all of Greg's ideas, but it seemed that
some of them related to cheaper ways to materialize the body of the
temp table, as opposed to updating the system tables. That seemed
like it might be orthogonal to the issue of persistent temp table
definitions, and perhaps the ideas could help performance of all temp
tables, including the PostgreSQL-specific variety. Being out of my
depth on the technical issues he was discussing, I can't really do
more than pose the question, however....

-Kevin


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)enterprisedb(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 18:30:43
Message-ID: 162867790904291130jbe00fb9td5ffe4be201419fa@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/4/29 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Greg Stark <stark(at)enterprisedb(dot)com> writes:
>> Well I claim it's not just a nice bonus but is the difference between
>> implementing something which falls technically within the standard's
>> rules but fails to actually be useful for the standard's intended
>> purpose.
>
> I agree with Kevin's objection that you are unfairly raising the bar
> for this feature by demanding a performance improvement to go along
> with a functionality change.  The use-case for this feature is to
> simplify application logic by allowing apps to assume that a temp
> table exists without having to create it at the start of a session.
> That's particularly handy in connection-pooling scenarios, for instance.
> Currently, you have to have some sort of "if exists" check, and you
> pay just as much in catalog thrashing as you would if the feature
> was present without any catalog optimization.
>

exactly

> It would be great to find a way to avoid the catalog thrashing,
> but I completely disagree with a point of view that says we can't
> have this without solving that first.  It's an improvement on the
> current state of affairs anyway.
>
>> I've been thinking about Alvaro's idea of a separate smgr. If you had
>> a single pg_class entry for all sessions but the smgr knew to store
>> the actual data for it in a session-local file, either in a
>> session-specific tablespace or using the same mechanism the temporary
>> files use to direct data then the backend would basically never know
>> it wasn't a regular table.
>
> 1. pg_statistic.
>
> 2. How you going to have transaction-safe behavior for things like
> TRUNCATE, if you don't have an updatable private catalog entry to keep
> the current relfilenode in?
>
>> It could still use local buffers but it could use the global relcache,
>> invalidation, locks, etc.
>
> Locks would be another big problem: if only smgr knows that different
> instances of the table are different, then different backends' locks
> would conflict, which would be Bad.  This might not matter for simple
> read/update, but again TRUNCATE is a counterexample of something that
> is likely to be needed and should not cause cross-backend conflicts.

I though about some techniques for elimination changes in pg_class and
pg_statistic. Teoreticly, we could to overwrite some columns (or
complete rows) from these tables via stored values in memory. My last
(and not sucessfull) prototype was based on some alchymy over
syscache. It was wrong way.

Maybe we could do some like

int get_relpages(oid)
{
tuple = read_tuple_pg_class(oid);
if is_global(tuple)
{
tuple2 = find_global(oid);
if (tuple2 == NULL)
{
store_global(tuple);
return relpages(tuple);
}
else
return relpages(tuple2);
}
else
return relpages(tuple);
}

But question?

about MVCC?
Is necessary to use MVCC on pg_statistic and some columns from pg_proc?

regards
Pavel Stehule

>
>                        regards, tom lane
>


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 18:42:10
Message-ID: 4136ffa0904291142k265c4511l710e5ba1dd2b7196@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Apr 29, 2009 at 6:42 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Greg Stark <stark(at)enterprisedb(dot)com> writes:
>> Well I claim it's not just a nice bonus but is the difference between
>> implementing something which falls technically within the standard's
>> rules but fails to actually be useful for the standard's intended
>> purpose.
>
> I agree with Kevin's objection that you are unfairly raising the bar
> for this feature by demanding a performance improvement to go along
> with a functionality change.

I think having the feature is making a promise that we can't keep.
Having a feature which meets the letter of the rules but fails to
actually work as users have a right to expect is going to trap people
unaware that they're writing code that works in testing but will never
scale.

> The use-case for this feature is to
> simplify application logic by allowing apps to assume that a temp
> table exists without having to create it at the start of a session.
> That's particularly handy in connection-pooling scenarios, for instance.
> Currently, you have to have some sort of "if exists" check, and you
> pay just as much in catalog thrashing as you would if the feature
> was present without any catalog optimization.

That seems like a trivial little annoyance. Spending effort fixing
that that in a way that will only have to be replaced if we ever want
to support using temporary tables for heavy oltp load is a waste of
effort.

> It would be great to find a way to avoid the catalog thrashing,
> but I completely disagree with a point of view that says we can't
> have this without solving that first.  It's an improvement on the
> current state of affairs anyway.

Not if it promises something we can't deliver. My claim is that the
whole point of having a persistent catalog definition is *precisely*
to avoid the catalog thrashing and that's obvious to users who would
be using this feature. This is just like the idea of prepared queries
-- nowhere in the standard does it say that prepared queries have to
cache the prepared plan and run any faster than non-prepared queries
but if we didn't it would be somewhat broken. Now in that case
planning queries is relativelyl cheap, but how happy do you think
users would be if we supported prepared queries and they built their
application around that fact only to discover that every time any
session executed a PREPARE it created new records in catalog tables?
We would be better off not supporting the command at all so at least
users would know not to head down that dead-end path.

> 1. pg_statistic.
> 2. How you going to have transaction-safe behavior for things like
> TRUNCATE

hm, truncate is a bit of a special case. I had been thinking that ddl
would basically require preventing other backends from using the table
at the same time. A lot of people don't realize truncate behaves like
ddl but in any case I agree it sure seems like it would be important
for temporary tables.

I still maintain that this feature is not primarily about programmer
convenience, but rather comes with an expectation that the schema
definition is being given in advance so that the database doesn't have
to incur the costs of issuing the ddl for every session. If we provide
the interface but not the expected behaviour it could be worse than
not having the interface at all.

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 18:57:39
Message-ID: 49F85CD3.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> On Wed, Apr 29, 2009 at 6:42 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>> I agree with Kevin's objection that you are unfairly raising the
>> bar for this feature by demanding a performance improvement to go
>> along with a functionality change.
>
> I think having the feature is making a promise that we can't keep.
> Having a feature which meets the letter of the rules but fails to
> actually work as users have a right to expect is going to trap
> people unaware that they're writing code that works in testing but
> will never scale.

But the scaling issues are there already with temp tables. This
actually makes it better, not worse, because the table can be
materialized once per session, not once per request.

>> The use-case for this feature is to
>> simplify application logic by allowing apps to assume that a temp
>> table exists without having to create it at the start of a session.
>> That's particularly handy in connection-pooling scenarios, for
>> instance. Currently, you have to have some sort of "if exists"
>> check, and you pay just as much in catalog thrashing as you would
>> if the feature was present without any catalog optimization.
>
> That seems like a trivial little annoyance.

And a potentially large performance booster.

>> It would be great to find a way to avoid the catalog thrashing,
>> but I completely disagree with a point of view that says we can't
>> have this without solving that first. It's an improvement on the
>> current state of affairs anyway.
>
> Not if it promises something we can't deliver. My claim is that the
> whole point of having a persistent catalog definition is *precisely*
> to avoid the catalog thrashing and that's obvious to users who would
> be using this feature.

I've re-read the spec on this several times now, and I can't see where
that is implied.

> I still maintain that this feature is not primarily about programmer
> convenience, but rather comes with an expectation that the schema
> definition is being given in advance so that the database doesn't
> have to incur the costs of issuing the ddl for every session.

I think many would be satisfied not to have those costs on every
*request* on the connection.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 19:48:33
Message-ID: 4136ffa0904291248r3b9c78f2x47e0fd5611768468@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Apr 29, 2009 at 7:57 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> But the scaling issues are there already with temp tables.  This
> actually makes it better, not worse, because the table can be
> materialized once per session, not once per request.

Currently you have to issue CREATE TABLE and associated DDL manually.
That makes it clear to the user that they're executing DDL and should
expect it to behave like DDL.

I don't understand what you mean by a cost once per request. You only
have to create the temporary table on the first request. If you can't
tell which is the first request you only have to test whether it
exists which doesn't incur the consequences that ddl incurs.

What we're talking about means that when someone issues "SELECT * FROM
cache" they're going to -- unexpected -- be a) turning their virtual
transaction id into a real transaction id b) creating a new entry in
pg_catalog and its indexes c) wal logging the new pg_catalog entry
(including having to fsync at commit time) d) acquiring an exclusive
lock on the new entry.

There have been posts by people who were bitten by expecting that they
could create temporary work tables for short frequently run queries
who didn't realize that would mean pg_class would have to be vacuumed
every few minutes and that it would slow down every index lookup for
table names. I don't see it as friendly to make that the implicit
behaviour for innocent looking dml operations.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 19:59:38
Message-ID: 24110.1241035178@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> writes:
> I don't understand what you mean by a cost once per request. You only
> have to create the temporary table on the first request. If you can't
> tell which is the first request you only have to test whether it
> exists which doesn't incur the consequences that ddl incurs.

This is all based on utterly-unproven assumptions about relative costs.
In particular, ISTM an additional network round trip or two associated
with testing for/creating a temp table could easily swamp any costs
associated with catalog entry creation. Even if it doesn't,
creating/deleting a few dozen rows in the system catalogs shouldn't
really be something that autovacuum can't deal with. If it were,
we'd be hearing a lot more complaints about the *existing* temp table
feature being unusable. (And yes, I know it's come up once or twice,
but not all that often.)

I'm all for eliminating catalog overheads, if we can find a way to do
that. I don't think that you get to veto implementation of the feature
until we can find a way to optimize it better. The question is not
about whether having the optimization would be better than not having it
--- it's about whether having the unoptimized feature is better than
having no feature at all (which means people have to implement the same
behavior by hand, and they'll *still* not get the optimization).

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)enterprisedb(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 20:07:11
Message-ID: 20090429200711.GC6316@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule escribió:

> But question?
>
> about MVCC?
> Is necessary to use MVCC on pg_statistic and some columns from pg_proc?

Right now, we use MVCC everywhere because we have no choice. But there
are some things that would be better off not using MVCC; and indeed we
use a hack to make things that way in certain, very limited cases (see
callers of heap_update_inplace).

Years ago I proposed a patch (search for pg_class_nt and/or pg_ntclass)
that created a catalog for non-MVCC attributes, with an eye on extending
it to more attributes. The use case for those other attributes had
fatal flaws so it went nowhere, but perhaps we can give the idea another
whirl now. For example I think it would be possible to implement
read-only partitions that don't need vacuuming using that idea.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: idea: global temp tables
Date: 2009-04-29 20:08:16
Message-ID: 49F86D60.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> On Wed, Apr 29, 2009 at 7:57 PM, Kevin Grittner
> <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
>> But the scaling issues are there already with temp tables. This
>> actually makes it better, not worse, because the table can be
>> materialized once per session, not once per request.
>
> Currently you have to issue CREATE TABLE and associated DDL
> manually. That makes it clear to the user that they're executing
> DDL and should expect it to behave like DDL.
>
> I don't understand what you mean by a cost once per request. You
> only have to create the temporary table on the first request. If you
> can't tell which is the first request you only have to test whether
> it exists which doesn't incur the consequences that ddl incurs.

True. I got myself thinking that without this feature people would be
dropping the table after each use, which doesn't have to be the case.

> What we're talking about means that when someone issues "SELECT *
> FROM cache" they're going to -- unexpected -- be a) turning their
> virtual transaction id into a real transaction id b) creating a new
> entry in pg_catalog and its indexes c) wal logging the new
> pg_catalog entry (including having to fsync at commit time) d)
> acquiring an exclusive lock on the new entry.

Only if they are making the first reference to the table in that
session, and it's only unexpected if they don't know that such a
reference to a global temp table can cause the table to materialize.
Surely you will grant that someone referencing such a table should
know what it is?

> There have been posts by people who were bitten by expecting that
> they could create temporary work tables for short frequently run
> queries who didn't realize that would mean pg_class would have to be
> vacuumed every few minutes and that it would slow down every index
> lookup for table names.

Like I said, I have run into performance problems with temp table
creation, especially when write barriers were configured on due to
battery failure or OS misconfiguration, and the cost turned out to be
almost entirely in the creation of the disk files which support the
temp table -- base, toast, indexes, etc. Unless you can fix the big
problems, worrying about the stuff we do optimize well will be a drop
in the bucket.

> I don't see it as friendly to make that the implicit
> behaviour for innocent looking dml operations.

Perhaps a note in the documentation of global temporary tables could
set appropriate expectations? It seems that your whole objection to
adding the requested feature hinges on anticipation of particular user
expectations.

-Kevin


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 20:39:03
Message-ID: 4136ffa0904291339sb5c0a41y3ba4c11bffda2246@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Apr 29, 2009 at 8:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> This is all based on utterly-unproven assumptions about relative costs.
> In particular, ISTM an additional network round trip or two associated
> with testing for/creating a temp table could easily swamp any costs
> associated with catalog entry creation.  Even if it doesn't,
> creating/deleting a few dozen rows in the system catalogs shouldn't
> really be something that autovacuum can't deal with.

I don't see why it's limited to a few dozen rows. Moderately busy web
sites these days count their traffic in hundreds of page views per
second.

> If it were,
> we'd be hearing a lot more complaints about the *existing* temp table
> feature being unusable.  (And yes, I know it's come up once or twice,
> but not all that often.)

Well my point is that currently you have to type CREATE TEMPORARY
TABLE somewhere which at least gives you a clue that maybe you're
doing something significant.

--
greg


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: idea: global temp tables
Date: 2009-04-29 20:47:07
Message-ID: 49F8767B.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> wrote:

>> creating/deleting a few dozen rows in the system catalogs shouldn't
>> really be something that autovacuum can't deal with.
>
> I don't see why it's limited to a few dozen rows. Moderately busy
> web sites these days count their traffic in hundreds of page views
> per second.

Sure. We're there. And many of those hits run ten to twenty queries.
We'd be insane to get a new connection for each one rather than use a
connection pool; and this overhead only occurs once per referenced
table per connection.

-Kevin


From: James Mansion <james(at)mansionfamily(dot)plus(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Greg Stark <stark(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: idea: global temp tables
Date: 2009-04-30 19:26:02
Message-ID: 49F9FB4A.7080809@mansionfamily.plus.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Kevin Grittner wrote:
> contexts. I don't think the updates to the system tables have the
> same magnitude of performance hit as creating these tables, especially
> if write barriers are on.
>
Wouldn't it be cleaner just to defer creation of real files to support the
structures associated with a temp table until it i snecessary to spill the
data from the backend's RAM? This data doesn't need to be in
shared memory and the tables and data aren't visible to any other
session, so can't they run out of RAM most of the time (or all the
time if the data in them is short lived)?


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "James Mansion" <james(at)mansionfamily(dot)plus(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Greg Stark" <stark(at)enterprisedb(dot)com>, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: idea: global temp tables
Date: 2009-04-30 19:32:20
Message-ID: 49F9B674.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

James Mansion <james(at)mansionfamily(dot)plus(dot)com> wrote:

> Wouldn't it be cleaner just to defer creation of real files to
support the
> structures associated with a temp table until it i snecessary to
spill the
> data from the backend's RAM? This data doesn't need to be in
> shared memory and the tables and data aren't visible to any other
> session, so can't they run out of RAM most of the time (or all the
> time if the data in them is short lived)?

See this thread for a discussion of the idea:

http://archives.postgresql.org/pgsql-hackers/2008-11/msg00342.php

-Kevin