Re: Not HOT enough

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Not HOT enough
Date: 2011-11-22 20:23:34
Message-ID: CA+U5nM+pSAxK7pVHbf7huUutzNbnS674Tg14JMUj7Yq7b3FcOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Some time ago, I changed GetOldestXmin() to ignore procs in other
databases resulting in a potentially later xmin.

GetSnapshotData() was not touched when that happened, even though the
comments say "...This is the same computation done by
GetOldestXmin(true, true)." The transam/README file says it stronger
"GetSnapshotData also performs an oldest-xmin calculation (which had
better
match GetOldestXmin's)". Doh.

As a result, VACUUM ignores procs in other databases, whereas HOT does
not. That means we aren't cleaning up as much as we could do when
running multiple databases. If its OK for VACUUM, then it must be OK
for HOT cleanup also.

Attached patch ignores procs in other databases during
GetSnapshotData() when IsMVCCSnapshot(), using similar coding to
GetOldestXmin().

Any doubters?

I suggest this is backpatched a few releases.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
fix_getsnapshotdata.v1.patch application/octet-stream 1011 bytes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-22 21:21:50
Message-ID: CA+TgmobO1Xp_7=935maSpMQHkRuQdH4dJjv126=rES-LtNj5ew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 22, 2011 at 3:23 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Some time ago, I changed GetOldestXmin() to ignore procs in other
> databases resulting in a potentially later xmin.
>
> GetSnapshotData() was not touched when that happened, even though the
> comments say "...This is the same computation done by
> GetOldestXmin(true, true)." The transam/README file says it stronger
> "GetSnapshotData also performs an oldest-xmin calculation (which had
> better
> match GetOldestXmin's)". Doh.
>
> As a result, VACUUM ignores procs in other databases, whereas HOT does
> not. That means we aren't cleaning up as much as we could do when
> running multiple databases. If its OK for VACUUM, then it must be OK
> for HOT cleanup also.
>
> Attached patch ignores procs in other databases during
> GetSnapshotData() when IsMVCCSnapshot(), using similar coding to
> GetOldestXmin().
>
> Any doubters?

I think this is unsafe for shared catalogs.

> I suggest this is backpatched a few releases.

-1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-22 23:40:29
Message-ID: CA+U5nM+gk2DEXGmTJtjpFtBuH8BGz3MTk1YL+GHEsCXqsXZQ_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 22, 2011 at 9:21 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Nov 22, 2011 at 3:23 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Some time ago, I changed GetOldestXmin() to ignore procs in other
>> databases resulting in a potentially later xmin.
>>
>> GetSnapshotData() was not touched when that happened, even though the
>> comments say "...This is the same computation done by
>> GetOldestXmin(true, true)." The transam/README file says it stronger
>> "GetSnapshotData also performs an oldest-xmin calculation (which had
>> better
>> match GetOldestXmin's)". Doh.
>>
>> As a result, VACUUM ignores procs in other databases, whereas HOT does
>> not. That means we aren't cleaning up as much as we could do when
>> running multiple databases. If its OK for VACUUM, then it must be OK
>> for HOT cleanup also.
>>
>> Attached patch ignores procs in other databases during
>> GetSnapshotData() when IsMVCCSnapshot(), using similar coding to
>> GetOldestXmin().
>>
>> Any doubters?
>
> I think this is unsafe for shared catalogs.

I think so too. Thats why it uses IsMVCCSnapshot() to confirm when it
is safe to do so.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 00:25:34
Message-ID: CA+U5nM+hvRJZ8Aw7hQXRp0LS2fMs5c-o9d4GwcqfKhhHYWMhAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 22, 2011 at 11:40 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

>> I think this is unsafe for shared catalogs.
>
> I think so too. Thats why it uses IsMVCCSnapshot() to confirm when it
> is safe to do so.
>

Ah, you mean access to shared catalogs using MVCC snapshots.

Fixed.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
fix_getsnapshotdata.v2.patch application/octet-stream 2.0 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 14:00:12
Message-ID: CA+Tgmobf4mAmedvC=tuFuE2yNuzakLy0t8UHQknfjPcuEEXAPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 22, 2011 at 7:25 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Tue, Nov 22, 2011 at 11:40 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> I think this is unsafe for shared catalogs.
>> I think so too. Thats why it uses IsMVCCSnapshot() to confirm when it
>> is safe to do so.
> Ah, you mean access to shared catalogs using MVCC snapshots.

Yeah. This change would have the disadvantage of disabling HOT
cleanup for shared catalogs; I'm not sure whether that's a good
decision.

But now that you mention it, something seems funky about the other bit
you mention, too:

+ /* MVCC snapshots ignore other databases */
+ if (!allDbs &&
+ proc->databaseId != MyDatabaseId &&
+ proc->databaseId != 0) /* always include WalSender */
+ continue;
+

It doesn't make sense for the RecentGlobalXmin calculation to depend
on whether or not the current snapshot is an MVCC snapshot, because
RecentGlobalXmin is a global variable not related to any particular
snapshot. I don't believe it's safe to assume that RecentGlobalXmin
will only ever be used in conjunction with the most-recently-taken
snapshot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 14:25:10
Message-ID: CA+U5nMJWEEaFN0oY19OocBJ2W07paRUo2neLHe=o1x3NCrDZJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 2:00 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Nov 22, 2011 at 7:25 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On Tue, Nov 22, 2011 at 11:40 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>>> I think this is unsafe for shared catalogs.
>>> I think so too. Thats why it uses IsMVCCSnapshot() to confirm when it
>>> is safe to do so.
>> Ah, you mean access to shared catalogs using MVCC snapshots.
>
> Yeah.  This change would have the disadvantage of disabling HOT
> cleanup for shared catalogs; I'm not sure whether that's a good
> decision.

No, it disables cleanup when being read. They are still VACUUMed normally.

Note that non-MVCC snapshots never did run HOT page-level cleanup, so
this hardly changes anything.

And it effects shared catalogs only, which are all low traffic anyway.

> But now that you mention it, something seems funky about the other bit
> you mention, too:
>
> +                       /* MVCC snapshots ignore other databases */
> +                       if (!allDbs &&
> +                               proc->databaseId != MyDatabaseId &&
> +                               proc->databaseId != 0)          /* always include WalSender */
> +                               continue;
> +
>
> It doesn't make sense for the RecentGlobalXmin calculation to depend
> on whether or not the current snapshot is an MVCC snapshot, because
> RecentGlobalXmin is a global variable not related to any particular
> snapshot.  I don't believe it's safe to assume that RecentGlobalXmin
> will only ever be used in conjunction with the most-recently-taken
> snapshot.

Why would that matter exactly? RecentGlobalXmin is used in 4 places
and this works with them all.

This changes the meaning of that variable from what it was previously,
but so what? It's backend local.

The huge benefit is that we clean up data in normal tables much better
than we did before in cases where people use multiple databases, which
is a common case.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 15:15:55
Message-ID: CA+TgmobreJ17SP=GaALYu029MKQ_XcZYkiPhbezLBiXBYUgy7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 9:25 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Yeah.  This change would have the disadvantage of disabling HOT
>> cleanup for shared catalogs; I'm not sure whether that's a good
>> decision.
>
> No, it disables cleanup when being read. They are still VACUUMed normally.
>
> Note that non-MVCC snapshots never did run HOT page-level cleanup, so
> this hardly changes anything.
>
> And it effects shared catalogs only, which are all low traffic anyway.

I think "low traffic" is the key point. I understand that you're not
changing the VACUUM behavior, but you are making heap_page_prune_opt()
not do anything when a shared catalog is involved. That would be
unacceptable if we expected shared catalogs to be updated frequently,
either now or in the future, but I guess we don't expect that.

I suppose we could compute two RecentGlobalXmin values, one for all
databases and one for just the current database. But that would make
GetSnapshotData() slower, which would almost certainly be a cure worse
than the disease.

>> But now that you mention it, something seems funky about the other bit
>> you mention, too:
>>
>> +                       /* MVCC snapshots ignore other databases */
>> +                       if (!allDbs &&
>> +                               proc->databaseId != MyDatabaseId &&
>> +                               proc->databaseId != 0)          /* always include WalSender */
>> +                               continue;
>> +
>>
>> It doesn't make sense for the RecentGlobalXmin calculation to depend
>> on whether or not the current snapshot is an MVCC snapshot, because
>> RecentGlobalXmin is a global variable not related to any particular
>> snapshot.  I don't believe it's safe to assume that RecentGlobalXmin
>> will only ever be used in conjunction with the most-recently-taken
>> snapshot.
>
> Why would that matter exactly? RecentGlobalXmin is used in 4 places
> and this works with them all.
>
> This changes the meaning of that variable from what it was previously,
> but so what? It's backend local.

I don't object to changing the meaning of the variable, but I don't
understand how it can be correct to include backends from other
databases in the RecentGlobalXmin calculation when using an MVCC
snapshot, but not otherwise. Come to think of it, when does
GetSnapshotData() get called with a non-MVCC snapshot anyway?

> The huge benefit is that we clean up data in normal tables much better
> than we did before in cases where people use multiple databases, which
> is a common case.

I understand the benefit; I just want to make sure we're not going to
break anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 15:20:44
Message-ID: 1322061559-sup-6871@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Robert Haas's message of mié nov 23 12:15:55 -0300 2011:

> > And it effects shared catalogs only, which are all low traffic anyway.
>
> I think "low traffic" is the key point. I understand that you're not
> changing the VACUUM behavior, but you are making heap_page_prune_opt()
> not do anything when a shared catalog is involved. That would be
> unacceptable if we expected shared catalogs to be updated frequently,
> either now or in the future, but I guess we don't expect that.

Maybe not pg_database or pg_tablespace and such, but I'm not so sure
about pg_shdepend. (Do we record pg_shdepend entries for temp tables?)

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 15:28:38
Message-ID: CA+TgmoYH4mZhCamFW0PjXJFLNtDtQZPhBq022qgOY=4Z9HhYaA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 10:20 AM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Robert Haas's message of mié nov 23 12:15:55 -0300 2011:
>> > And it effects shared catalogs only, which are all low traffic anyway.
>>
>> I think "low traffic" is the key point.  I understand that you're not
>> changing the VACUUM behavior, but you are making heap_page_prune_opt()
>> not do anything when a shared catalog is involved.  That would be
>> unacceptable if we expected shared catalogs to be updated frequently,
>> either now or in the future, but I guess we don't expect that.
>
> Maybe not pg_database or pg_tablespace and such, but I'm not so sure
> about pg_shdepend.  (Do we record pg_shdepend entries for temp tables?)

Hmm, I'm not seeing any increase in the number of entries in
pg_shdepend when I create either a temporary or permanent table:

rhaas=# select sum(1) from pg_shdepend;
sum
-----
2
(1 row)

rhaas=# create temp table xyz (a int);
CREATE TABLE
rhaas=# select sum(1) from pg_shdepend;
sum
-----
2
(1 row)

rhaas=# create table abc (a int);
CREATE TABLE
rhaas=# select sum(1) from pg_shdepend;
sum
-----
2
(1 row)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 15:35:47
Message-ID: 1322062429-sup-1570@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Robert Haas's message of mié nov 23 12:28:38 -0300 2011:

> Hmm, I'm not seeing any increase in the number of entries in
> pg_shdepend when I create either a temporary or permanent table:
>
> rhaas=# select sum(1) from pg_shdepend;
> sum
> -----
> 2
> (1 row)
>
> rhaas=# create temp table xyz (a int);
> CREATE TABLE
> rhaas=# select sum(1) from pg_shdepend;
> sum
> -----
> 2
> (1 row)

That's because the owner is "pinned" (i.e. the bootstrap user). Try
with a different user. I see new rows with both temp and non-temp
tables.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 16:01:18
Message-ID: CA+Tgmoa68NgnTR+r+nQ6dgL3kR+mcnRepMUDoyedJiYeyW7Ucg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 10:35 AM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> That's because the owner is "pinned" (i.e. the bootstrap user).  Try
> with a different user.  I see new rows with both temp and non-temp
> tables.

Oh, wow. I had no idea it worked like that. You learn something new every day.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 16:14:04
Message-ID: CA+U5nM+RBxcgmEWgp8znjK0QYCyaTsLdRZWdJkVtg8--6H-NKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 3:20 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
>
> Excerpts from Robert Haas's message of mié nov 23 12:15:55 -0300 2011:
>
>> > And it effects shared catalogs only, which are all low traffic anyway.
>>
>> I think "low traffic" is the key point.  I understand that you're not
>> changing the VACUUM behavior, but you are making heap_page_prune_opt()
>> not do anything when a shared catalog is involved.  That would be
>> unacceptable if we expected shared catalogs to be updated frequently,
>> either now or in the future, but I guess we don't expect that.
>
> Maybe not pg_database or pg_tablespace and such, but I'm not so sure
> about pg_shdepend.  (Do we record pg_shdepend entries for temp tables?)

Normal catalog access does not use HOT and never has.

If catalogs need VACUUMing then autovacuum takes care of it.

If we're saying that isn't enough and we actually depend on the
occasional user inspecting the catalog then we are already hosed.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 17:00:55
Message-ID: 1322067518-sup-5258@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Excerpts from Simon Riggs's message of mié nov 23 13:14:04 -0300 2011:
> On Wed, Nov 23, 2011 at 3:20 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
> >
> > Excerpts from Robert Haas's message of mié nov 23 12:15:55 -0300 2011:
> >
> >> > And it effects shared catalogs only, which are all low traffic anyway.
> >>
> >> I think "low traffic" is the key point.  I understand that you're not
> >> changing the VACUUM behavior, but you are making heap_page_prune_opt()
> >> not do anything when a shared catalog is involved.  That would be
> >> unacceptable if we expected shared catalogs to be updated frequently,
> >> either now or in the future, but I guess we don't expect that.
> >
> > Maybe not pg_database or pg_tablespace and such, but I'm not so sure
> > about pg_shdepend.  (Do we record pg_shdepend entries for temp tables?)
>
> Normal catalog access does not use HOT and never has.

Oh.

> If we're saying that isn't enough and we actually depend on the
> occasional user inspecting the catalog then we are already hosed.

Probably not. I have heard of cases of pg_shdepend getting bloated
though, so it'd be nice if it happened.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 17:01:40
Message-ID: 28026.1322067700@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Wed, Nov 23, 2011 at 3:20 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
>> Maybe not pg_database or pg_tablespace and such, but I'm not so sure
>> about pg_shdepend. (Do we record pg_shdepend entries for temp tables?)

> Normal catalog access does not use HOT and never has.

You are mistaken.

regards, tom lane


From: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 17:07:11
Message-ID: CABOikdN2LGr2enWD2W657LCcYQp71jfnrBhHBm7W6iMevzdwrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 9:44 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Wed, Nov 23, 2011 at 3:20 PM, Alvaro Herrera
> <alvherre(at)commandprompt(dot)com> wrote:
>>
>> Excerpts from Robert Haas's message of mié nov 23 12:15:55 -0300 2011:
>>
>>> > And it effects shared catalogs only, which are all low traffic anyway.
>>>
>>> I think "low traffic" is the key point.  I understand that you're not
>>> changing the VACUUM behavior, but you are making heap_page_prune_opt()
>>> not do anything when a shared catalog is involved.  That would be
>>> unacceptable if we expected shared catalogs to be updated frequently,
>>> either now or in the future, but I guess we don't expect that.
>>
>> Maybe not pg_database or pg_tablespace and such, but I'm not so sure
>> about pg_shdepend.  (Do we record pg_shdepend entries for temp tables?)
>
> Normal catalog access does not use HOT and never has.
>

I don't understand that. We started with the simplified assumption
that HOT can skip catalog tables, but later that was one of the
pre-conditions Tom spelled out to accept HOT patch because his view
was if this does not work for system tables, it probably does not work
at all.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 17:58:25
Message-ID: CA+U5nMKESsS+4=RCJJO+qqgYe-G=qLiuRRAHSEpy8aJucD1ODw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 5:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> On Wed, Nov 23, 2011 at 3:20 PM, Alvaro Herrera
>> <alvherre(at)commandprompt(dot)com> wrote:
>>> Maybe not pg_database or pg_tablespace and such, but I'm not so sure
>>> about pg_shdepend. (Do we record pg_shdepend entries for temp tables?)
>
>> Normal catalog access does not use HOT and never has.
>
> You are mistaken.

Normal catalog access against shared catalogs via heap_scan does not
use HOT cleanup, because it uses SnapshotNow.
Page cleanup when reading a page only happens when scan->rs_pageatatime is set.
scan->rs_pageatatime = IsMVCCSnapshot(snapshot);

Index access does use HOT cleanup, which is probably "normal".

However, since we're talking about these tables only

postgres=# select relname, pg_relation_size(oid) from pg_class where
relisshared and relkind = 'r';
relname | pg_relation_size
--------------------+------------------
pg_authid | 8192
pg_database | 8192
pg_tablespace | 8192
pg_pltemplate | 8192
pg_auth_members | 0
pg_shdepend | 8192
pg_shdescription | 8192
pg_db_role_setting | 0
(8 rows)

then I think it's fair to say that they are seldom updated/deleted and
so the effect of HOT cleanup is not important for those tables.

The real question is do we favour HOT cleanup on those small 8 tables,
or do we favour HOT cleanup of every other table? There are clearly
pros and cons but the balance must surely be in favour of better
cleaning of user tables since they are accessed millions of times more
frequently than shared catalog tables.

If we are concerned about those 8 tables then we can always set
autovacuum more intensively.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 18:15:58
Message-ID: 507.1322072158@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Wed, Nov 23, 2011 at 5:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>>> Normal catalog access does not use HOT and never has.

>> You are mistaken.

> Normal catalog access against shared catalogs via heap_scan does not
> use HOT cleanup, because it uses SnapshotNow.

Not sure what you are basing these statements on. Normal catalog access
typically goes through indexam.c, which AFAICS will call
heap_page_prune_opt on every heap page it visits, quite independently
of what snapshot is used. There are no cases I know of where the system
prefers heapscans on catalogs, except possibly pg_am which is known to
be small.

> However, since we're talking about these tables only
> ...
> then I think it's fair to say that they are seldom updated/deleted and
> so the effect of HOT cleanup is not important for those tables.

I agree with Alvaro that pg_shdepend is probably a bit too volatile
to make such an assumption safe.

> The real question is do we favour HOT cleanup on those small 8 tables,
> or do we favour HOT cleanup of every other table?

No, the real question is why not think a little harder and see if we can
come up with a solution that doesn't involve making some cases worse to
make others better.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 18:30:18
Message-ID: 760.1322073018@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Tue, Nov 22, 2011 at 11:40 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>>> I think this is unsafe for shared catalogs.

>> I think so too. Thats why it uses IsMVCCSnapshot() to confirm when it
>> is safe to do so.

> Ah, you mean access to shared catalogs using MVCC snapshots.

[ having now read the patch a bit more carefully ]

I think the fundamental problem with this is that it's conflating "what
to do in shared catalogs" with "what to do when an MVCC snapshot is
being used". HOT cleanup activity really ought not have anything at all
to do with what snapshot is being used to scan the page.

I'm also extremely uncomfortable with the fact that your proposed coding
changes not only the RecentGlobalXmin output of GetSnapshotData, but the
actual snapshot output --- you have not even made an argument why that
is safe, and I doubt that it is.

What I think might make more sense is to keep two variables,
RecentGlobalXmin with its current meaning and RecentDatabaseWideXmin
which considers only xmins of transactions in the current database.
Then HOT cleanup could select the appropriate cutoff depending on
whether it's working on a shared or non-shared relation.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 19:33:18
Message-ID: CA+U5nMJCmMMZtQk48PTj3F1F09vaAyBc7=kHrOhQQPYLJEkyCQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 6:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>> The real question is do we favour HOT cleanup on those small 8 tables,
>> or do we favour HOT cleanup of every other table?
>
> No, the real question is why not think a little harder and see if we can
> come up with a solution that doesn't involve making some cases worse to
> make others better.

Slightly modified patch attached.

When we access a shared relation and the page is prunable we re-derive
the cutoff value using GetOldestXmin.

That code path is rarely taken. In particular, pg_shdepend is only
accessed during object creation/alter/drop, so this isn't a path we
can't spare a small amount little extra on, just like the trade-off
we've taken to make locking faster for DML while making DDL a little
slower.

If this is still unacceptable, then I'll have to look at reducing
impact on pg_shdepend from temp tables - which is still a plan.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
fix_getsnapshotdata.v3.patch application/octet-stream 3.6 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 19:57:06
Message-ID: CA+Tgmob=fNqBTzGVwBvJR7N1OmEKuhHaX-B=4sV_g2uyfxmyPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 1:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> What I think might make more sense is to keep two variables,
> RecentGlobalXmin with its current meaning and RecentDatabaseWideXmin
> which considers only xmins of transactions in the current database.
> Then HOT cleanup could select the appropriate cutoff depending on
> whether it's working on a shared or non-shared relation.

Unfortunately, that would have the effect of lengthening the time for
which ProcArrayLock is held, and as benchmark results from Pavan's
patch in that area show, that makes a very big difference to total
throughput on write-heavy workloads. On a related note, Simon's
proposed change here would also complicate things for that patch,
because databaseId would have to become part of PGXACT rather than
PGPROC, and that would make the PGXACT act array larger and thus
slower to scan. I have deep respect for the perils of not doing HOT
cleanup quickly enough, but ProcArrayLock contention is nothing to
sneeze at either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 20:14:45
Message-ID: CA+U5nMLDTBRvOmc=sTnq5SHk5iu2KxbH7KAXF0H5a0RcWF+0DA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 7:57 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Nov 23, 2011 at 1:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> What I think might make more sense is to keep two variables,
>> RecentGlobalXmin with its current meaning and RecentDatabaseWideXmin
>> which considers only xmins of transactions in the current database.
>> Then HOT cleanup could select the appropriate cutoff depending on
>> whether it's working on a shared or non-shared relation.
>
> Unfortunately, that would have the effect of lengthening the time for
> which ProcArrayLock is held, and as benchmark results from Pavan's
> patch in that area show, that makes a very big difference to total
> throughput on write-heavy workloads.

Agreed. The performance effect of doing that would be noticeable, and
I quickly ruled out that solution without even mentioning it at
version one.

> On a related note, Simon's
> proposed change here would also complicate things for that patch,
> because databaseId would have to become part of PGXACT rather than
> PGPROC, and that would make the PGXACT act array larger

That is correct.

> and thus
> slower to scan.  I have deep respect for the perils of not doing HOT
> cleanup quickly enough, but ProcArrayLock contention is nothing to
> sneeze at either.

If you measure the difference we'll be able to see what difference
adding 4 bytes onto every pgtran makes. I think it will be small.

OTOH, the effect of database bloat is well documented and has a
seriously negative effect on performance that easily outweighs the
slight loss.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 20:15:36
Message-ID: 2619.1322079336@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Nov 23, 2011 at 1:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> What I think might make more sense is to keep two variables,
>> RecentGlobalXmin with its current meaning and RecentDatabaseWideXmin
>> which considers only xmins of transactions in the current database.
>> Then HOT cleanup could select the appropriate cutoff depending on
>> whether it's working on a shared or non-shared relation.

> Unfortunately, that would have the effect of lengthening the time for
> which ProcArrayLock is held, and as benchmark results from Pavan's
> patch in that area show, that makes a very big difference to total
> throughput on write-heavy workloads.

[ shrug... ] Simon's patch already adds nearly as many cycles in the
hot spot as would be required to do what I suggest.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 20:32:07
Message-ID: CA+U5nMLJtHq2V+JBn-hrL1O6QytxKK_GiJnZruWJujLtWfWZGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 8:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, Nov 23, 2011 at 1:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> What I think might make more sense is to keep two variables,
>>> RecentGlobalXmin with its current meaning and RecentDatabaseWideXmin
>>> which considers only xmins of transactions in the current database.
>>> Then HOT cleanup could select the appropriate cutoff depending on
>>> whether it's working on a shared or non-shared relation.
>
>> Unfortunately, that would have the effect of lengthening the time for
>> which ProcArrayLock is held, and as benchmark results from Pavan's
>> patch in that area show, that makes a very big difference to total
>> throughput on write-heavy workloads.
>
> [ shrug... ]  Simon's patch already adds nearly as many cycles in the
> hot spot as would be required to do what I suggest.

Well, its deeper than that.

My patch actually skips xids that aren't in the user's database. That
avoids other work in GetSnapshotData(), so will in many cases make it
faster. The snapshots returned will be smaller, which also means more
speed.

As you point out upthread, that generates an MVCC snapshot that is not
safe for user queries against shared catalog tables. Standard catalog
access is safe, but user access isn't. The way to solve that problem
is to make all scans against shared catalog tables use SnapshotNow,
whatever the snapshot says. Which would be more useful since you'll
see exactly what the DBMS sees. Given the infrequency of change to
those tables and the infrequency of user access to those tables it
seems like a very good thing.

If we do as you suggest, snapshots would contain all xids from all
databases, so no effort would be skipped, but we would pay the cost of
deriving two values just in case we ever decide to read a shared
catalog table, which is blue moon frequency, so a net loss.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 20:45:16
Message-ID: CA+TgmoaSzGPUsuvQm0uVrbdVfVH00NtSuxyFTJkBNpp8PbPM7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 3:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, Nov 23, 2011 at 1:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> What I think might make more sense is to keep two variables,
>>> RecentGlobalXmin with its current meaning and RecentDatabaseWideXmin
>>> which considers only xmins of transactions in the current database.
>>> Then HOT cleanup could select the appropriate cutoff depending on
>>> whether it's working on a shared or non-shared relation.
>
>> Unfortunately, that would have the effect of lengthening the time for
>> which ProcArrayLock is held, and as benchmark results from Pavan's
>> patch in that area show, that makes a very big difference to total
>> throughput on write-heavy workloads.
>
> [ shrug... ]  Simon's patch already adds nearly as many cycles in the
> hot spot as would be required to do what I suggest.

Well, that's a point in favor of your idea as compared with Simon's, I
suppose, but it's not making me feel entirely sanguine about either
approach.

I've wondered a few times whether we could get rid of the
RecentGlobalXmin computation from GetSnapshotData() altogether. We
think that it's cheap to do it there because we already hold
ProcArrayLock in exclusive mode, but Pavan's work suggests that it
really isn't that cheap. Instead of updating RecentGlobalXmin every
time we take a snapshot (which is likely to be a waste in many cases,
since even in a busy system many snapshots are very short lived and
therefore unlikely to trigger a HOT cleanup) maybe we should only
update it "on demand" - e.g. if heap_page_prune_opt sees a
page-prune-hint XID that is older than TransactionXmin and newer than
the last-computed value of RecentGlobalXmin, there's hope that a
recomputation might yield a new RecentGlobalXmin value new enough to
allow a HOT cleanup, so if we haven't recomputed it "lately", then we
should.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 21:36:46
Message-ID: CA+U5nMKjKQWSdUzdh3-=TYCj3X=sFeb3ebJ_LmC3gh5KyU49mA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 8:45 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I've wondered a few times whether we could get rid of the
> RecentGlobalXmin computation from GetSnapshotData() altogether.

You have to calculate an xmin, so its unavoidable.

My patch actually improves the speed of snapshots, rather than slowing
them as Tom's would.

>  We
> think that it's cheap to do it there because we already hold
> ProcArrayLock in exclusive mode, but Pavan's work suggests that it
> really isn't that cheap.  Instead of updating RecentGlobalXmin every
> time we take a snapshot (which is likely to be a waste in many cases,
> since even in a busy system many snapshots are very short lived and
> therefore unlikely to trigger a HOT cleanup) maybe we should only
> update it "on demand" - e.g. if heap_page_prune_opt sees a
> page-prune-hint XID that is older than TransactionXmin and newer than
> the last-computed value of RecentGlobalXmin, there's hope that a
> recomputation might yield a new RecentGlobalXmin value new enough to
> allow a HOT cleanup, so if we haven't recomputed it "lately", then we
> should.

When we prune a page while running an UPDATE if we see that the page
is left with less freespace than average row length for that relation
AND page sees a RecentlyDead xid we could then re-derive a later
db-local cutoff value and re-prune the page.

That increases page lock time, but pages are locked for longer if we
do non-HOT updates anyway, so it would still be a win.

What % of non-HOT updates do you see in your recent benchmarks?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 21:55:29
Message-ID: 4530.1322085329@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> My patch actually improves the speed of snapshots, rather than slowing
> them as Tom's would.

It can be arbitrarily fast if it doesn't have to get the right answer.
Unfortunately, you're not producing the right answer. You can not
exclude XIDs in other databases from the snapshot, at least not unless
you know that the snapshot will not be used for examining any shared
catalogs ... and GetSnapshotData certainly cannot know that.

I think that the idea of computing a different cutoff on the
probably-rare occasions where we need to prune a shared catalog page
has some merit, but the change you are currently proposing to
GetSnapshotData flat out does not work.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 22:43:15
Message-ID: CA+U5nM+aTpv2Q3Qmf=cMftL1Ve2VZMhj0bE9SRyZ5T-Ozqx27g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 9:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> My patch actually improves the speed of snapshots, rather than slowing
>> them as Tom's would.
>
> It can be arbitrarily fast if it doesn't have to get the right answer.

(LOL) - laughing with you

> Unfortunately, you're not producing the right answer.  You can not
> exclude XIDs in other databases from the snapshot, at least not unless
> you know that the snapshot will not be used for examining any shared
> catalogs ... and GetSnapshotData certainly cannot know that.
>
> I think that the idea of computing a different cutoff on the
> probably-rare occasions where we need to prune a shared catalog page
> has some merit, but the change you are currently proposing to
> GetSnapshotData flat out does not work.

All true, but I already said that myself in a direct reply to you 2 hours ago.

And I proposed a solution, which was to use SnapshotNow as an override
for user queries against shared catalog tables.

Computing two cutoffs is overkill for the rare event of pruning a
shared catalog page. I did think of that already and its not a good
solution. I'm tempted by the idea of getting rid of multiple databases
altogether. The hassle of supporting that feature far outweighs the
fairly low benefit.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 22:47:39
Message-ID: CA+Tgmoaayee12wTxcpxFk6viJuvy=5nfsFfs+3e-ZnA+SAzZJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 5:43 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Computing two cutoffs is overkill for the rare event of pruning a
> shared catalog page. I did think of that already and its not a good
> solution. I'm tempted by the idea of getting rid of multiple databases
> altogether. The hassle of supporting that feature far outweighs the
> fairly low benefit.

That seems a rather sudden U-turn. The point of your proposal is to
improve life for people using multiple databases in a single cluster.
If that's not an important use case, why bother with any of this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2011-11-23 23:00:37
Message-ID: CA+U5nM+K8c1DAhZhSr=UDN2iM4gFsvuBCCgh1bVadVnXJ2Q69A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 23, 2011 at 10:47 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Nov 23, 2011 at 5:43 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> Computing two cutoffs is overkill for the rare event of pruning a
>> shared catalog page. I did think of that already and its not a good
>> solution. I'm tempted by the idea of getting rid of multiple databases
>> altogether. The hassle of supporting that feature far outweighs the
>> fairly low benefit.
>
> That seems a rather sudden U-turn.  The point of your proposal is to
> improve life for people using multiple databases in a single cluster.
> If that's not an important use case, why bother with any of this?

Where's the U-turn? I've not argued at any point that running multiple
databases was a great idea.

Offering multiple databases causes the problems I noted and have been
trying to solve.

If we didn't have databases we could probably chuck out tons of
complexity and code that no longer need to exist now we have
namespaces.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2012-08-17 00:59:16
Message-ID: 20120817005916.GJ30286@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Did we want to apply this?

---------------------------------------------------------------------------

On Wed, Nov 23, 2011 at 07:33:18PM +0000, Simon Riggs wrote:
> On Wed, Nov 23, 2011 at 6:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> >> The real question is do we favour HOT cleanup on those small 8 tables,
> >> or do we favour HOT cleanup of every other table?
> >
> > No, the real question is why not think a little harder and see if we can
> > come up with a solution that doesn't involve making some cases worse to
> > make others better.
>
> Slightly modified patch attached.
>
> When we access a shared relation and the page is prunable we re-derive
> the cutoff value using GetOldestXmin.
>
> That code path is rarely taken. In particular, pg_shdepend is only
> accessed during object creation/alter/drop, so this isn't a path we
> can't spare a small amount little extra on, just like the trade-off
> we've taken to make locking faster for DML while making DDL a little
> slower.
>
> If this is still unacceptable, then I'll have to look at reducing
> impact on pg_shdepend from temp tables - which is still a plan.
>
> --
>  Simon Riggs                   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services

> diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
> index 61f2ce4..5feaedc 100644
> --- a/src/backend/access/heap/pruneheap.c
> +++ b/src/backend/access/heap/pruneheap.c
> @@ -16,13 +16,13 @@
>
> #include "access/heapam.h"
> #include "access/transam.h"
> +#include "catalog/catalog.h"
> #include "miscadmin.h"
> #include "pgstat.h"
> #include "storage/bufmgr.h"
> #include "utils/rel.h"
> #include "utils/tqual.h"
>
> -
> /* Working data for heap_page_prune and subroutines */
> typedef struct
> {
> @@ -72,6 +72,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, TransactionId OldestXmin)
> {
> Page page = BufferGetPage(buffer);
> Size minfree;
> + Transactionid CutoffXmin = OldestXmin;
>
> /*
> * Let's see if we really need pruning.
> @@ -91,6 +92,18 @@ heap_page_prune_opt(Relation relation, Buffer buffer, TransactionId OldestXmin)
> return;
>
> /*
> + * We choose to set RecentGlobalXmin only for the current database, which
> + * means we cannot use it to prune shared relations when reading them with
> + * MVCC snapshots. By making that choice it allows our snapshots to be
> + * smaller and faster, and the RecentGlobalXmin will be further forward
> + * and offer better pruning of non-shared relations. So when accessing a
> + * shared relation and we see the page is prunable (above) we get an
> + * OldestXmin across all databases.
> + */
> + if (IsSharedRelation(relation->rd_id))
> + CutoffXmin = GetOldestXmin(true, true);
> +
> + /*
> * We prune when a previous UPDATE failed to find enough space on the page
> * for a new tuple version, or when free space falls below the relation's
> * fill-factor target (but not less than 10%).
> @@ -124,7 +137,7 @@ heap_page_prune_opt(Relation relation, Buffer buffer, TransactionId OldestXmin)
> * needed */
>
> /* OK to prune */
> - (void) heap_page_prune(relation, buffer, OldestXmin, true, &ignore);
> + (void) heap_page_prune(relation, buffer, CutoffXmin, true, &ignore);
> }
>
> /* And release buffer lock */
> diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
> index 1a48485..60153f4 100644
> --- a/src/backend/storage/ipc/procarray.c
> +++ b/src/backend/storage/ipc/procarray.c
> @@ -55,6 +55,7 @@
> #include "storage/spin.h"
> #include "utils/builtins.h"
> #include "utils/snapmgr.h"
> +#include "utils/tqual.h"
>
>
> /* Our shared memory area */
> @@ -1185,6 +1186,8 @@ GetMaxSnapshotSubxidCount(void)
> * RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
> * running transactions, except those running LAZY VACUUM). This is
> * the same computation done by GetOldestXmin(true, true).
> + * For MVCC snapshots we examine transactions running only in our
> + * database, ignoring longer running transactions in other databases.
> *
> * Note: this function should probably not be called with an argument that's
> * not statically allocated (see xip allocation below).
> @@ -1200,9 +1203,12 @@ GetSnapshotData(Snapshot snapshot)
> int count = 0;
> int subcount = 0;
> bool suboverflowed = false;
> + bool allDbs = false;
>
> Assert(snapshot != NULL);
>
> + allDbs = !IsMVCCSnapshot(snapshot);
> +
> /*
> * Allocating space for maxProcs xids is usually overkill; numProcs would
> * be sufficient. But it seems better to do the malloc while not holding
> @@ -1278,6 +1284,12 @@ GetSnapshotData(Snapshot snapshot)
> if (proc->vacuumFlags & PROC_IN_VACUUM)
> continue;
>
> + /* MVCC snapshots ignore other databases */
> + if (!allDbs &&
> + proc->databaseId != MyDatabaseId &&
> + proc->databaseId != 0) /* always include WalSender */
> + continue;
> +
> /* Update globalxmin to be the smallest valid xmin */
> xid = proc->xmin; /* fetch just once */
> if (TransactionIdIsNormal(xid) &&

>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Not HOT enough
Date: 2012-08-17 17:21:47
Message-ID: CA+TgmoZCoyeSDH4oDcmmt7uffenc8g1_Hc3pghwZL9kJC=csMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 16, 2012 at 8:59 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Did we want to apply this?

Tom and I both opined upthread that it wasn't safe. Nothing's
happened since then to change my mind.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company