Re: heap_page_prune comments

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: heap_page_prune comments
Date: 2011-11-02 16:27:02
Message-ID: CA+TgmoaFjJ57B-RBG-RxE9XMXgXvySns0q8_ujW5CfXM76vgwA@mail.gmail.com

The following comment - or at least the last sentence thereof -
appears to be out of date.

/*
* XXX Should we update the FSM information of this page ?
*
* There are two schools of thought here. We may not want to update FSM
* information so that the page is not used for unrelated UPDATEs/INSERTs
* and any free space in this page will remain available for further
* UPDATEs in *this* page, thus improving chances for doing HOT updates.
*
* But for a large table and where a page does not receive further UPDATEs
* for a long time, we might waste this space by not updating the FSM
* information. The relation may get extended and fragmented further.
*
* One possibility is to leave "fillfactor" worth of space in this page
* and update FSM with the remaining space.
*
* In any case, the current FSM implementation doesn't accept
* one-page-at-a-time updates, so this is all academic for now.
*/

The simple fix here is just to delete that last sentence, but does
anyone think we ought to change the behavior, now that we have the
option to do so?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jim Nasby <jim(at)nasby(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap_page_prune comments
Date: 2011-11-04 00:27:37
Message-ID: 892D103B-919F-484B-9A60-9B41F20C53C0@nasby.net

On Nov 2, 2011, at 11:27 AM, Robert Haas wrote:
> The following comment - or at least the last sentence thereof -
> appears to be out of date.
>
> /*
> * XXX Should we update the FSM information of this page ?
> *
> * There are two schools of thought here. We may not want to update FSM
> * information so that the page is not used for unrelated UPDATEs/INSERTs
> * and any free space in this page will remain available for further
> * UPDATEs in *this* page, thus improving chances for doing HOT updates.
> *
> * But for a large table and where a page does not receive further UPDATEs
> * for a long time, we might waste this space by not updating the FSM
> * information. The relation may get extended and fragmented further.
> *
> * One possibility is to leave "fillfactor" worth of space in this page
> * and update FSM with the remaining space.
> *
> * In any case, the current FSM implementation doesn't accept
> * one-page-at-a-time updates, so this is all academic for now.
> */
>
> The simple fix here is just to delete that last sentence, but does
> anyone think we ought to change the behavior, now that we have the
> option to do so?

The fillfactor route seems to make the most sense here... it certainly seems to be the least surprising behavior.
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap_page_prune comments
Date: 2011-11-04 13:47:26
Message-ID: CA+TgmoZ_G-opTVqNXz9vOMFepadX=YTc-61q9KLQb7ujZLxNUw@mail.gmail.com

On Thu, Nov 3, 2011 at 8:27 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
> On Nov 2, 2011, at 11:27 AM, Robert Haas wrote:
>> The following comment - or at least the last sentence thereof -
>> appears to be out of date.
>>
>>        /*
>>         * XXX Should we update the FSM information of this page ?
>>         *
>>         * There are two schools of thought here. We may not want to update FSM
>>         * information so that the page is not used for unrelated UPDATEs/INSERTs
>>         * and any free space in this page will remain available for further
>>         * UPDATEs in *this* page, thus improving chances for doing HOT updates.
>>         *
>>         * But for a large table and where a page does not receive further UPDATEs
>>         * for a long time, we might waste this space by not updating the FSM
>>         * information. The relation may get extended and fragmented further.
>>         *
>>         * One possibility is to leave "fillfactor" worth of space in this page
>>         * and update FSM with the remaining space.
>>         *
>>         * In any case, the current FSM implementation doesn't accept
>>         * one-page-at-a-time updates, so this is all academic for now.
>>         */
>>
>> The simple fix here is just to delete that last sentence, but does
>> anyone think we ought to change the behavior, now that we have the
>> option to do so?
>
> The fillfactor route seems to make the most sense here... it certainly seems to be the least surprising behavior.

Seems a little hackish, though: we'd be reporting an amount of
freespace that we've deliberately set to an incorrect value. I'm
almost thinking we should report the freespace that's actually
available, on the theory that Bloat Is Bad (TM).

Maybe some testing is in order.
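
For concreteness, the two reporting choices can be sketched roughly as
below. This is an illustration only, not code from the tree: the
sketch_* names and SKETCH_BLCKSZ are made up for this example, and the
8192-byte page size is an assumption (the usual default block size).

```c
#include <stddef.h>

/* Assumed page size for this sketch; hypothetical stand-in for BLCKSZ. */
#define SKETCH_BLCKSZ 8192

/*
 * Bytes a page would hold back for its own future UPDATEs, given a
 * fillfactor percentage (100 means hold back nothing).
 */
static size_t
sketch_reserved_space(int fillfactor)
{
    return (size_t) SKETCH_BLCKSZ * (size_t) (100 - fillfactor) / 100;
}

/* Choice 1: report the free space that is actually available. */
static size_t
sketch_report_actual(size_t actual_free)
{
    return actual_free;
}

/*
 * Choice 2: hold back the fillfactor reserve and report only the
 * remainder, clamped at zero when the page has less than the reserve.
 */
static size_t
sketch_report_after_reserve(size_t actual_free, int fillfactor)
{
    size_t reserve = sketch_reserved_space(fillfactor);

    return (actual_free > reserve) ? actual_free - reserve : 0;
}
```

With fillfactor 90 the reserve works out to 819 bytes, so a page with
3000 bytes free would be reported as 3000 under the first choice and
2181 under the second; a page with only 500 bytes free would be
reported as 0 under the second.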

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap_page_prune comments
Date: 2011-11-04 14:46:24
Message-ID: 7442.1320417984@sss.pgh.pa.us

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Seems a little hackish, though: we'd be reporting an amount of
> freespace that we've deliberately set to an incorrect value. I'm
> almost thinking we should report the freespace that's actually
> available, on the theory that Bloat Is Bad (TM).

IIRC, this code is following the very longstanding precedent of
RelationGetBufferForTuple.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Nasby <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap_page_prune comments
Date: 2011-11-04 15:18:45
Message-ID: CA+TgmoYLBcUgqqdE=sMKwcf5ZCsSsrcPGO8fpjGBKv_TPKdv6w@mail.gmail.com

On Fri, Nov 4, 2011 at 10:46 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Seems a little hackish, though: we'd be reporting an amount of
>> freespace that we've deliberately set to an incorrect value.  I'm
>> almost thinking we should report the freespace that's actually
>> available, on the theory that Bloat Is Bad (TM).
>
> IIRC, this code is following the very longstanding precedent of
> RelationGetBufferForTuple.

I don't understand the analogy - that function isn't freeing any
space, just searching for a block that already has some. And it does
update the free space map if the free space map is found to be out of
date, whereas this function does not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap_page_prune comments
Date: 2011-11-04 18:17:50
Message-ID: 12194.1320430670@sss.pgh.pa.us

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Nov 4, 2011 at 10:46 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> IIRC, this code is following the very longstanding precedent of
>> RelationGetBufferForTuple.

> I don't understand the analogy - that function isn't freeing any
> space, just searching for a block that already has some. And it does
> update the free space map if the free space map is found to be out of
> date, whereas this function does not.

No, I'm talking about what it does at the very bottom, when it's had to
add a new block to the relation:

* XXX should we enter the new page into the free space map immediately,
* or just keep it for this backend's exclusive use in the short run
* (until VACUUM sees it)? Seems to depend on whether you expect the
* current backend to make more insertions or not, which is probably a
* good bet most of the time. So for now, don't add it to FSM yet.

Now, heap_page_prune is in a slightly different place, because it
doesn't actually know whether the current backend is going to make an
insertion or update in the page. If it did know that was going to
happen, then the analogy would be exact.

In any case, the comment in heap_page_prune is ignoring the probability
that VACUUM will eventually visit the page and then update the FSM.
That ought to be factored into any discussion of what to do here.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jim Nasby <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap_page_prune comments
Date: 2011-11-04 18:31:08
Message-ID: CA+TgmoYw=OsuGwz6gWzsa0gN0x8oKHqx5GPFNU3dwUXfGOLifA@mail.gmail.com

On Fri, Nov 4, 2011 at 2:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Now, heap_page_prune is in a slightly different place, because it
> doesn't actually know whether the current backend is going to make an
> insertion or update in the page.  If it did know that was going to
> happen, then the analogy would be exact.

OK.

> In any case, the comment in heap_page_prune is ignoring the probability
> that VACUUM will eventually visit the page and then update the FSM.
> That ought to be factored into any discussion of what to do here.

True. Unfortunately, I have no intuition on what the right thing to
do is, here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: heap_page_prune comments
Date: 2012-08-16 23:03:08
Message-ID: 20120816230308.GD30286@momjian.us

On Wed, Nov 2, 2011 at 12:27:02PM -0400, Robert Haas wrote:
> The following comment - or at least the last sentence thereof -
> appears to be out of date.
>
> /*
> * XXX Should we update the FSM information of this page ?
> *
> * There are two schools of thought here. We may not want to update FSM
> * information so that the page is not used for unrelated UPDATEs/INSERTs
> * and any free space in this page will remain available for further
> * UPDATEs in *this* page, thus improving chances for doing HOT updates.
> *
> * But for a large table and where a page does not receive further UPDATEs
> * for a long time, we might waste this space by not updating the FSM
> * information. The relation may get extended and fragmented further.
> *
> * One possibility is to leave "fillfactor" worth of space in this page
> * and update FSM with the remaining space.
> *
> * In any case, the current FSM implementation doesn't accept
> * one-page-at-a-time updates, so this is all academic for now.
> */
>
> The simple fix here is just to delete that last sentence, but does
> anyone think we ought to change the behavior, now that we have the
> option to do so?

Last sentence removed.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +