Re: Turning off HOT/Cleanup sometimes

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Turning off HOT/Cleanup sometimes
Date: 2015-04-15 12:42:33
Message-ID: CA+U5nM+0h+q4K360F+QPs2eP_qz_rbA5FhRdkZ94mQGpmEpZeg@mail.gmail.com
Lists: pgsql-hackers

On 15 April 2015 at 08:04, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Apr 15, 2015 at 3:37 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>> On 14 April 2015 at 21:53, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Peter commented previously that README.HOT should get an update. The
>>> relevant section seems to be "When can/should we prune or
>>> defragment?".
>>
>> That's easy enough to change once we agree to commit.
>>
>>> I wonder if it would be a useful heuristic to still prune pages if
>>> those pages are already dirty.
>>
>> Useful for who? This is about responsibility. Why should someone
>> performing a large SELECT take the responsibility for cleaning pages?
>
> Because it makes subsequent accesses to the page cheaper.

Cheaper for whom?

> Of
> course, that applies in all cases, but when the page is already dirty,
> the cost of pruning it is probably quite small - we're going to have
> to write the page anyway, and pruning it before it gets evicted
> (perhaps even by our scan) will be cheaper than writing it now and
> writing it again after it's pruned. When the page is clean, the cost
> of pruning is significantly higher.

"We" aren't going to have to write the page, but someone will.

In a single workload, the mix of actions can be useful. In separate
workloads, where some guy just wants to run a report or a backup, it's
not right that we slow them down because of someone else's actions.

> I won't take responsibility for paying my neighbor's tax bill, but I
> might take responsibility for picking up his mail while he's on
> holiday.

That makes it sound like this is an occasional, non-annoying thing.

It's more like, whoever fetches the mail needs to fetch it for
everybody. So we are slowing down one person disproportionately, while
others fly through without penalty. There is no argument that one
workload necessarily needs to perform that on behalf of the other
workload.

The actions you suggest are reasonable and should ideally be the role
of a background process. But that doesn't mean in the absence of that
we should pay the cost in the foreground.

Let me apply this patch.
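
For readers following the thread, the heuristic Robert proposes above can
be sketched roughly as below. This is not the patch and not PostgreSQL
code. The names Page, is_dirty, has_dead_tuples and should_prune are
hypothetical, and the sketch assumes, purely for illustration, a policy in
which read-only scans otherwise skip pruning.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a heap page held in shared buffers."""
    is_dirty: bool          # page must be written out anyway
    has_dead_tuples: bool   # pruning would actually reclaim something


def should_prune(page: Page, scan_is_read_only: bool) -> bool:
    """Decide whether the current scan should prune this page.

    Assumed policy (illustration only):
      - nothing to reclaim: never prune
      - writing scans: prune as before
      - read-only scans: prune only if the page is already dirty,
        since the extra write cost is then likely to be small
    """
    if not page.has_dead_tuples:
        return False
    if not scan_is_read_only:
        return True
    return page.is_dirty


if __name__ == "__main__":
    clean = Page(is_dirty=False, has_dead_tuples=True)
    dirty = Page(is_dirty=True, has_dead_tuples=True)
    print(should_prune(clean, scan_is_read_only=True))
    print(should_prune(dirty, scan_is_read_only=True))
```

The objection raised in this message still applies to the sketch: under
such a rule a reporting or backup query does cleanup work on every dirty
page it happens to touch, on behalf of whichever workload dirtied it.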

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, RemoteDBA, Training & Services
