Re: Turning off HOT/Cleanup sometimes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Turning off HOT/Cleanup sometimes
Date: 2014-02-03 06:42:08
Message-ID: CAA4eK1Kx+07WdKeYwsBUAOg3RmnKu7T2M_+kxxkhyhTbTcrLUA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 2:43 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 8 January 2014 08:33, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> Patch attached, implemented to reduce writes by SELECTs only.

This is a really valuable improvement over the current SELECT behaviour
w.r.t. writes.

While going through the patch I noticed a few points, which I thought
I would share with you:

1.
+ /*
+  * If we are tracking pruning in SELECTs then we can only get
+  * here by heap_page_prune_opt() call that cleans a block,
+  * so in that case, register it as a pruning operation.
+  * Make sure we don't double count during VACUUMs.
+  */
+ if (PrunePageDirtyLimit > -1)
+     PrunePageDirty++;

a. Since the PrunePageDirtyLimit variable is not initialized in the DDL
flow, a statement such as CREATE FUNCTION will run with
PrunePageDirtyLimit at its default value of 4, and in such cases
MarkBufferDirty() will increment the wrong counter.

b. A DDL statement such as CREATE MATERIALIZED VIEW will behave like a
SELECT statement. For example:
    CREATE MATERIALIZED VIEW mv1 AS SELECT * FROM t1;

Here I think it might not be a problem: there will be no writes to t1
anyway, so skipping pruning should be fine, and the materialized view
will have no dead rows either, so skipping there should also be okay.
However, I think it does not strictly adhere to the statement "to reduce
writes by SELECTs only", nor to the purpose of the patch, which is to
avoid pruning only when the top-level statement is a SELECT.
Do you think it is better to consider such cases and optimize for them,
or should we avoid that by following the rule of thumb that pruning is
skipped only for a top-level SELECT?
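
To make points (a) and (b) concrete, here is a minimal standalone sketch,
not code from the patch. Only PrunePageDirtyLimit, PrunePageDirty, the
default of 4 and the "> -1" test come from the text above; the
SketchCommand enum, sketch_begin_statement() and its strict flag are
hypothetical names invented for illustration, and I am assuming that -1
means "not tracking" and that some per-statement hook could reset the
limit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical command tags for illustration; not PostgreSQL's own types. */
typedef enum SketchCommand
{
    SK_SELECT,
    SK_CREATE_MATVIEW,
    SK_CREATE_FUNCTION
} SketchCommand;

/* Names taken from the quoted hunk; assumed: -1 means "not tracking". */
static int PrunePageDirtyLimit = 4;   /* default value mentioned in (a) */
static int PrunePageDirty = 0;

/*
 * Hypothetical per-statement setup.  With strict_top_level_select set,
 * only a top-level SELECT is tracked; otherwise CREATE MATERIALIZED VIEW
 * is treated like a SELECT as well, which is the behaviour noted in (b).
 */
static void
sketch_begin_statement(SketchCommand top_level, bool strict_top_level_select)
{
    bool track;

    assert((int) top_level >= (int) SK_SELECT &&
           (int) top_level <= (int) SK_CREATE_FUNCTION);

    track = (top_level == SK_SELECT) ||
            (!strict_top_level_select && top_level == SK_CREATE_MATVIEW);

    /* Resetting to -1 for everything else avoids the miscount in (a). */
    PrunePageDirtyLimit = track ? 4 : -1;
    PrunePageDirty = 0;
}

/* Stand-in for the quoted hunk: count only while tracking is enabled. */
static void
sketch_mark_buffer_dirty(void)
{
    if (PrunePageDirtyLimit > -1)
        PrunePageDirty++;
}
```

In this sketch a CREATE FUNCTION never bumps the counter because the limit
is explicitly reset, and the strict flag is the single place where the
"top-level SELECT only" question from (b) would be decided. How the real
patch should hook this into statement startup is left open.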

2. + "Allow cleanup of shared buffers by foreground processes, allowing
later cleanup by VACUUM",
This line is not clear. What do you mean by "allowing later cleanup
by VACUUM"? If a foreground process has already done the cleanup, that
should save VACUUM the effort.

In general, although both optimisations used in the patch
(allow_buffer_cleanup and prune_page_dirty_limit) are similar in that
they are used to avoid pruning, I still feel they address different
cases (READ ONLY OP and WRITE ON SMALL TABLES). Also, since more people
are inclined to do this only for SELECT operations, do you think it
would be a good idea to split them into separate patches?

I think some applications or use cases can benefit from avoiding pruning
for WRITE ON SMALL TABLES, but the SELECT case is more general and more
applications can benefit from that optimisation, so it would be better
to accomplish that case first.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
