Re: CTE optimization fence on the todo list?

Lists: pgsql-hackers
From: Daniel Browning <db(at)kavod(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: CTE optimization fence on the todo list?
Date: 2012-09-19 20:05:44
Message-ID: 201209191305.44674.db@kavod.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I would like to have the option of disabling the CTE optimization fence for
certain CTEs and/or queries. Can that be added to the official todo list? If
not, why not?

I would find the option beneficial because large, complicated queries are
often a lot clearer, simpler, and easier to read with CTEs than the
equivalent query without CTEs. In some cases, the query with CTEs is also
faster because of the optimization fence. But in other cases, the fence
makes it a lot slower. In the latter cases, you are left with a choice
between ugly and slow.

If there was some method to disable the optimization fence for certain CTEs
or entire queries, then it would be possible to have the best of both
worlds.

I apologize if this has already been covered before. I could only find two
earlier discussions on this topic:

http://archives.postgresql.org/pgsql-performance/2011-10/msg00208.php
http://archives.postgresql.org/pgsql-performance/2011-11/msg00015.php

In the latter, I counted four people would are in support of the general
idea: Robert Haas, Andres Freund, Gavin Flower, Justin Pitts. However, I'm
sure there are a lot of conflicting ideas on how exactly to go about it,
such as whether to enable or disable it by default, the specific syntax to
use, backwards compatibility, future-proofing, etc.

One good reason to reject it would be if it can't be done with SQL standard
syntax and would require some sort of PG-specific hint or GUC variable for
the query planner. If so, then I understand that it's opposed for all the
same reasons that hints are opposed in general.

Another good reason to reject it might be because the only way to disable
the CTE fence is to disable it by default. If that were the case, then I
would imagine that it would break backwards compatibility, especially in the
case of writable CTEs that currently depend on the fence for their current
functionality. If there is no way to palatably enable it by default but
allow certain CTEs or certain queries to disable it, then I don't see any
way around that problem.

A third reason I can imagine is that the only desirable solution (e.g. the
one without additional non-standard keywords or session GUC variables) is
effectively impossible. For example, if it requires that the query planner
determine definitively whether a CTE is read only or not, that may be a
bridge too far.

A fourth possible reason is that the core team feels that CTEs do not
improve readability, or that any such readability benefits are not worth the
effort to support the option. Personally, I feel that the queries which
could most benefit from the readability of CTEs are precisely the same ones
that could most often benefit from the performance increase of disabling the
fence (particularly if it could be done on a per-CTE basis rather than for
the whole query at once).

Of course the real reason could be something else entirely, hence this post.
Thanks in advance for your feedback.
--
DB


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Daniel Browning <db(at)kavod(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: CTE optimization fence on the todo list?
Date: 2012-10-01 13:05:18
Message-ID: CAHyXU0zpM5+Dsb_pKxDmm-ZoWUAt=SkHHaiK_DBqcmtxTas6Nw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Sep 19, 2012 at 3:05 PM, Daniel Browning <db(at)kavod(dot)com> wrote:
> Another good reason to reject it might be because the only way to disable
> the CTE fence is to disable it by default. If that were the case, then I
> would imagine that it would break backwards compatibility, especially in the
> case of writable CTEs that currently depend on the fence for their current
> functionality.

Yeah: I constantly rely on CTE fencing and it's a frequently suggested
performance trick on the lists. LATERAL is coming out soon and this
will remove one of the largest reasons to fence but there are of
course others. Also, a GUC setting is almost certainly the wrong
approach.

I'm wondering if there are any technical/standards constraints that
are behind the fencing behavior. If there aren't any, maybe an opt-in
keyword might do the trick -- WITH UNBOXED foo AS (..)?

merlin


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Daniel Browning <db(at)kavod(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CTE optimization fence on the todo list?
Date: 2012-10-01 13:09:59
Message-ID: CAEYLb_WUiP7kNR1xv3UWXhukfdFL-nCC05eyog=17ghX+z0DJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1 October 2012 14:05, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Wed, Sep 19, 2012 at 3:05 PM, Daniel Browning <db(at)kavod(dot)com> wrote:
> I'm wondering if there are any technical/standards constraints that
> are behind the fencing behavior. If there aren't any, maybe an opt-in
> keyword might do the trick -- WITH UNBOXED foo AS (..)?

I may be mistaken, but I think that the fence is described in the SQL standard.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Daniel Browning <db(at)kavod(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CTE optimization fence on the todo list?
Date: 2012-10-01 13:45:54
Message-ID: m2vceuuwz1.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> I'm wondering if there are any technical/standards constraints that
> are behind the fencing behavior. If there aren't any, maybe an opt-in

The fencing is per standard, and very useful when used in wCTEs.

> keyword might do the trick -- WITH UNBOXED foo AS (..)?

I would paint your proposal WITH VIEW foo AS (), which would make the
behaviour obvious I think.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Daniel Browning <db(at)kavod(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CTE optimization fence on the todo list?
Date: 2012-10-01 14:07:01
Message-ID: 23258.1349100421@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> I'm wondering if there are any technical/standards constraints that
> are behind the fencing behavior.

I think the key reason is that we don't want partial execution of DML
operations (ie, INSERT/UPDATE/DELETE inside a WITH). The fencing
behavior was put in originally because we foresaw adding DML later.

We could possibly relax the rule for WITH SELECT only, but it would
be rather inconsistent, not to mention unpleasant for all the people
who are relying on the current behavior for one reason or another.

Another issue is that if the CTE is scanned multiple times by the outer
query, you really can't optimize it on the basis of any one call site.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Daniel Browning <db(at)kavod(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CTE optimization fence on the todo list?
Date: 2012-10-01 14:49:29
Message-ID: 20121001144929.GB30089@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Oct 1, 2012 at 10:07:01AM -0400, Tom Lane wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> > I'm wondering if there are any technical/standards constraints that
> > are behind the fencing behavior.
>
> I think the key reason is that we don't want partial execution of DML
> operations (ie, INSERT/UPDATE/DELETE inside a WITH). The fencing
> behavior was put in originally because we foresaw adding DML later.
>
> We could possibly relax the rule for WITH SELECT only, but it would
> be rather inconsistent, not to mention unpleasant for all the people
> who are relying on the current behavior for one reason or another.
>
> Another issue is that if the CTE is scanned multiple times by the outer
> query, you really can't optimize it on the basis of any one call site.

If we wanted to relax the fencing, we might need to do it via an SQL
keyword on the SELECT, to avoid the confusion caused by GUCs.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Daniel Browning <db(at)kavod(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: CTE optimization fence on the todo list?
Date: 2012-10-01 16:06:19
Message-ID: 20121001160619.GU1267@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> If we wanted to relax the fencing, we might need to do it via an SQL
> keyword on the SELECT, to avoid the confusion caused by GUCs.

I like the idea of providing a way for users to request non-fencing,
perhaps only allowed for SELECT CTEs. I don't like the GUC approach. I
also wonder if it'd make sense and/or be possible to have the fence
applied on a per-CTE basis (inside of the same overall query). If we
add a keyword for this and it's not hard to do, I think that'd be a
really neat capability. (No, unlike the OP, I don't have specific use
cases for that offhand, but why limit it to all or nothing for an entire
query..)

Thanks,

Stephen