Re: WIP: parameterized function scan

Lists: pgsql-hackers
From: Antonin Houska <antonin(dot)houska(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: WIP: parameterized function scan
Date: 2012-05-11 21:52:39
Message-ID: 4FAD8A27.1030106@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,
following this short discussion
http://archives.postgresql.org/message-id/4F5AA202.9020906@gmail.com
I gave it one more try and hacked the optimizer so that function can
become an inner relation in NL join, parametrized with values from the
outer relation.

I tried to explain my thoughts in comments. Other than that:

1. I haven't tried to use SpecialJoinInfo to constrain join order. Even
if the order matters in query like
SELECT * from a, func(a.i)
it's still inner join by nature. SpecialJoinInfo is used for INNER join
rarely, but never stored in PlannerInfo. Doing so only for these lateral
functions would be rather disruptive.

2. Simple SQL function (i.e. one that gets pulled-up into the main
query) is a special case. The query that results from such a pull-up no
longer contains any function (and thus is not affected by this patch)
but such cases need to be newly taken into account and examined / tested
(the patch unblocks them at parsing stage too).

3. There might be some open questions about SQL conformance.

I've spent quite a while looking into the optimizer code and after all I
was surprised that it didn't require that many changes. At least to make
few simple examples work. Do I ignore any important fact(s) ?

Thanks,
Tony.

Attachment Content-Type Size
parametrized_function_scan.patch text/x-patch 8.9 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Antonin Houska <antonin(dot)houska(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: parameterized function scan
Date: 2012-05-22 19:31:08
Message-ID: CA+TgmoaeysfJ1uS-vqAtUwEr7MCaewtNMWrgn_C-MGG9qwqdmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, May 11, 2012 at 5:52 PM, Antonin Houska
<antonin(dot)houska(at)gmail(dot)com> wrote:
> Hello,
> following this short discussion
> http://archives.postgresql.org/message-id/4F5AA202.9020906@gmail.com
> I gave it one more try and hacked the optimizer so that function can become
> an inner relation in NL join, parametrized with values from the outer
> relation.
>
> I tried to explain my thoughts in comments. Other than that:
>
> 1. I haven't tried to use SpecialJoinInfo to constrain join order. Even if
> the order matters in query like
> SELECT * from a, func(a.i)
> it's still inner join by nature. SpecialJoinInfo is used for INNER join
> rarely, but never stored in PlannerInfo. Doing so only for these lateral
> functions would be rather disruptive.
>
> 2. Simple SQL function (i.e. one that gets pulled-up into the main query) is
> a special case. The query that results from such a pull-up no longer
> contains any function (and thus is not affected by this patch) but such
> cases need to be newly taken into account and examined / tested (the patch
> unblocks them at parsing stage too).
>
> 3. There might be some open questions about SQL conformance.
>
> I've spent quite a while looking into the optimizer code and after all I was
> surprised that it didn't require that many changes. At least to make few
> simple examples work. Do I ignore any important fact(s) ?

This implementation looks different than I'd expect: I would have
thought that it would work by generating paths with param_info set to
the appropriate set of rels to provide the necessary values, rather
than inventing its own mechanism for forcing a nestloop.

Also, I think we will want something that implements the LATERAL()
syntax, rather than just removing the prohibition on lateral
references.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Antonin Houska <antonin(dot)houska(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: parameterized function scan
Date: 2012-05-23 22:01:52
Message-ID: 4FBD5E50.1020301@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 05/22/2012 09:31 PM, Robert Haas wrote:
> This implementation looks different than I'd expect: I would have
> thought that it would work by generating paths with param_info set to
> the appropriate set of rels to provide the necessary values, rather
> than inventing its own mechanism for forcing a nestloop.
My consideration was something like: clauses are responsible for the
fact that parameter info is path-specific. However function parameters
have little to do with clauses, so they should be treated rather
separate, at higher level than that of individual paths.

Looking at the code again, I admit there's yet no strong reason not to
use the existing 'parameterization infrastructure'. I'll try to think up
something better.

> Also, I think we will want something that implements the LATERAL()
> syntax, rather than just removing the prohibition on lateral references.

So you think it's not good to first implement (implicitly) lateral
functions and add the explicit LATERAL() operator later, when the
optimizer can handle lateral subqueries too? (explicit LATERAL() that
only can take function as the argument and not subquery sounds weird).

I personally need the lateral functions much more than lateral
subqueries, but I have no impression of the attitude of (majority of)
other users.

While I believe I'm (slowly though) approaching a viable concept of
lateral functions, it's quite clear to me that the lateral subqueries
are obviously a bigger challenge.

These are both the reasons why I started to think about the functions
as a separate patch.

Tony H.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Antonin Houska <antonin(dot)houska(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: parameterized function scan
Date: 2012-05-23 22:46:53
Message-ID: 14239.1337813213@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Antonin Houska <antonin(dot)houska(at)gmail(dot)com> writes:
> On 05/22/2012 09:31 PM, Robert Haas wrote:
>> Also, I think we will want something that implements the LATERAL()
>> syntax, rather than just removing the prohibition on lateral references.

> So you think it's not good to first implement (implicitly) lateral
> functions and add the explicit LATERAL() operator later, when the
> optimizer can handle lateral subqueries too?

Well, it's not per spec: what you did accepts queries that are invalid
per spec and are very likely to be errors rather than intentional
invocations of the LATERAL facility. This might be all right for
experimental playing around, but there is zero chance that we will
commit it that way.

Quite aside from the error-detection issue, I would think that we would
want explicit LATERAL JOIN syntax which we'd turn into join order
restrictions (cf SpecialJoinInfo), so that the planner need not waste
time searching join orders that can't possibly work because of the
presence of the parameter references. This situation is unlike the
standard nestloop-with-parameterized-inner-scan case, because in that
case the parameterized path is merely one option of many. With a
LATERAL reference, there are no other implementation options.

> I personally need the lateral functions much more than lateral
> subqueries, but I have no impression of the attitude of (majority of)
> other users.
> While I believe I'm (slowly though) approaching a viable concept of
> lateral functions, it's quite clear to me that the lateral subqueries
> are obviously a bigger challenge.
> These are both the reasons why I started to think about the functions
> as a separate patch.

Quite honestly, I think that something that only handles functions as
LATERAL righthands is broken by design. It doesn't meet the spec, and
it's unlikely to represent much of a step towards a full implementation.
Remember Polya's Inventor's Paradox: the more general problem may be
easier to solve.

regards, tom lane


From: Antonin Houska <antonin(dot)houska(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: parameterized function scan
Date: 2012-06-07 23:04:53
Message-ID: 4FD13395.2010608@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 05/24/2012 12:46 AM, Tom Lane wrote:
> Well, it's not per spec: what you did accepts queries that are invalid
> per spec and are very likely to be errors rather than intentional
> invocations of the LATERAL facility. This might be all right for
>
I think I saw queries where function is joined with no explicit LATERAL().

Nevertheless...
> Quite honestly, I think that something that only handles functions as
> LATERAL righthands is broken by design. It doesn't meet the spec, and
> it's unlikely to represent much of a step towards a full implementation.
> Remember Polya's Inventor's Paradox: the more general problem may be
> easier to solve.
... sounds more serious. I'll keep it in mind if I get the impression
that I have a new idea about this problem anytime. Thanks,
Tony H.