Re: recursive query crash

Lists: pgsql-hackers
From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: recursive query crash
Date: 2008-10-11 13:37:03
Message-ID: 87zllbxngg.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


This crashes, apparently it tries to look up the result type on a NULL
planstate:

with recursive z(i) as (
select *
from t
union all
(with a(i) as (select * from z)
select * from a)
)
select * from z;

Incidentally, why are the parentheses required around the second branch of the
union?

#0 0x081b0d39 in ExecGetResultType (planstate=0x0) at execUtils.c:535
#1 0x081becfe in ExecInitWorkTableScan (node=0x851fc08, estate=0x8523220,
eflags=0) at nodeWorktablescan.c:140
#2 0x081a6698 in ExecInitNode (node=0x851fc08, estate=0x8523220, eflags=0)
at execProcnode.c:218
#3 0x081a3569 in InitPlan (queryDesc=0x8522e18, eflags=0) at execMain.c:676
#4 0x081a2cd4 in ExecutorStart (queryDesc=0x8522e18, eflags=0)
at execMain.c:202
#5 0x0826e74e in PortalStart (portal=0x851ae00, params=0x0, snapshot=0x0)
at pquery.c:528
#6 0x08269529 in exec_simple_query (
query_string=0x8516df0 "with recursive z(i) as (select * from t union all (with a(i) as (select * from z) select * from a)) select * from z;")
at postgres.c:955
#7 0x0826d384 in PostgresMain (argc=4, argv=0x84878a8,
username=0x8487880 "stark") at postgres.c:3569
#8 0x08238a67 in BackendRun (port=0x84a24c0) at postmaster.c:3258
#9 0x08237fa2 in BackendStartup (port=0x84a24c0) at postmaster.c:2872
#10 0x082358e4 in ServerLoop () at postmaster.c:1283
#11 0x0823524e in PostmasterMain (argc=3, argv=0x8484428) at postmaster.c:1031
#12 0x081cfa89 in main (argc=3, argv=0x8484428) at main.c:188

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: recursive query crash
Date: 2008-10-11 15:25:09
Message-ID: 21910.1223738709@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> This crashes, apparently it tries to look up the result type on a NULL
> planstate:

Tsk tsk, running CVS HEAD without Asserts?

It looks like things are getting initialized in the wrong order. Maybe
we need to attach the initplan lower down.

> Incidentally, why are the parentheses required around the second branch of the
> union?

AFAICT, the spec's grammar won't let you put WITH there at all, even
with the parens; but it's definitely invalid without --- note their
distinction between <query expression> and <query expression body>.
We'll see if we can make it work in PG though.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: recursive query crash
Date: 2008-10-11 23:03:48
Message-ID: 29499.1223766228@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> This crashes, apparently it tries to look up the result type on a NULL
> planstate:
> with recursive z(i) as (
> select *
> from t
> union all
> (with a(i) as (select * from z)
> select * from a)
> )
> select * from z;

Hmm ... I tried to fix this by changing the order in which the subplans
get initialized, but that just moves the crash to the other subplan.
The problem really is that what we have here is two mutually recursive
WITH entries, and neither one can be successfully initialized in the
executor before the other one is. I begin to see why the SQL spec
forbids this syntax altogether.

I'm inclined to prevent this case by forbidding recursive references
inside nested WITH clauses. Thoughts?

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: recursive query crash
Date: 2008-10-12 00:43:40
Message-ID: 87myhay75v.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> This crashes, apparently it tries to look up the result type on a NULL
>> planstate:
>> with recursive z(i) as (
>> select *
>> from t
>> union all
>> (with a(i) as (select * from z)
>> select * from a)
>> )
>> select * from z;
>
> Hmm ... I tried to fix this by changing the order in which the subplans
> get initialized, but that just moves the crash to the other subplan.
> The problem really is that what we have here is two mutually recursive
> WITH entries, and neither one can be successfully initialized in the
> executor before the other one is. I begin to see why the SQL spec
> forbids this syntax altogether.
>
> I'm inclined to prevent this case by forbidding recursive references
> inside nested WITH clauses. Thoughts?

I'm a bit puzzled where the root of the problem lies here. Surely the nested
with clause is just equivalent to a plain "select * from z" after all. Is it
just that the WITH clause is making it hard for the recursive planner to
recognize that this is in fact the recursive side of the union and needs
special treatment? What else might confuse it?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: recursive query crash
Date: 2008-10-12 02:21:44
Message-ID: 2351.1223778104@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> I'm inclined to prevent this case by forbidding recursive references
>> inside nested WITH clauses. Thoughts?

> I'm a bit puzzled where the root of the problem lies here. Surely the nested
> with clause is just equivalent to a plain "select * from z" after all.

If we were to flatten it to a plain "select * from z" then maybe things
would work all right, but the present implementation treats both WITH
clauses as equally requiring single evaluation.

I don't find the flattening argument to be especially compelling
anyway. Suppose that the query containing the nested WITH refers
to the WITH query more than once (or more than twice, or whatever
your threshold of pain is before you agree that single evaluation is
required). Should the query suddenly become invalid at that point?

regards, tom lane


From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: recursive query crash
Date: 2008-10-12 12:28:07
Message-ID: 603c8f070810120528k33549312qac7e5b7f275ed593@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> If we were to flatten it to a plain "select * from z" then maybe things
> would work all right, but the present implementation treats both WITH
> clauses as equally requiring single evaluation.

Surely it should be a single evaluation for each time that branch of
the upper WITH is recursively evaluated? I can't think what the
semantics are otherwise. a is a function of z, so you can't change
the definition of z and pretend like it's OK that a still has the same
contents as before.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: recursive query crash
Date: 2008-10-13 00:44:09
Message-ID: 18403.1223858649@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

OK, I found a real solution: we can fix nodeWorktablescan.c so that the
order of initialization doesn't matter, by postponing the steps that
need information from the RecursiveUnion node until the first exec call
for the worktable node.

Although nodeCtescan is making a similar assumption about initialization
order, I think that one ought to be safe. We can always change it later
if need be.

regards, tom lane