Re: review: Non-recursive processing of AND/OR lists

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Gurjeet Singh <gurjeet(at)singh(dot)im>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: review: Non-recursive processing of AND/OR lists
Date: 2013-07-18 13:22:21
Message-ID: CA+Tgmoam3eTBdNnTtS3_GKJhcFgak5toWGKOWzP95qYz1X87=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 17, 2013 at 2:03 PM, Gurjeet Singh <gurjeet(at)singh(dot)im> wrote:
> In v6 of the patch, I have deferred the 'pending' list initialization to
> until we actually hit a candidate right-branch. So in the common case the
> pending list will never be populated, and if we find a bushy or right-deep
> tree (for some reason an ORM/tool may choose to build AND/OR lists that may
> end being right-deep when in Postgres), then the pending list will be used
> to process them iteratively.
>
> Does that alleviate your concern about 'pending' list management causing an
> overhead.
>
> Agreed that bushy/right-deep trees are a remote corner case, but we are
> addressing a remote corner case in the first place (insanely long AND lists)
> and why not handle another remote corner case right now if it doesn't cause
> an overhead for common case.

Because simpler code is less likely to have bugs and is easier to
maintain. It's worth noting that the change you're proposing is in
fact user-visible, as demonstrated by the fact that you had to update
the regression test output:

- | WHERE (((rsl.sl_color =
rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm)) AND
(rsl.sl_len_cm <= rsh.slmaxlen_cm));
+ | WHERE ((rsl.sl_color =
rsh.slcolor) AND (rsl.sl_len_cm >= rsh.slminlen_cm) AND (rsl.sl_len_cm
<= rsh.slmaxlen_cm));

Now, I think that change is actually an improvement, because here's
what that WHERE clause looked like when it was entered:

WHERE rsl.sl_color = rsh.slcolor
AND rsl.sl_len_cm >= rsh.slminlen_cm
AND rsl.sl_len_cm <= rsh.slmaxlen_cm;

But flattening a = 1 AND (b = 1 AND c = 1 AND d = 1) AND e = 1 to a
flat list doesn't have any of the same advantages.

At the end of the day, this is a judgement call, and I'm giving you
mine. If somebody else wants to weigh in, that's fine. I can't think
of anything that would actually be outright broken under your proposed
approach, but my personal feeling is that it's better to only add the
amount of code that we know is needed to solve the problem actually
observed in practice, and no more.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2013-07-18 14:12:08 Re: Return of "can't paste into psql" issue
Previous Message Merlin Moncure 2013-07-18 12:59:23 Re: Return of "can't paste into psql" issue