Re: Handle LIMIT/OFFSET before select clause (was: Feature request: optimizer improvement)

From: Joe Love <joe(at)primoweb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Jim Nasby <jim(at)nasby(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Handle LIMIT/OFFSET before select clause (was: Feature request: optimizer improvement)
Date: 2013-11-05 17:16:15
Message-ID: CAK3BLoSH7Ts+d1uKXdjdWPqqvqr3RgN=JNKE8SpYUP_p=SXsFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I'm wondering what type of index would work for this as it is a volatile
function. Not knowing how PGs optimizer runs, I'm at a loss as to why this
wouldn't be possible or worth doing. It seems to me that all functions in
the "select" part of the statement could be calculated at the end of the
query after the results have been gathered, and even after the sorting had
been done as long as the column wasn't part of the order by (or perhaps
group by).

I have an entire set of functions that perform in this way. For example,
I'm selecting a list of all my products and the function does a complex
calculation based on inventory in the warehouse + expected deliveries from
the factory to determine how many of each item is available, and when they
first become available. What's helpful is for the users search criteria
to initially limit the search result, and then I want to paginate the
results and only show them a few at a time. In the verbose syntax I
mentioned originally, the query performs well, in the most straightforward
syntax, it does not. I'm not sure I even need to "hint" the optimizer to
perform this type of an optimization as it seems it would be beneficial (or
at least not detrimental) 100% of the time.

On Sat, Nov 2, 2013 at 10:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Atri Sharma <atri(dot)jiit(at)gmail(dot)com> writes:
> > I understand the reasons for executing SELECT before the sort. But,
> > couldnt we get the planner to see the LIMIT part and push the sort
> > node above the select node for this specific case?
>
> [ Shrug... ] I don't see the point. If the OP actually cares about the
> speed of this query, he's going to want to avoid the sort step too,
> which is what makes the index a good idea.
>
> More generally, this is not a transformation we could apply
> unconditionally --- at the very least it'd need to be avoided when
> volatile functions are involved, and I don't think it's invariably
> a win from a cost standpoint even if there aren't semantic blockers.
> But right now the planner has no real ability to reason about placement
> of SELECT-list evaluation: it's done in a fixed spot in any particular
> plan structure. I don't think it's practical to add such considerations
> to the rat's nest that is grouping_planner and friends. I have
> ambitions to replace all that with a Path-generation-and-comparison
> approach, and the Paths used for this would need to carry some
> indication of which expressions would be evaluated where. So maybe
> once that's done we could think about whether this is worth doing.
> I remain dubious though.
>
> regards, tom lane
>

--
Joe's Computer Service
405-227-0951
Computer Running Slow? Call Joe!
$125, Your computer will run like new!
www.JoesComputerService.net

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2013-11-05 17:28:56 Re: [PATCH] configure: add git describe output to PG_VERSION when building a git tree
Previous Message Jeff Janes 2013-11-05 16:57:06 Re: Fast insertion indexes: why no developments