Re: Patch for removng unused targets

From: "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: "'Craig Ringer'" <craig(at)2ndQuadrant(dot)com>, "'Alexander Korotkov'" <aekorotkov(at)gmail(dot)com>
Cc: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "'pgsql-hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch for removng unused targets
Date: 2013-01-22 05:24:47
Message-ID: 016d01cdf860$cb1fcc20$615f6460$@lab.ntt.co.jp
Lists: pgsql-hackers

I'd like to rework this optimization and submit a patch at the next CF. Is
that okay?

Thanks,

Best regards,

Etsuro Fujita

From: Craig Ringer [mailto:craig(at)2ndQuadrant(dot)com]
Sent: Friday, January 18, 2013 8:30 PM
To: Alexander Korotkov
Cc: Tom Lane; Etsuro Fujita; pgsql-hackers
Subject: Re: [HACKERS] Patch for removng unused targets

On 12/05/2012 04:15 AM, Alexander Korotkov wrote:

On Tue, Dec 4, 2012 at 11:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

Alexander Korotkov <aekorotkov(at)gmail(dot)com> writes:
> On Mon, Dec 3, 2012 at 8:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>> But having said that, I'm wondering (without having read the patch)
>> why you need anything more than the existing "resjunk" field.

> Actually, I don't know all the cases in which the "resjunk" flag is set. Is it
> reliable to decide that a target is used only for "ORDER BY" if it's "resjunk"
> and neither system nor used in grouping? If so, or if there are other
> cases that are easy to determine, then I'll remove the "resorderbyonly" flag.

resjunk means that the target is not supposed to be output by the query.
Since it's there at all, it's presumably referenced by ORDER BY or GROUP
BY or DISTINCT ON, but the meaning of the flag doesn't depend on that.

What you would need to do is verify that the target is resjunk and not
used in any clause besides ORDER BY. I have not read your patch, but
I rather imagine that what you've got now is that the parser checks this
and sets the new flag for consumption far downstream. Why not just make
the same check in the planner?
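
A rough illustration of the check described above, written as a toy Python model rather than PostgreSQL source. The classes are hypothetical stand-ins; only the field names resjunk and ressortgroupref are borrowed from the real TargetEntry node, and the three reference lists are an assumed simplification of the query's ORDER BY, GROUP BY and DISTINCT clause lists. Window clauses and any other users of a target are not modeled.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TargetEntry:
    """Hypothetical stand-in for a tlist entry."""
    name: str
    resjunk: bool = False      # True: not part of the query's visible output
    ressortgroupref: int = 0   # 0: not referenced by any sort/group clause


@dataclass
class Query:
    """Hypothetical stand-in: each clause is reduced to the refs it uses."""
    target_list: List[TargetEntry]
    sort_refs: List[int] = field(default_factory=list)      # ORDER BY
    group_refs: List[int] = field(default_factory=list)     # GROUP BY
    distinct_refs: List[int] = field(default_factory=list)  # DISTINCT [ON]


def order_by_only_targets(query: Query) -> List[TargetEntry]:
    """Return resjunk targets that are referenced by ORDER BY and by
    none of the other modeled clauses."""
    sort = set(query.sort_refs)
    other = set(query.group_refs) | set(query.distinct_refs)
    return [
        te for te in query.target_list
        if te.resjunk
        and te.ressortgroupref in sort
        and te.ressortgroupref not in other
    ]


# Roughly: SELECT a FROM t ORDER BY b
q1 = Query(
    target_list=[TargetEntry("a"), TargetEntry("b", resjunk=True, ressortgroupref=1)],
    sort_refs=[1],
)

# Roughly: SELECT a FROM t GROUP BY a, c ORDER BY c
q2 = Query(
    target_list=[
        TargetEntry("a", ressortgroupref=1),
        TargetEntry("c", resjunk=True, ressortgroupref=2),
    ],
    sort_refs=[2],
    group_refs=[1, 2],
)

print([te.name for te in order_by_only_targets(q1)])  # ['b']
print([te.name for te in order_by_only_targets(q2)])  # []
```

In the second query the junk target is also needed for grouping, so it does not qualify, which is the distinction the resjunk flag alone cannot express.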

A more invasive, but possibly cleaner in the long run, approach is to
strip all resjunk targets from the query's tlist at the start of
planning and only put them back if needed.
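
The strip-and-restore idea can be sketched the same way. This is only a toy model under stated assumptions: Target, strip_resjunk and restore_needed are hypothetical names, and the notion that each candidate plan reports the set of junk targets it needs is an assumption made for illustration, not something the planner is known to provide.

```python
from typing import List, NamedTuple, Set, Tuple


class Target(NamedTuple):
    """Hypothetical stand-in for a tlist entry."""
    name: str
    resjunk: bool


def strip_resjunk(tlist: List[Target]) -> Tuple[List[Target], List[Target]]:
    """Split the tlist into visible targets and set-aside junk targets."""
    visible = [t for t in tlist if not t.resjunk]
    junk = [t for t in tlist if t.resjunk]
    return visible, junk


def restore_needed(visible: List[Target], junk: List[Target],
                   needed: Set[str]) -> List[Target]:
    """Rebuild a tlist for one plan, re-adding only the junk targets
    that this plan says it needs (assumed input)."""
    return visible + [t for t in junk if t.name in needed]


tlist = [Target("a", False), Target("expensive_fn(b)", True)]
visible, junk = strip_resjunk(tlist)

# A plan that sorts explicitly needs the sort expression in its tlist.
sort_plan_tlist = restore_needed(visible, junk, {"expensive_fn(b)"})

# A plan that already receives rows in the required order (assumed here)
# would not need the expression evaluated again.
ordered_plan_tlist = restore_needed(visible, junk, set())

print([t.name for t in sort_plan_tlist])
print([t.name for t in ordered_plan_tlist])
```

Note that the two plans end up with different tlists, which is exactly where this approach runs into the planner assumption discussed next.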

BTW, when I looked at this a couple years ago, it seemed like the major
problem was that the planner assumes that all plans for the query should
emit the same tlist, and thus that tlist eval cost isn't a
distinguishing factor. Breaking that assumption seemed to require
rather significant refactoring. I never found the time to try to
actually do it.

Maybe there is some way to keep the items in the tlist but avoid actually
evaluating them?

Did you make any headway on this? Is there work in a state that's likely to be
committable for 9.3, or is it perhaps best to defer this to post-9.3 pending
further work and review?

https://commitfest.postgresql.org/action/patch_view?id=980

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
