ORDER BY and LIMIT not propagated on inherited tables / UNIONs

Lists: pgsql-performance
From: Matteo Beccati <php(at)beccati(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: ORDER BY and LIMIT not propagated on inherited tables / UNIONs
Date: 2005-09-02 10:20:57
Message-ID: 43182789.4070307@beccati.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi,

I'm using inherited tables to partition some data which can grow very
large. Recently I discovered that a simple query that on a regular table
would use an index was instead using seq scans (70s vs a guessed 2s).
The well known query is:

SELECT foo FROM bar ORDER BY foo DESC LIMIT 1

(The same applies for SELECT MIN(foo) FROM bar using 8.1)

The query plan generated when running the query on a table which has
inheritance forces the planner to choose a seq_scan for each table.
Wouldn't be a good thing to also promote ORDER BYs and LIMITs to each
subscan (like WHERE does)?

I needed a quick solution, so I wrote a function which looks each
inherited table separately and my problem is partially solved, but I
think that a (hopefully) little change in the optimizer could be advisable.

Attached are some EXPLAIN ANALYZE outputs of my suggestion.

Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com

Attachment Content-Type Size
order_limit_inheritance.txt plain/text 12.0 KB

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Matteo Beccati <php(at)beccati(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY and LIMIT not propagated on inherited
Date: 2005-09-02 12:35:32
Message-ID: 1125664532.3956.278.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Fri, 2005-09-02 at 12:20 +0200, Matteo Beccati wrote:

> I'm using inherited tables to partition some data which can grow very
> large. Recently I discovered that a simple query that on a regular table
> would use an index was instead using seq scans (70s vs a guessed 2s).
> The well known query is:
>
> SELECT foo FROM bar ORDER BY foo DESC LIMIT 1
>
> (The same applies for SELECT MIN(foo) FROM bar using 8.1)
>
>
> The query plan generated when running the query on a table which has
> inheritance forces the planner to choose a seq_scan for each table.
> Wouldn't be a good thing to also promote ORDER BYs and LIMITs to each
> subscan (like WHERE does)?

The tuple_fraction implied by LIMIT is already passed through to each
child table when using an inherited table structure. This would then be
taken into account when plans are made for each child table. I don't
think the situation you observe occurs as a result of query planning.

Do your child tables have indexes on them? Indexes are not inherited
onto child tables, so it is possible that there is no index for the
planner to elect to use.

Best Regards, Simon Riggs


From: Matteo Beccati <php(at)beccati(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY and LIMIT not propagated on inherited
Date: 2005-09-02 12:54:39
Message-ID: 43184B8F.2090505@beccati.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Simon Riggs wrote:
>>The query plan generated when running the query on a table which has
>>inheritance forces the planner to choose a seq_scan for each table.
>>Wouldn't be a good thing to also promote ORDER BYs and LIMITs to each
>>subscan (like WHERE does)?
>
> The tuple_fraction implied by LIMIT is already passed through to each
> child table when using an inherited table structure. This would then be
> taken into account when plans are made for each child table. I don't
> think the situation you observe occurs as a result of query planning.
>
> Do your child tables have indexes on them? Indexes are not inherited
> onto child tables, so it is possible that there is no index for the
> planner to elect to use.

In this cases the tuple_fraction is useless if the planner doesn't know
that a ORDER BY on each child table is requested. In fact the sort is
applied after all the rows are appended. The correct strategy IMHO would
be applying the order by and limit for each child table (which results
in an index scan, if possible), appending, then finally sorting a bunch
of rows, and limiting again.

Every table has indexes, as you can see in the third attacheed EXPLAIN
ANALYZE output.

Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Matteo Beccati <php(at)beccati(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY and LIMIT not propagated on inherited
Date: 2005-09-02 14:40:54
Message-ID: 14074.1125672054@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Matteo Beccati <php(at)beccati(dot)com> writes:
> The correct strategy IMHO would
> be applying the order by and limit for each child table (which results
> in an index scan, if possible), appending, then finally sorting a bunch
> of rows, and limiting again.

This would be a win in some cases, and in many others a loss (ie, wasted
sort steps). The hard part is determining when to apply it.

regards, tom lane


From: Matteo Beccati <php(at)beccati(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY and LIMIT not propagated on inherited
Date: 2005-09-02 15:00:50
Message-ID: 43186922.4060703@beccati.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi,

>>The correct strategy IMHO would
>>be applying the order by and limit for each child table (which results
>>in an index scan, if possible), appending, then finally sorting a bunch
>>of rows, and limiting again.
>
> This would be a win in some cases, and in many others a loss (ie, wasted
> sort steps). The hard part is determining when to apply it.

I don't actually know how many smaller separate sorts compare to a
single big sort, but I guess the difference wouldn't be so big if the
LIMIT is low. Add to this that you don't need to append the whole
rowsets, but just smaller ones.

Best regards
--
Matteo Beccati
http://phpadsnew.com
http://phppgads.com


From: Jens-Wolfhard Schicke <ml+pgsql-performance(at)asco(dot)de>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY and LIMIT not propagated on inherited
Date: 2005-09-02 15:09:18
Message-ID: 6A72374A62C8721E4E414937@[192.168.1.72]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

>>> The correct strategy IMHO would
>>> be applying the order by and limit for each child table (which results
>>> in an index scan, if possible), appending, then finally sorting a bunch
>>> of rows, and limiting again.
>>
>> This would be a win in some cases, and in many others a loss (ie, wasted
>> sort steps). The hard part is determining when to apply it.
>
> I don't actually know how many smaller separate sorts compare to a single
> big sort, but I guess the difference wouldn't be so big if the LIMIT is
> low. Add to this that you don't need to append the whole rowsets, but
> just smaller ones.

I think if you have a bunch of sorted thingies, you'd perform exactly one
merge step and be done, should be possible to do that in O(child_tables *
rows)...

Mit freundlichem Gruß
Jens Schicke
--
Jens Schicke j(dot)schicke(at)asco(dot)de
asco GmbH http://www.asco.de
Mittelweg 7 Tel 0531/3906-127
38106 Braunschweig Fax 0531/3906-400


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Matteo Beccati <php(at)beccati(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY and LIMIT not propagated on inherited
Date: 2005-09-02 17:54:56
Message-ID: 1125683697.3956.310.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Fri, 2005-09-02 at 12:20 +0200, Matteo Beccati wrote:
> I'm using inherited tables to partition some data which can grow very
> large. Recently I discovered that a simple query that on a regular table
> would use an index was instead using seq scans (70s vs a guessed 2s).
> The well known query is:
>
> SELECT foo FROM bar ORDER BY foo DESC LIMIT 1
>
> (The same applies for SELECT MIN(foo) FROM bar using 8.1)
>

Returning to Matteo's original query, what we are saying is that the new
optimization for MIN/MAX queries doesn't work with inherited tables.

It could do, by running optimize_minmax_aggregates() for each query that
gets planned to see if a better plan exists for each child table.

I think that's a TODO item.

Optimizing ORDER BY and LIMIT down looks like it would be harder to do
in the general case, even if Matteo's simple transform looks good. I'm
not sure it's a very common query type though...

Best Regards, Simon Riggs


From: Matteo Beccati <php(at)beccati(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY and LIMIT not propagated on inherited
Date: 2005-09-02 18:33:51
Message-ID: 43189B0F.2000904@beccati.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Simon Riggs wrote:
> Returning to Matteo's original query, what we are saying is that the new
> optimization for MIN/MAX queries doesn't work with inherited tables.
>
> It could do, by running optimize_minmax_aggregates() for each query that
> gets planned to see if a better plan exists for each child table.
>
> I think that's a TODO item.

Great. Of course I'm using ORDER BY ... LIMIT as a workaround to get the
index scan on pre-8.1, and because I'm used to it insted of the
previously not optimized MIN/MAX aggregates.

> Optimizing ORDER BY and LIMIT down looks like it would be harder to do
> in the general case, even if Matteo's simple transform looks good. I'm
> not sure it's a very common query type though...

If I can find some time, I'll try to write some hacks... I just need to
find out where to start ;)

Best regards
--
Matteo Beccati
http://phpadsnew.com/
http://phppgads.com/


From: Ramesh kumar <mbrameshkumar(at)gmail(dot)com>
To: Matteo Beccati <php(at)beccati(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: ORDER BY and LIMIT not propagated on inherited
Date: 2005-09-02 20:08:53
Message-ID: bda43614050902130816116052@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi all
I have got lot of information from ur group.
Now i want to relieve from this group.
I kindly request all of you.
Plz unsubcribe me.
Thankz a lot
Ramesh

On 9/3/05, Matteo Beccati <php(at)beccati(dot)com> wrote:
>
> Simon Riggs wrote:
> > Returning to Matteo's original query, what we are saying is that the new
> > optimization for MIN/MAX queries doesn't work with inherited tables.
> >
> > It could do, by running optimize_minmax_aggregates() for each query that
> > gets planned to see if a better plan exists for each child table.
> >
> > I think that's a TODO item.
>
> Great. Of course I'm using ORDER BY ... LIMIT as a workaround to get the
> index scan on pre-8.1, and because I'm used to it insted of the
> previously not optimized MIN/MAX aggregates.
>
> > Optimizing ORDER BY and LIMIT down looks like it would be harder to do
> > in the general case, even if Matteo's simple transform looks good. I'm
> > not sure it's a very common query type though...
>
> If I can find some time, I'll try to write some hacks... I just need to
> find out where to start ;)
>
>
> Best regards
> --
> Matteo Beccati
> http://phpadsnew.com/
> http://phppgads.com/
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

--
urs

RameshKumar.M