Re: slow query plans caused by under-estimation of CTE cardinality

From: Vitalii Tymchyshyn <tivv00(at)gmail(dot)com>
To: John Lumby <johnlumby(at)hotmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: slow query plans caused by under-estimation of CTE cardinality
Date: 2013-02-18 20:40:37
Message-ID: CABWW-d3um0XrHEqORRSTGezV=vY3W0CBN=UJP245VHMyosrhRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Since cte is already an optimization fence, you can go further and make it
temporary table.
Create table;analyze;select should make optimizer's work much easier.
18 лют. 2013 18:45, "John Lumby" <johnlumby(at)hotmail(dot)com> напис.

>
> On 2012-10-09 23:09:21
> Tom Lane wrote:
> >
>
>
> > re subject Why am I getting great/terrible estimates with these CTE
> queries?
> > You're assuming the case where the estimate is better is better for a
> > reason ... but it's only better as a result of blind dumb luck. The
> > outer-level query planner doesn't know anything about the CTE's output
> > except the estimated number of rows --- in particular, it doesn't drill
> > down to find any statistics about the join column.
> >
>
> I am also struggling with a problem involving CTEs although in my case
> it is caused by huge *under*-estimation of cardinality rather then
> *over*-estimation.
> The statement is quite complex and the problem arises because there is a
> chain of
> RECURSIVE CTEs each defined as a query involving an earlier CTE and more
> tables.
> Eventually there is no hope for making a good cardinality estimate.
>
> One CTE in particular has a cardinality estimate of 1 (I guess the actual
> estimate is nearer zero and rounded up) but actual count is over 100000.
> The planner puts this CTE as inner of a nested loop accessed by simple
> linear CTE scan
> and the full query then takes over 20 minutes.
>
> -> Nested Loop (cost=0.00..0.06 rows=1 width=588) (actual
> time=2340.421..1201593.856 rows=105984 loops=1)
> Join Filter: ((winnum.subnet_id = binoptasc.subnet_id) AND
> (winnum.option_code = binoptasc.option_code) AND
> ((winnum.option_discriminator)::text =
> (binoptasc.option_discriminator)::text) AND (winnum.net_rel_level =
> binoptasc.net_rel_level))
> Rows Removed by Join Filter: 7001612448
> Buffers: shared hit=2290941
> -> CTE Scan on winning_option_nums winnum (cost=0.00..0.02
> rows=1 width=536) (actual time=2338.422..2543.684 rows=62904 loops=1)
> Buffers: shared hit=2290941
> -> CTE Scan on subnet_inhrt_options_asc binoptasc
> (cost=0.00..0.02 rows=1 width=584) (actual time=0.000..9.728 rows=111308
> loops=62904)
>
> Whereas, (by altering various statistics to be very wrong) the entire
> query runs in 21 seconds.
>
> There have been several debates about how to address situations like this
> where
> no practical non-query-specific statistics-gathering scheme can ever hope
> to
> gather enough statistics to model the later derived tables. E.g. the
> frowned-on
> SELECTIVITY clause and ideas for query-specific statistics.
>
> Meanwhile, I have one other suggestion aimed specifically at problematic
> CTEs:
> Would it be reasonable to provide a new Planner Configuration option :
>
> enable_nestloop_cte_inner (boolean)
> Enables or disables the query planner's use of nested-loop join plans in
> which a CTE is the inner.
> It is impossible to suppress such nested-loop joins entirely,
> but turning this variable off discourages the planner from using one
> if there are other methods available, such as sorting the CTE for
> merge-join
> or hashing it for hash-join.
> The default is on.
>
> John
>
>
>
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message John Lumby 2013-02-18 23:26:23 Re: slow query plans caused by under-estimation of CTE cardinality
Previous Message John Lumby 2013-02-18 16:45:01 slow query plans caused by under-estimation of CTE cardinality