Quick Links

Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time

From:	Clem Dickey <dickeycl(at)us(dot)ibm(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time
Date:	2011-07-09 01:33:15
Message-ID:	iv8b4r$2q83$1@news.hub.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

On 07/06/2011 05:59 PM, Clem Dickey wrote:
> On 07/05/2011 07:26 PM, Clem Dickey wrote:
>
>> Column | Type | Modifiers
>> --------+---------+-----------
>> y | integer | not null
>> x | integer | not null
>> k | integer | not null
>> j | integer | not null
>> z | integer | not null
>> Indexes:
>> "t_pkey" PRIMARY KEY, btree (j, k, x, y, z)
>>
>> The table population, in pseudocode, is this:
>> for x in 0..9
>> for y in 0..9999
>> for z in 0..29
>> INSERT INTO t VALUES(y,x,0,0,z)
>
>> The query is:
>>
>> SELECT *
>> FROM (
>> SELECT * FROM t GROUP BY j,k,x,z,y
>> ) AS f
>> NATURAL JOIN t;
>
> The EXPLAIN ANALYZE output is http://explain.depesz.com/s/KGk
>
> Notes on the analysis:
> 1. I see that the planner estimates that GROUP BY will reduce 300K rows
> to 30K, a bit odd because every row which the planner could examine is
> in a unique group.

GROUP BY assumes an average 10-element grouping in cases with more than
one GROUP BY expression. Wrong for this test case, but probably OK in
general.

> 2. The JOIN is expected to produce one row. I'm not sure how the planner
> came up with that estimate.

The winning Join (merge join) had a very poor estimate of its
performance. Like a low-ball contract bid. :-)

a. The Join cost estimators could have been given more information

The functions which estimate JOIN selectivity (e.g. the chance that
tuples will match in an equijoin, for instance) use data produced by
ANALYZE. But the SELECT .. GROUP BY does not propagate ANALYZE data from
the columns of its input relation to its output relation. That is too
bad, because the column value statistics (number of unique values) would
have improved selectivity estimates for all three join plans (merge
join, nested loop, and hash join).

b. the Merge Join cost estimator did a poor job with the data it was given:

In function eqjoinsel_inner there are two cases (1) ANALYZE data is
available for both sides of the join and (2) ANALYZE data is missing for
one or both sides. Due to the GROUP BY processing described above,
ANALYZE data was available for "t" but not for "SELECT * FROM t GROUP BY
...".

The logic in that case is "use the column with the most distinct values"
to estimate selectivity. The default number of distinct values for a
column with no data (DEFAULT_NUM_DISTINCT) is 200. In my join the number
of values was:

col in GROUP BY in table t
j 200 1
k 200 1
x 200 10
y 200 1000
z 200 30

In 4 of the 5 columns the default value had more distinct values, and
the combined selectivity (chance that two arbitrary rows would have a
join match) was (1/200)^4 * 1/1000. Very small. The error is, IMO, that
the code does not distinguish known numbers from default numbers. A
comment in the code acknowledges this:

"XXX Can we be smarter if we have an MCV list for just one side?"

But it concludes

"It seems that if we assume equal distribution for the other side, we
end up with the same answer anyway."

I don't think that is the case. Preferring a known value, where one
exists, would provide a better estimate of the actual range of the data.
Indeed, the var_eq_non_const in the same file (used by the nested loop
join estimator) does essentially that.

- Clem Dickey

In response to

Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time at 2011-07-07 00:59:19 from Clem Dickey

Responses

Re: Re: GROUP BY with reasonable timings in PLAN but unreasonable execution time at 2011-08-03 13:29:16 from Robert Haas

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Pavel Stehule	2011-07-09 03:39:14	Re: Slow query when using ORDER BY and LIMIT
Previous Message	Jeff Davis	2011-07-09 00:40:23	Re: execution time for first INSERT