Re: BUG #4462: Adding COUNT to query causes massive slowdown

Lists: pgsql-bugs
From: "Jussi Pakkanen" <jpakkane(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #4462: Adding COUNT to query causes massive slowdown
Date: 2008-10-09 09:14:45
Message-ID: 200810090914.m999EjOJ073923@wwwmaster.postgresql.org


The following bug has been logged online:

Bug reference: 4462
Logged by: Jussi Pakkanen
Email address: jpakkane(at)gmail(dot)com
PostgreSQL version: 8.3.3
Operating system: Ubuntu x86 8.04
Description: Adding COUNT to query causes massive slowdown
Details:

I have a table in the following format

code CHARACTER(9) NOT NULL
text VARCHAR(200)

I have built an INDEX on "code", VACUUMed and ANALYZEd the table.
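
For reference, the setup is roughly the following (the index name matches
the plans below):

```sql
-- Approximate reconstruction of the schema described above.
CREATE TABLE log (
    code CHARACTER(9) NOT NULL,
    text VARCHAR(200)
);
CREATE INDEX codeindex ON log (code);
VACUUM ANALYZE log;
```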

I have about 32 million rows and roughly 200 000 unique "code" elements.

I determine the unique codes using the following SQL query:

EXPLAIN SELECT DISTINCT code FROM log;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Unique  (cost=0.00..1384173.89 rows=6393 width=10)
   ->  Index Scan using codeindex on log  (cost=0.00..1303930.83 rows=32097224 width=10)
(2 rows)

This takes about 4 minutes (it's a slow machine) but pretty much works as
expected.

However, when I try to count the number of distinct codes, I get this:

EXPLAIN SELECT COUNT(DISTINCT code) FROM log;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Aggregate  (cost=100801488.30..100801488.31 rows=1 width=10)
   ->  Seq Scan on log  (cost=100000000.00..100721245.24 rows=32097224 width=10)
(2 rows)

For some reason PostgreSQL wants to do a full table scan in this case. This
takes over 11 minutes.

Transferring the result set from the first query to a Python client program
and counting the rows there takes about 4 seconds. This makes PostgreSQL
over 100 times slower than the naive implementation.

If I do the same COUNT using a view, it uses the index and is fast:

CREATE VIEW distcode AS SELECT DISTINCT code FROM log;

EXPLAIN SELECT COUNT(*) FROM distcode;
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Aggregate  (cost=1384253.81..1384253.82 rows=1 width=0)
   ->  Unique  (cost=0.00..1384173.89 rows=6393 width=10)
         ->  Index Scan using codeindex on log  (cost=0.00..1303930.83 rows=32097224 width=10)
(3 rows)

I tried setting enable_seqscan to off. It did not help.

Due to reasons beyond my control, I can't test version 8.3.4 until the next
Ubuntu is released (at the end of this month).


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jussi Pakkanen <jpakkane(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4462: Adding COUNT to query causes massive slowdown
Date: 2008-10-09 12:05:10
Message-ID: 48EDF376.80801@gmx.net

Jussi Pakkanen wrote:
> However, when I try to count the number of distinct codes, I get this:
>
>
> EXPLAIN SELECT COUNT(DISTINCT code) FROM log;
>                                     QUERY PLAN
> ----------------------------------------------------------------------------------
>  Aggregate  (cost=100801488.30..100801488.31 rows=1 width=10)
>    ->  Seq Scan on log  (cost=100000000.00..100721245.24 rows=32097224 width=10)
> (2 rows)

This looks like you have one of the enable_${plantype} parameters turned
off. 100000000 is the penalty cost that is added when a plan type is
turned off.
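
You can check which planner toggles are currently off like this:

```sql
-- List every enable_* planner setting and its current value.
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'enable%';
```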


From: "Jussi Pakkanen" <jpakkane(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4462: Adding COUNT to query causes massive slowdown
Date: 2008-10-09 14:51:48
Message-ID: 42d23b2e0810090751k255cde25iacd131d66069dce1@mail.gmail.com

On Thu, Oct 9, 2008 at 3:05 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> Jussi Pakkanen wrote:
>>
>> However, when I try to count the number of distinct codes, I get this:
>>
>>
>> EXPLAIN SELECT COUNT(DISTINCT code) FROM log;
>>                                     QUERY PLAN
>> ----------------------------------------------------------------------------------
>>  Aggregate  (cost=100801488.30..100801488.31 rows=1 width=10)
>>    ->  Seq Scan on log  (cost=100000000.00..100721245.24 rows=32097224 width=10)
>> (2 rows)
>
> This looks like you have one of the enable_${plantype} parameters turned
> off. 100000000 is the penalty cost that is added when a plan type is
> turned off.

This was caused by enable_seqscan. When I set it to 'on', the penalty
disappears but it still does the full table scan.

Given that PostgreSQL does the scan even with the huge seqscan
penalty, I can think of only two different causes:

1) some sort of a bug in the query analyzer
2) SELECT COUNT(DISTINCT x) for some reason requires information that
is not available in the index. The only one I could think of would be
NULL values, but the 'code' field is defined as 'NOT NULL'.

Also, I had a small error in my original bug report. There are 2
million distinct values of 'code', rather than 200 000.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jussi Pakkanen" <jpakkane(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4462: Adding COUNT to query causes massive slowdown
Date: 2008-10-09 15:55:10
Message-ID: 16635.1223567710@sss.pgh.pa.us

"Jussi Pakkanen" <jpakkane(at)gmail(dot)com> writes:
> Given that PostgreSQL does the scan even with the huge seqscan
> penalty, I can think of only two different causes:
> 1) some sort of a bug in the query analyzer
> 2) SELECT COUNT(DISTINCT x) for some reason requires information that
> is not available in the index.

Try (3) COUNT(DISTINCT x) ... or any DISTINCT aggregate for that matter
... is implemented by a sort-and-uniq step inside the aggregate function
itself. You can't see it in the plan.

I wouldn't actually think that this approach would be slower than an
indexscan, btw, unless maybe the index were very nearly correlated with
physical order --- but that would make the sort more efficient, too.
Perhaps you need to raise work_mem enough to allow the sort to take
place without spilling to disk? (Turning on trace_sort should let you
see what's happening there.)
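
For example (the work_mem value below is only illustrative; pick one that
fits your machine):

```sql
SET trace_sort = on;     -- sort progress is reported in the server log
SET work_mem = '256MB';  -- illustrative value, not a recommendation
EXPLAIN ANALYZE SELECT COUNT(DISTINCT code) FROM log;
```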

regards, tom lane


From: "Jussi Pakkanen" <jpakkane(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4462: Adding COUNT to query causes massive slowdown
Date: 2008-10-10 19:55:10
Message-ID: 42d23b2e0810101255p59d5d1eyce79a0c917bd72ed@mail.gmail.com

On Thu, Oct 9, 2008 at 6:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>> Given that PostgreSQL does the scan even with the huge seqscan
>> penalty, I can think of only two different causes:
>> 1) some sort of a bug in the query analyzer
>> 2) SELECT COUNT(DISTINCT x) for some reason requires information that
>> is not available in the index.
>
> Try (3) COUNT(DISTINCT x) ... or any DISTINCT aggregate for that matter
> ... is implemented by a sort-and-uniq step inside the aggregate function
> itself. You can't see it in the plan.

Does this mean that the sort-and-uniq will always lead to a full table
scan? It would seem so, because I could not force PostgreSQL to use
the index even with enable_seqscan set to off. I understand that in
some cases the table scan is faster, but it is very strange if the
query optimizer refuses to use the index no matter what.

A quick calculation says that the table scan needs to access 32
million elements (and sort them, and uniq them). An index scan needs
only 2 million (or 4 million I suppose, if you account for the higher
levels in the B-tree). That is an order of magnitude difference in
disk reads alone.

> I wouldn't actually think that this approach would be slower than an
> indexscan, btw, unless maybe the index were very nearly correlated with
> physical order --- but that would make the sort more efficient, too.

I have ordered the data table according to the "code" column by first
importing it to a temp table and then building the actual "log" table
with CREATE TABLE log AS SELECT ... ORDER BY code;.

In this case the index is faster. A lot faster. SELECT DISTINCT using
the index and counting the rows takes 4 minutes. Building a view with
distinct "code"s and counting that also takes 4 minutes. But the
aggregate scan takes 11 minutes. Worst of all, COUNT(DISTINCT ...) is
the way all SQL books tell you to do these kinds of queries.

> Perhaps you need to raise work_mem enough to allow the sort to take
> place without spilling to disk? (Turning on trace_sort should let you
> see what's happening there.)

But isn't this just hiding the root cause? Having working sets larger
than available memory is not that uncommon.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jussi Pakkanen" <jpakkane(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4462: Adding COUNT to query causes massive slowdown
Date: 2008-10-10 21:07:00
Message-ID: 2863.1223672820@sss.pgh.pa.us

"Jussi Pakkanen" <jpakkane(at)gmail(dot)com> writes:
> On Thu, Oct 9, 2008 at 6:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Try (3) COUNT(DISTINCT x) ... or any DISTINCT aggregate for that matter
>> ... is implemented by a sort-and-uniq step inside the aggregate function
>> itself. You can't see it in the plan.

> Does this mean that the sort-and-uniq will always lead to a full table
> scan?

The sort-and-uniq doesn't care where the data came from. But if we have
to feed it all rows of the table, as we do here, we're going to use a
seqscan. An indexscan can never beat a seqscan for retrieving the whole
table.

> A quick calculation says that the table scan needs to access 32
> million elements (and sort them, and uniq them). An index scan needs
> only 2 million (or 4 million I suppose, if you account for the higher
> levels in the B-tree).

You have a fundamental misunderstanding of how Postgres indexes work.
It is never possible to retrieve data without consulting the table too,
because indexes do not store transaction visibility information.

regards, tom lane


From: "Jussi Pakkanen" <jpakkane(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4462: Adding COUNT to query causes massive slowdown
Date: 2008-10-15 11:02:56
Message-ID: 42d23b2e0810150402r529fbafev70f5b13ea4e333c8@mail.gmail.com

On Sat, Oct 11, 2008 at 12:07 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> The sort-and-uniq doesn't care where the data came from. But if we have
> to feed it all rows of the table, as we do here, we're going to use a
> seqscan. An indexscan can never beat a seqscan for retrieving the whole
> table.

Apologies if you get this more than once, but it seems that my first
mail got lost in the moderation queue somewhere.

Anyway, in this case we should not need to scan the entire table.
There are 32 million rows and 2 million unique 'code's. Thus we should
be able to just walk the index (for a list of potential unique
elements) and then check one of the roughly 15 rows pointed to by the
index leaf nodes to get the visibility info.

To prove that PostgreSQL does not need to do a full table scan and
that it is faster to use the index, I simplified the problem to the
following test case.

From an SQL point of view, the following two queries produce an
identical result:

SELECT COUNT(DISTINCT code) FROM log;
SELECT COUNT(*) FROM (SELECT DISTINCT code FROM log) AS distincttable;

(I don't know about ACID requirements all that much, but they should
be identical for both AFAICS.)
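
A toy example makes the equivalence concrete (hypothetical three-row
table; both counting styles must agree):

```sql
-- Hypothetical miniature version of the log table.
CREATE TEMP TABLE minilog (code CHARACTER(9) NOT NULL);
INSERT INTO minilog VALUES ('a'), ('a'), ('b');

SELECT COUNT(DISTINCT code) FROM minilog;                       -- returns 2
SELECT COUNT(*) FROM (SELECT DISTINCT code FROM minilog) AS d;  -- returns 2
```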

Therefore, an optimal query planner would reduce both to the exact same
basic operations. PostgreSQL does NOT do this. It does a full table
scan for the first one and an index scan for the second one. The table
scan is over two times slower.

If a full table scan is absolutely necessary, then the bug is that the
second query uses the index (and thus may give incorrect results).

If an index scan is sufficient, the bug is that the query planner does
not even consider the index for the first query, which leads to a big
performance penalty on an extremely common operation.