Re: The planner chooses seqscan+sort when there is an

Lists: pgsql-general
From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: The planner chooses seqscan+sort when there is an index on the sort column
Date: 2006-05-03 15:37:32
Message-ID: 1146670652.14093.171.camel@coppola.muc.ecircle.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi all,

I wonder why this happens:

- postgres: 8.1.3
- the table has ~200 million rows;
- there is a primary key on (col_1, col_2);
- the table was ANALYZEd;
- the planner chooses seqscan+sort for the following query even with
enable_seqscan=off:

select * from table order by col_1;

Isn't it supposed to choose the index scan at least when
enable_seqscan=off ? Even if it is indeed not faster to do the index
scan than seqscan+sort.

The actual plan looks like (names changed):

db=# set enable_seqscan = off;
SET
db=# explain select * from table order by col_1;
QUERY PLAN
-----------------------------------------------------------------------------------------
Sort (cost=165585198.70..166102386.06 rows=206874944 width=60)
Sort Key: col_1
-> Seq Scan on table (cost=100000000.00..104552091.44
rows=206874944 width=60)
(3 rows)

db=# \d table
Table "public.table"
Column | Type |
Modifiers
-----------------+-----------------------------+----------------------------------------------------
col_1 | bigint | not null
col_2 | bigint | not null
...
Indexes:
"pk_table" PRIMARY KEY, btree (col_1, col_2)
...

Cheers,
Csaba.


From: "John D(dot) Burger" <john(at)mitre(dot)org>
To: PostgreSQL general <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an index on the sort column
Date: 2006-05-03 15:48:54
Message-ID: 1578e3296e30a207a72f0d8ed613f4a3@mitre.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Csaba Nagy wrote:

> select * from table order by col_1;
>
> Isn't it supposed to choose the index scan at least when
> enable_seqscan=off ? Even if it is indeed not faster to do the index
> scan than seqscan+sort.

I think because you've asked for every row, it's going to have to scan
the whole table anyway, to determine MVCC "liveness" of the rows
(sorry, dunno what the correct word is).

- John Burger
MITRE


From: Andreas Kretschmer <akretschmer(at)spamfence(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: The planner chooses seqscan+sort when there is an index on the sort column
Date: 2006-05-03 15:54:33
Message-ID: 20060503155432.GA18128@KanotixBox
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Csaba Nagy <nagy(at)ecircle-ag(dot)com> schrieb:
> select * from table order by col_1;

Without a WHERE you get the whole table.

A Index-Scan is, in this case, expensive.

HTH, Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknow)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°


From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: "John D(dot) Burger" <john(at)mitre(dot)org>
Cc: Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 15:57:58
Message-ID: 1146671878.14093.176.camel@coppola.muc.ecircle.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 2006-05-03 at 17:48, John D. Burger wrote:
> Csaba Nagy wrote:
>
> > select * from table order by col_1;
> >
> > Isn't it supposed to choose the index scan at least when
> > enable_seqscan=off ? Even if it is indeed not faster to do the index
> > scan than seqscan+sort.
>
> I think because you've asked for every row, it's going to have to scan
> the whole table anyway, to determine MVCC "liveness" of the rows
> (sorry, dunno what the correct word is).

But I also asked for _ordered_ results, which the seq scan is not
covering, but the index does... and I specifically disabled sequential
scan. That means the planner is not even considering the primary key
index, and I would like to know why...

Actually this is a problem for me in a more complex query, which also
needs this table ordered by that column, and it results in the same plan
fragment with sequential scan + sort.

Thanks,
Csaba.


From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: Andreas Kretschmer <akretschmer(at)spamfence(dot)net>
Cc: Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 16:01:10
Message-ID: 1146672070.14093.180.camel@coppola.muc.ecircle.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> Without a WHERE you get the whole table.
>
> A Index-Scan is, in this case, expensive.

I know that, and I would agree if I wouldn't have asked for _ordered_
result, and wouldn't have turned enable_seqscan=off.

In these conditions I would have expected an index scan, even if it is
more expensive than the sequential scan + sort. Actually I wanted to see
how expensive it thinks it is, but I can't get to that plan at all.

Thanks,
Csaba.


From: "John D(dot) Burger" <john(at)mitre(dot)org>
To: PostgreSQL general <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 16:33:02
Message-ID: 1faa2678eba0d4703f1f74a5c4563407@mitre.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> But I also asked for _ordered_ results, which the seq scan is not
> covering, but the index does... and I specifically disabled sequential
> scan.

Docs say:

> Enables or disables the query planner's use of sequential scan plan
> types. It's not possible to suppress sequential scans entirely, but
> turning this variable off discourages the planner from using one if
> there are other methods available.

Note the second sentence. Again, I think it will have to scan the
whole table anyway, because that's what you've asked for, and given
that, enable_seqscan=off doesn't apply.

- John Burger
MITRE


From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: "John D(dot) Burger" <john(at)mitre(dot)org>
Cc: Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 16:42:00
Message-ID: 1146674520.14093.185.camel@coppola.muc.ecircle.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> Docs say:
>
> > Enables or disables the query planner's use of sequential scan plan
> > types. It's not possible to suppress sequential scans entirely, but
> > turning this variable off discourages the planner from using one if
> > there are other methods available.
>
> Note the second sentence. Again, I think it will have to scan the
> whole table anyway, because that's what you've asked for, and given
> that, enable_seqscan=off doesn't apply.

OK, maybe that's the point... the "cost bust" given to the sequential
scan by enable_seqscan=off is not enough in this case to exceed the cost
of the index scan ? The table is quite big, might be possible. I still
wonder why would be seqscan+sort faster than index scan... the sort will
for sure have to write to disk too given the size of the table...

Cheers,
Csaba.


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
Cc: "John D(dot) Burger" <john(at)mitre(dot)org>, Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 16:44:50
Message-ID: 20060503164450.GD27354@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, May 03, 2006 at 06:42:00PM +0200, Csaba Nagy wrote:
> OK, maybe that's the point... the "cost bust" given to the sequential
> scan by enable_seqscan=off is not enough in this case to exceed the cost
> of the index scan ? The table is quite big, might be possible. I still
> wonder why would be seqscan+sort faster than index scan... the sort will
> for sure have to write to disk too given the size of the table...

Have you tuned the values of effective_cache_size and random_page_cost?
These have significant effects on index scans.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
Cc: "John D(dot) Burger" <john(at)mitre(dot)org>, Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 18:20:55
Message-ID: 4458F487.80404@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Csaba Nagy wrote:
>> Docs say:
>>
>>> Enables or disables the query planner's use of sequential scan plan
>>> types. It's not possible to suppress sequential scans entirely, but
>>> turning this variable off discourages the planner from using one if
>>> there are other methods available.
>> Note the second sentence. Again, I think it will have to scan the
>> whole table anyway, because that's what you've asked for, and given
>> that, enable_seqscan=off doesn't apply.
>
> OK, maybe that's the point... the "cost bust" given to the sequential
> scan by enable_seqscan=off is not enough in this case to exceed the cost
> of the index scan ? The table is quite big, might be possible. I still
> wonder why would be seqscan+sort faster than index scan... the sort will
> for sure have to write to disk too given the size of the table...

When using an indexscan, postgres will read the actual rows in
index-order, rathen then in the order they appear on-disk.
For 200 million rows this means doing at least 200 million
disk seeks. Assuming that each seek takes just 1ms, thats
still amount to 200.000 seconds.

greetings, Florian Pflug


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
Cc: "John D(dot) Burger" <john(at)mitre(dot)org>, Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 18:34:08
Message-ID: 18195.1146681248@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Csaba Nagy <nagy(at)ecircle-ag(dot)com> writes:
> OK, maybe that's the point... the "cost bust" given to the sequential
> scan by enable_seqscan=off is not enough in this case to exceed the cost
> of the index scan ?

Looks that way to me. You could try setting enable_sort off as well,
which will penalize the seqscan+sort plan another 100million cost units.
And maybe try reducing random_page_cost to make the indexscan look
cheaper. However, if there's a 100million delta between the two plans,
I suspect you really really don't want the indexscan anyway ;-)

regards, tom lane


From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Csaba Nagy <nagy(at)ecircle-ag(dot)com>, "John D(dot) Burger" <john(at)mitre(dot)org>, Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 18:36:36
Message-ID: 1146681396.22037.42.camel@state.g2switchworks.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 2006-05-03 at 13:34, Tom Lane wrote:
> Csaba Nagy <nagy(at)ecircle-ag(dot)com> writes:
> > OK, maybe that's the point... the "cost bust" given to the sequential
> > scan by enable_seqscan=off is not enough in this case to exceed the cost
> > of the index scan ?
>
> Looks that way to me. You could try setting enable_sort off as well,
> which will penalize the seqscan+sort plan another 100million cost units.
> And maybe try reducing random_page_cost to make the indexscan look
> cheaper. However, if there's a 100million delta between the two plans,
> I suspect you really really don't want the indexscan anyway ;-)

I imagine the followup post:

So, I've had this query running for six weeks now, and...


From: "A(dot) Kretschmer" <andreas(dot)kretschmer(at)schollglas(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-03 18:45:25
Message-ID: 20060503184525.GA14628@webserv.wug-glas.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

am 03.05.2006, um 20:20:55 +0200 mailte Florian G. Pflug folgendes:
> >of the index scan ? The table is quite big, might be possible. I still
> >wonder why would be seqscan+sort faster than index scan... the sort will
> >for sure have to write to disk too given the size of the table...
>
> When using an indexscan, postgres will read the actual rows in index-order,
> rathen then in the order they appear on-disk.
> For 200 million rows this means doing at least 200 million
> disk seeks. Assuming that each seek takes just 1ms, thats
> still amount to 200.000 seconds.

Yepp, it is much cheaper to read the table seq and order later.

Andreas
--
Andreas Kretschmer (Kontakt: siehe Header)
Heynitz: 035242/47215, D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
=== Schollglas Unternehmensgruppe ===


From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
Cc: Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-04 08:09:33
Message-ID: 1146730172.14093.191.camel@coppola.muc.ecircle.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> > Looks that way to me. You could try setting enable_sort off as well,
> > which will penalize the seqscan+sort plan another 100million cost units.
> > And maybe try reducing random_page_cost to make the indexscan look
> > cheaper. However, if there's a 100million delta between the two plans,
> > I suspect you really really don't want the indexscan anyway ;-)
>
> I imagine the followup post:
>
> So, I've had this query running for six weeks now, and...

Well, I guess that's it then... I will let the query run with the
seqscan+sort. It will still run 1-2 days, yesterday I stopped it after 6
hours ;-) Actually it would be nice to have some kind of feedback on
what is it doing so I can estimate how long it will still take... cause
I'm not sure the seqscan+sort won't run itself for 6 weeks...

Thanks,
Csaba.


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
Cc: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>, Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-04 08:18:17
Message-ID: 20060504081817.GG4752@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, May 04, 2006 at 10:09:33AM +0200, Csaba Nagy wrote:
> Well, I guess that's it then... I will let the query run with the
> seqscan+sort. It will still run 1-2 days, yesterday I stopped it after 6
> hours ;-) Actually it would be nice to have some kind of feedback on
> what is it doing so I can estimate how long it will still take... cause
> I'm not sure the seqscan+sort won't run itself for 6 weeks...

There are a number of ways you can (indirectly) see how far it has got.
If it's in the first phase (the seq scan), by looking at which file it
has open you should be able to see how far along the table it is. Once
it's in the sort stage you should be able to see from the tape files
approimately how far it is. As for the number of sort stages required,
that depends on the size of the data to be sorted, but you should be
able to estimate that...

At least, you should be able to tell the difference between a 6 hour and
6 day query.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


From: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Postgres general mailing list <pgsql-general(at)postgresql(dot)org>
Subject: Re: The planner chooses seqscan+sort when there is an
Date: 2006-05-04 08:36:41
Message-ID: 1146731801.14093.193.camel@coppola.muc.ecircle.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> There are a number of ways you can (indirectly) see how far it has got.
> If it's in the first phase (the seq scan), by looking at which file it
> has open you should be able to see how far along the table it is. Once
> it's in the sort stage you should be able to see from the tape files
> approimately how far it is. As for the number of sort stages required,
> that depends on the size of the data to be sorted, but you should be
> able to estimate that...
>
> At least, you should be able to tell the difference between a 6 hour and
> 6 day query.

Thanks for the hints, I'll try to look up on the details...

Cheers,
Csaba.