why Sequencial Scan when selecting on primary key of table?

Lists: pgsql-performance
From: "Nikolaus Dilger" <nikolaus(at)dilger(dot)cc>
To: yusuf0478(at)netscape(dot)net
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Enabling and Disabling Sequencial Scan
Date: 2003-05-31 15:07:15
Message-ID: 20030531080716.20944.h004.c001.wm@mail.dilger.cc.criticalpath.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

David,

I say go ahead and use it since you get a significant
performance gain. This is a special case where you
know more about your data than the planer does with
general system wide settings. In Oracle you could use
"hints". Since there are no hints in PostgreSQL
disabling and reenabling an option just before and
after a query has the same effect.

Regards,
Nikolaus

On Fri, 30 May 2003 16:33:07 -0400, Yusuf wrote:

>
> In the application, that I'm working on, I have a
query
> that'll be a lot
> 60% faster if I disable sequential scan forcing it to
> you my index.
>
> Is it bad practice to disable sequential scan ( set
> enable_seqscan=false), run my query then enable
> sequential scan,
> whenever I'm running this query? Why?
>
> Thanks in advance
>
> - David Wendy
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo(at)postgresql(dot)org


From: "Brian Tarbox" <btarbox(at)theworld(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: why Sequencial Scan when selecting on primary key of table?
Date: 2003-05-31 16:30:40
Message-ID: 004301c32791$f983c710$01000001@trouble
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

I have a simple table with a dozen integer fields and a primary key.

When I say "explain select * from Patient where Patient_primary_key = 100"

I get sequential scan.

I've just converted my application from MySQL and am seeing everything run
about 3X slower. What do I have to do to get postgres to use indexes?

Brian Tarbox


From: Andrew Sullivan <andrew(at)libertyrms(dot)info>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: why Sequencial Scan when selecting on primary key of table?
Date: 2003-05-31 17:02:08
Message-ID: 20030531170205.GD22469@libertyrms.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Sat, May 31, 2003 at 12:30:40PM -0400, Brian Tarbox wrote:
> I have a simple table with a dozen integer fields and a primary key.
>
> When I say "explain select * from Patient where Patient_primary_key = 100"
>
> I get sequential scan.
>
> I've just converted my application from MySQL and am seeing everything run
> about 3X slower. What do I have to do to get postgres to use indexes?

Usual questions: have you vacuumed? EXPLAIN ANALYSE output, schema,
&c.

A

--
----
Andrew Sullivan 204-4141 Yonge Street
Liberty RMS Toronto, Ontario Canada
<andrew(at)libertyrms(dot)info> M2P 2A8
+1 416 646 3304 x110


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brian Tarbox" <btarbox(at)theworld(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: why Sequencial Scan when selecting on primary key of table?
Date: 2003-05-31 17:13:47
Message-ID: 21073.1054401227@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

"Brian Tarbox" <btarbox(at)theworld(dot)com> writes:
> When I say "explain select * from Patient where Patient_primary_key = 100"
> I get sequential scan.

Perhaps Patient_primary_key is not an integer field? If not, you need
to cast the constant 100 to the right type. Or write '100' with
single quotes around it, which leaves Postgres to choose the constant's
datatype. (Yeah, I know, it's a pain in the neck. We've had a lot of
discussions about how to fix this without breaking datatype extensibility;
no luck so far.)

regards, tom lane


From: "Brian Tarbox" <btarbox(at)theworld(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: why Sequencial Scan when selecting on primary key of table?
Date: 2003-05-31 17:45:50
Message-ID: 000001c3279c$79dc2d80$01000001@trouble
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

The primary key field is an integer and I have performed vacuum analyse but
that does not seem to change anything.

I've also heard that postgres will not indexes when JOINing tables. Can
that really be true??

Brian

-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org]On Behalf Of Tom Lane
Sent: Saturday, May 31, 2003 1:14 PM
To: Brian Tarbox
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] why Sequencial Scan when selecting on primary key
of table?

"Brian Tarbox" <btarbox(at)theworld(dot)com> writes:
> When I say "explain select * from Patient where Patient_primary_key = 100"
> I get sequential scan.

Perhaps Patient_primary_key is not an integer field? If not, you need
to cast the constant 100 to the right type. Or write '100' with
single quotes around it, which leaves Postgres to choose the constant's
datatype. (Yeah, I know, it's a pain in the neck. We've had a lot of
discussions about how to fix this without breaking datatype extensibility;
no luck so far.)

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org


From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Brian Tarbox <btarbox(at)theworld(dot)com>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: why Sequencial Scan when selecting on primary key of table?
Date: 2003-05-31 17:51:11
Message-ID: 1054403470.11968.34.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Sat, 2003-05-31 at 13:13, Tom Lane wrote:
> "Brian Tarbox" <btarbox(at)theworld(dot)com> writes:
> > When I say "explain select * from Patient where Patient_primary_key = 100"
> > I get sequential scan.
>
> Perhaps Patient_primary_key is not an integer field? If not, you need
> to cast the constant 100 to the right type. Or write '100' with
> single quotes around it, which leaves Postgres to choose the constant's
> datatype.

Out of curiosity, why don't we confirm the unquoted value is an integer,
numeric, etc, then change it into type 'unknown'? From that point
forward it would be treated like it's quoted counterpart.

Is this noticeably slower or am I missing something?

--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brian Tarbox" <btarbox(at)theworld(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: why Sequencial Scan when selecting on primary key of table?
Date: 2003-05-31 17:55:01
Message-ID: 21289.1054403701@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

"Brian Tarbox" <btarbox(at)theworld(dot)com> writes:
> The primary key field is an integer and I have performed vacuum analyse but
> that does not seem to change anything.

Hm. So how big is the table, exactly? On small tables a seqscan will
be preferred because the extra I/O to examine the index costs more than
the CPU to examine all the tuples on a disk page.

> I've also heard that postgres will not indexes when JOINing tables. Can
> that really be true??

We have some join methods that like indexes and we have some that find
no benefit in 'em. Again, testing on toy-size tables is not a good
guide to what will happen on larger tables.

regards, tom lane


From: Andrew Sullivan <andrew(at)libertyrms(dot)info>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: why Sequencial Scan when selecting on primary key of table?
Date: 2003-05-31 18:02:35
Message-ID: 20030531180235.GF22469@libertyrms.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Sat, May 31, 2003 at 01:45:50PM -0400, Brian Tarbox wrote:
> The primary key field is an integer and I have performed vacuum analyse but
> that does not seem to change anything.

int4? int8? int2? Makes a difference. Please post the results of
EXPLAIN ANALYSE on the query you're having trouble with, and someone
may be able to help you. (You'll need to show us the table, too.)

> I've also heard that postgres will not indexes when JOINing tables. Can
> that really be true??

No.

A

--
----
Andrew Sullivan 204-4141 Yonge Street
Liberty RMS Toronto, Ontario Canada
<andrew(at)libertyrms(dot)info> M2P 2A8
+1 416 646 3304 x110


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Rod Taylor <rbt(at)rbt(dot)ca>
Cc: Brian Tarbox <btarbox(at)theworld(dot)com>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: why Sequencial Scan when selecting on primary key of table?
Date: 2003-05-31 18:12:30
Message-ID: 21406.1054404750@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Rod Taylor <rbt(at)rbt(dot)ca> writes:
> Out of curiosity, why don't we confirm the unquoted value is an integer,
> numeric, etc, then change it into type 'unknown'?

UNKNOWNNUMERIC is one of the ideas that's been proposed, but it's not
clear to me that it is better than other alternatives. In particular,
I don't like losing knowledge of the form and size of the constant.
Something like "WHERE int4col = 4.8" should yield FALSE, not "ERROR:
pg_atoi: unable to parse '4.8'" which is what you're likely to get with
a naive "unknown numeric type" approach. A perhaps-more-realistic
objection is that it only solves the problem for trivial "var = const"
cases. As soon as you look at even slightly more complicated
expressions, it stops doing much good.

I'm still of the opinion that the best solution in the long run is to
get rid of most of the cross-datatype numeric operators, but there are
pitfalls there too. See last thread on the issue:
http://archives.postgresql.org/pgsql-hackers/2002-11/msg00468.php

regards, tom lane


From: "Brian Tarbox" <btarbox(at)theworld(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: are views typically any faster/slower than equivilent joins?
Date: 2003-06-01 02:55:18
Message-ID: 001801c327e9$3c01a5b0$01000001@trouble
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

I am working with a highly normalized database. Most of my meaningful
queries involve joins from a primary table to 4-6 secondary tables.

Would my query performance be significantly faster/slower using a View as
opposed to a prepared query using join?

(Assume all join fields are ints, say 10,000 records in main table and a few
dozen records in each of the secondary tables).

Thank you.

Brian Tarbox


From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Brian Tarbox <btarbox(at)theworld(dot)com>
Cc: Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: are views typically any faster/slower than equivilent joins?
Date: 2003-06-01 04:02:39
Message-ID: 1054440158.11968.86.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Sat, 2003-05-31 at 22:55, Brian Tarbox wrote:
> I am working with a highly normalized database. Most of my meaningful
> queries involve joins from a primary table to 4-6 secondary tables.
>
> Would my query performance be significantly faster/slower using a View as
> opposed to a prepared query using join?

There are some corner cases where a view would be slower than a standard
query in 7.3 (bug fix / disabled optimization -- fixed right in 7.4),
but generally you can assume it will be about the same speed.

Some views such as unions will not be as fast as you would like, but
thats a general issue with PostgreSQLs inability to throw away selects
when it won't find results on one side of a union.

CREATE VIEW sales AS SELECT * FROM sales_archive_2002 UNION ALL SELECT *
FROM sales_current;

SELECT * FROM sales WHERE timestamp => CURRENT_TIMESTAMP - INTERVAL '1
day';

The above query would not be so quick.

--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc


From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Brian Tarbox <btarbox(at)theworld(dot)com>
Cc: Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: are views typically any faster/slower than equivilent joins?
Date: 2003-06-01 04:05:56
Message-ID: 1054440355.11968.90.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

> Would my query performance be significantly faster/slower using a View as
> opposed to a prepared query using join?

I missed this part. Views and prepared queries are not the same time.
Use of a view still needs to be optimized.

Prepared queries will run the optimization portion on the entire query
including the view segments of it. Think of a view as a MACRO.
Depending on the context of what surrounds it, the view may be executed
very differently.

--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc


From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Rod Taylor <rbt(at)rbt(dot)ca>
Cc: Brian Tarbox <btarbox(at)theworld(dot)com>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: are views typically any faster/slower than equivilent joins?
Date: 2003-06-01 05:43:37
Message-ID: 20030601054337.GD16259@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Sun, Jun 01, 2003 at 00:02:39 -0400,
Rod Taylor <rbt(at)rbt(dot)ca> wrote:
>
> Some views such as unions will not be as fast as you would like, but
> thats a general issue with PostgreSQLs inability to throw away selects
> when it won't find results on one side of a union.
>
> CREATE VIEW sales AS SELECT * FROM sales_archive_2002 UNION ALL SELECT *
> FROM sales_current;
>
>
> SELECT * FROM sales WHERE timestamp => CURRENT_TIMESTAMP - INTERVAL '1
> day';
>
> The above query would not be so quick.

I thought some work had been done on pushing where conditions down into
unions? If so the above wouldn't be too bad. It would still look at
the archive table, but it should return no rows relatively quickly
assuming an appropiate index exists.


From: Rod Taylor <rbt(at)rbt(dot)ca>
To: Bruno Wolff III <bruno(at)wolff(dot)to>
Cc: Brian Tarbox <btarbox(at)theworld(dot)com>, Postgresql Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: are views typically any faster/slower than
Date: 2003-06-01 11:56:15
Message-ID: 1054468575.11968.95.camel@jester
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Sun, 2003-06-01 at 01:43, Bruno Wolff III wrote:
> On Sun, Jun 01, 2003 at 00:02:39 -0400,
> Rod Taylor <rbt(at)rbt(dot)ca> wrote:
> >
> > Some views such as unions will not be as fast as you would like, but
> > thats a general issue with PostgreSQLs inability to throw away selects
> > when it won't find results on one side of a union.
> >
> > CREATE VIEW sales AS SELECT * FROM sales_archive_2002 UNION ALL SELECT *
> > FROM sales_current;
> >
> >
> > SELECT * FROM sales WHERE timestamp => CURRENT_TIMESTAMP - INTERVAL '1
> > day';
> >
> > The above query would not be so quick.
>
> I thought some work had been done on pushing where conditions down into
> unions? If so the above wouldn't be too bad. It would still look at
> the archive table, but it should return no rows relatively quickly
> assuming an appropiate index exists.

Certainly, if the index exists it won't be so bad (for any single
archive table). It's when the index doesn't exist or there are several
hundred archive tables then it starts to get a little worse.

Anyway, anyone doing the above in PostgreSQL should probably be looking
at partial indexes and merging the information back into a single table
again.

--
Rod Taylor <rbt(at)rbt(dot)ca>

PGP Key: http://www.rbt.ca/rbtpub.asc