Re: Query plan not using index for some reason.

Lists: pgsql-general
From: Jean-Christian Imbeault <jc(at)mega-bucks(dot)co(dot)jp>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Query plan not using index for some reason.
Date: 2002-10-01 15:56:10
Message-ID: 3D99C59A.6090503@mega-bucks.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

One of my SQL is is slow so I tried using EXPLAIN to find out why but
the query plan is gives seems bad ... it's not using indexes ...

The query is on two tables, both of which have indexes. When I check
EXPLAIN for the query without the OR clause the planner uses the index.
When I add the OR clause it uses a seq scan ...

Is the planner right in choosing a seq scan?

Here is the relevant data:

$ psql TMP -c "vacuum analyze"
VACUUM

$ psql TMP -c "explain select products.id as pid from
products,rel_genres_movies where maker_id='53' OR
(rel_genres_movies.minor_id='11' AND rel_genres_movies.prod_id=products.id)"
NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..8906651.40 rows=2677 width=40)
-> Seq Scan on products (cost=0.00..953.85 rows=14285 width=20)
-> Seq Scan on rel_genres_movies (cost=0.00..289.81 rows=16681
width=20)

EXPLAIN

#BUT ... removing either side of the OR clause gives an index scan ...

$ psql TMP -c "explain select products.id as pid from
products,rel_genres_movies where (rel_genres_movies.minor_id='11' AND
rel_genres_movies.prod_id=products.id)"
NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..975.45 rows=145 width=32)
-> Seq Scan on rel_genres_movies (cost=0.00..331.51 rows=145 width=16)
-> Index Scan using products_pkey on products (cost=0.00..4.43
rows=1 width=16)

EXPLAIN

$ psql TMP -c "explain select products.id as pid from
products,rel_genres_movies where maker_id='53'"
NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..7100.10 rows=257505 width=16)
-> Index Scan using idx_products_maker_id on products
(cost=0.00..51.25 rows=15 width=16)
-> Seq Scan on rel_genres_movies (cost=0.00..289.81 rows=16681 width=0)

EXPLAIN

Jc


From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Jean-Christian Imbeault <jc(at)mega-bucks(dot)co(dot)jp>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query plan not using index for some reason.
Date: 2002-10-01 16:25:39
Message-ID: 20021001091756.B94756-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, 2 Oct 2002, Jean-Christian Imbeault wrote:

> One of my SQL is is slow so I tried using EXPLAIN to find out why but
> the query plan is gives seems bad ... it's not using indexes ...
>
> The query is on two tables, both of which have indexes. When I check
> EXPLAIN for the query without the OR clause the planner uses the index.
> When I add the OR clause it uses a seq scan ...
>
> Is the planner right in choosing a seq scan?

Given it wants to use one index in a scan, probably, since I don't think
either of those indexes will help that full condition if I'm guessing the
schema correctly. I can't see an index on id helping when you also
need to get all the maker_id=53 rows and the one on maker_id doesn't
seem like it'd help the joining of the tables on id.

Hmm, maybe
select products.id as pid from products where maker_id='53'
union
select products.id as pid from products, rel_genres_movies where
(rel_genres_movies.minor_id='11' AND rel_genres_movies.prod_id=products.id)"
would run better (or union all if it's safe to have duplicates).

An index on id doesn't help since you also need to get
all the rows where maker_id is 53 and one on maker_id doesn't help join
ids.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jean-Christian Imbeault <jc(at)mega-bucks(dot)co(dot)jp>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query plan not using index for some reason.
Date: 2002-10-01 17:13:45
Message-ID: 5864.1033492425@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Jean-Christian Imbeault <jc(at)mega-bucks(dot)co(dot)jp> writes:
> $ psql TMP -c "explain select products.id as pid from
> products,rel_genres_movies where maker_id='53' OR
> (rel_genres_movies.minor_id='11' AND rel_genres_movies.prod_id=products.id)"

Is that *really* the query you want to run?

Rows having maker_id=53 will be joined to *every* row of the other
table. I'm suspecting you really wanted
where (maker_id='53' OR rel_genres_movies.minor_id='11') AND
rel_genres_movies.prod_id=products.id
so that the join behaves sensibly.

regards, tom lane


From: Jean-Christian Imbeault <jc(at)mega-bucks(dot)co(dot)jp>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query plan not using index for some reason.
Date: 2002-10-01 17:40:13
Message-ID: 3D99DDFD.1030703@mega-bucks.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane wrote:
>
>>$ psql TMP -c "explain select products.id as pid from
>>products,rel_genres_movies where maker_id='53' OR
>>(rel_genres_movies.minor_id='11' AND rel_genres_movies.prod_id=products.id)"
>
>
> Is that *really* the query you want to run?
>
> Rows having maker_id=53 will be joined to *every* row of the other
> table. I'm suspecting you really wanted
> where (maker_id='53' OR rel_genres_movies.minor_id='11') AND
> rel_genres_movies.prod_id=products.id
> so that the join behaves sensibly.

As you can see my SQL ability is poor ... you are of course absolutely
right ...

Which query would be most efficient, the one you gave or

select id from products where maker_id='53' UNION select prod_id from
rel_genres_movies where minor_id='11'

I think they are both equivalent but I am wondering which is most
efficient ...

Jc