Re: Searching union views not using indices

Lists: pgsql-performance
From: Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz>
To: pgsql-performance(at)postgresql(dot)org
Subject: Searching union views not using indices
Date: 2005-11-04 11:38:30
Message-ID: 436B4836.7030300@mall.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hello everyone.

We are facing a performance problem with views consisting of several
unioned tables. The simplified schema is as follows:

CREATE TABLE foo (
foo_object_id bigint,
link_id bigint,
somedata text,
PRIMARY KEY (foo_object_id) );

CREATE TABLE bar (
bar_object_id bigint,
link_id bigint,
otherdata real,
PRIMARY KEY (bar_object_id) );

There are actually five of such tables, all having two common attributes
*_object_id and link_id. All tables have indices on link_id, which is
very selective, close to unique. The *_object_id is unique within this
scope across all tables, but that's not important.

Then we have a view:

CREATE VIEW commonview AS
SELECT foo_object_id as object_id, link_id, 'It is in foo' as loc
FROM foo

UNION

SELECT bar_object_id as object_id, link_id, 'It is in bar' as loc
FROM bar

We commonly do this:

SELECT object_id FROM commonview WHERE link_id=1234567

The result is sequential scan on all tables, append, sort and then
filter scan on this whole thing. Which of course is slow as hell. We use
version 8.0.2.

And now the question: Is there a way to force the planner to push the
condition lower, so it will use the index? Or do you use some tricks in
this scenario? Thanks for your suggestions.

Bye.

--
Michal Táborský
CTO, Internet Mall, a.s.

Internet Mall - obchody, které si oblíbíte
<http://www.MALL.cz>


From: Richard Huxton <dev(at)archonet(dot)com>
To: Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Searching union views not using indices
Date: 2005-11-04 15:01:05
Message-ID: 436B77B1.7090602@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Michal Taborsky wrote:
...
> UNION
...
> The result is sequential scan on all tables, append, sort and then
> filter scan on this whole thing. Which of course is slow as hell. We use
> version 8.0.2.
>
> And now the question: Is there a way to force the planner to push the
> condition lower, so it will use the index? Or do you use some tricks in
> this scenario? Thanks for your suggestions.

Try "UNION ALL", since UNION is defined as removing duplicates, which
probably accounts for the sort.

--
Richard Huxton
Archonet Ltd


From: Michael Fuhr <mike(at)fuhr(dot)org>
To: Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Searching union views not using indices
Date: 2005-11-04 15:12:57
Message-ID: 20051104151257.GA89760@winnie.fuhr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Fri, Nov 04, 2005 at 12:38:30PM +0100, Michal Taborsky wrote:
> SELECT object_id FROM commonview WHERE link_id=1234567
>
> The result is sequential scan on all tables, append, sort and then
> filter scan on this whole thing. Which of course is slow as hell. We use
> version 8.0.2.

I couldn't duplicate this in 8.0.4; I don't know if anything's
changed since 8.0.2 that would affect the query plan. Could you
post the EXPLAIN ANALYZE output? It might also be useful to see
the output with enable_seqscan disabled.

Have the tables been vacuumed and analyzed recently?

--
Michael Fuhr


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Searching union views not using indices
Date: 2005-11-04 15:31:34
Message-ID: 13815.1131118294@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz> writes:
> We are facing a performance problem with views consisting of several
> unioned tables. The simplified schema is as follows:

Perhaps you should show us the real schema, because I cannot duplicate
your complaint on the toy case you show.

regression=# explain SELECT object_id FROM commonview WHERE link_id=1234567;
QUERY PLAN
------------------------------------------------------------------------------------------------
Subquery Scan commonview (cost=41.40..41.66 rows=13 width=8)
-> Unique (cost=41.40..41.53 rows=13 width=16)
-> Sort (cost=41.40..41.43 rows=13 width=16)
Sort Key: object_id, link_id, loc
-> Append (cost=0.00..41.16 rows=13 width=16)
-> Subquery Scan "*SELECT* 1" (cost=0.00..17.12 rows=5 width=16)
-> Index Scan using fooi on foo (cost=0.00..17.07 rows=5 width=16)
Index Cond: (link_id = 1234567)
-> Subquery Scan "*SELECT* 2" (cost=0.00..24.04 rows=8 width=16)
-> Index Scan using bari on bar (cost=0.00..23.96 rows=8 width=16)
Index Cond: (link_id = 1234567)
(11 rows)

(I had to add indexes on link_id to the example, of course.)

As noted by others, you probably want to be using UNION ALL not UNION,
but that's not the crux of the issue.

regards, tom lane


From: Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Searching union views not using indices
Date: 2005-11-04 15:55:59
Message-ID: 436B848F.5030402@mall.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Tom Lane napsal(a):
> Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz> writes:
>
>>We are facing a performance problem with views consisting of several
>>unioned tables. The simplified schema is as follows:
>
>
> Perhaps you should show us the real schema, because I cannot duplicate
> your complaint on the toy case you show.
> As noted by others, you probably want to be using UNION ALL not UNION,
> but that's not the crux of the issue.

OK. Mystery (sort of) solved. After you told me it works for you I had
to assume the problem was somewhere else. And, indeed, it was, though
it's not too obvious.

The two attributes are actually not of tybe bigint, but of type
"crm_object_id", which is created as follows:

CREATE DOMAIN "public"."crm_object_id" AS
bigint NULL;

Everything started working perfectly after I modified the view like this:

CREATE VIEW commonview AS
SELECT foo_object_id::bigint as object_id, link_id::bigint, 'It is in
foo' as loc FROM foo
UNION
SELECT bar_object_id::bigint as object_id, link_id::bigint, 'It is in
bar' as loc FROM bar

Not even modifying the select as this did not help:

explain SELECT object_id FROM commonview WHERE
link_id=1234567::crm_object_id;

Is this a bug or feature?

--
Michal Táborský
CTO, Internet Mall, a.s.

Internet Mall - obchody, které si oblíbíte
<http://www.MALL.cz>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Searching union views not using indices
Date: 2005-11-04 17:53:07
Message-ID: 15833.1131126787@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Michal Taborsky <michal(dot)taborsky(at)mall(dot)cz> writes:
> OK. Mystery (sort of) solved. After you told me it works for you I had
> to assume the problem was somewhere else. And, indeed, it was, though
> it's not too obvious.

> The two attributes are actually not of tybe bigint, but of type
> "crm_object_id", which is created as follows:

> CREATE DOMAIN "public"."crm_object_id" AS
> bigint NULL;

Ah. The problem is that the UNION's output column is bigint, and the
type discrepancy (bigint above, domain below) discourages the planner
from pushing down the WHERE condition.

There's a related complaint here:
http://archives.postgresql.org/pgsql-bugs/2005-10/msg00227.php

If we were to change things so that the result of the UNION were still
the domain, not plain bigint, then your example would be optimized the
way you want. I'm unsure about what other side-effects that would have
though.

regards, tom lane