Re: Speed of exist

Lists: pgsql-performance
From: Bastiaan Olij <bastiaan(at)basenlily(dot)me>
To: Postgres performance mailing list <pgsql-performance(at)postgresql(dot)org>
Subject: Speed of exist
Date: 2013-02-19 06:34:56
Message-ID: 51231D10.1070101@basenlily.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi All,

Hope someone can help me a little bit here:

I've got a query like the following:
--
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table2 where Table2.ForeignKey =
Table1.PrimaryKey)
or exists (select 1 from Table3 where Table3.ForeignKey = Table1.PrimaryKey)
--

Looking at the query plan it is doing a sequential scan on both Table2
and Table3.

If I remove one of the subqueries and turn the query into:
--
select Column1, Column2, Column3
from Table1
where exists (select 1 from Table2 where Table2.ForeignKey =
Table1.PrimaryKey)
--

It is nicely doing an index scan on the index that is on Table2.ForeignKey.

As Table2 and Table3 are rather large the first query takes minutes
while the second query takes 18ms.

Is there a way to speed this up or an alternative way of selecting
records from Table1 which have related records in Table2 or Table3 which
is faster?

Kindest Regards,

Bastiaan Olij


From: Andy <andy(dot)gumbrecht(at)orprovision(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Speed of exist
Date: 2013-02-19 07:31:02
Message-ID: 51232A36.6000701@orprovision.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Limit the sub-queries to 1, i.e. :

select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey fetch first 1 rows only

Andy.

On 19.02.2013 07:34, Bastiaan Olij wrote:
> Hi All,
>
> Hope someone can help me a little bit here:
>
> I've got a query like the following:
> --
> select Column1, Column2, Column3
> from Table1
> where exists (select 1 from Table2 where Table2.ForeignKey =
> Table1.PrimaryKey)
> or exists (select 1 from Table3 where Table3.ForeignKey = Table1.PrimaryKey)
> --
>
> Looking at the query plan it is doing a sequential scan on both Table2
> and Table3.
>
> If I remove one of the subqueries and turn the query into:
> --
> select Column1, Column2, Column3
> from Table1
> where exists (select 1 from Table2 where Table2.ForeignKey =
> Table1.PrimaryKey)
> --
>
> It is nicely doing an index scan on the index that is on Table2.ForeignKey.
>
> As Table2 and Table3 are rather large the first query takes minutes
> while the second query takes 18ms.
>
> Is there a way to speed this up or an alternative way of selecting
> records from Table1 which have related records in Table2 or Table3 which
> is faster?
>
> Kindest Regards,
>
> Bastiaan Olij
>
>
>

--
------------------------------------------------------------------------------------------------------------------------

*Andy Gumbrecht*
Research & Development
Orpro Vision GmbH
Hefehof 24, 31785, Hameln

+49 (0) 5151 809 44 21
+49 (0) 1704 305 671
andy(dot)gumbrecht(at)orprovision(dot)com
www.orprovision.com

Orpro Vision GmbH
Sitz der Gesellschaft: 31785, Hameln
USt-Id-Nr: DE264453214
Amtsgericht Hannover HRB204336
Geschaeftsfuehrer: Roberto Gatti, Massimo Gatti, Adam Shaw

------------------------------------------------------------------------------------------------------------------------

Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige
Adressat sind oder diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und
vernichten Sie diese Mail. Das unerlaubte Kopieren, jegliche anderweitige Verwendung sowie die unbefugte
Weitergabe dieser Mail ist nicht gestattet.

------------------------------------------------------------------------------------------------------------------------

This e-mail may contain confidential and/or privileged information. If you are not the intended recipient
(or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure, distribution or other use of the material or parts thereof is strictly
forbidden.

------------------------------------------------------------------------------------------------------------------------


From: Bastiaan Olij <bastiaan(at)basenlily(dot)me>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Speed of exist
Date: 2013-02-19 07:36:48
Message-ID: 51232B90.2080504@basenlily.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi Andy,

I've tried that with the same result. One subquery works beautifully,
two subqueries with an OR and it starts to do a sequential scan...

Thanks,

Bastiaan Olij

On 19/02/13 6:31 PM, Andy wrote:
> Limit the sub-queries to 1, i.e. :
>
> select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey fetch
> first 1 rows only
>
> Andy.
>
> On 19.02.2013 07:34, Bastiaan Olij wrote:
>> Hi All,
>>
>> Hope someone can help me a little bit here:
>>
>> I've got a query like the following:
>> --
>> select Column1, Column2, Column3
>> from Table1
>> where exists (select 1 from Table2 where Table2.ForeignKey =
>> Table1.PrimaryKey)
>> or exists (select 1 from Table3 where Table3.ForeignKey =
>> Table1.PrimaryKey)
>> --
>>
>> Looking at the query plan it is doing a sequential scan on both Table2
>> and Table3.
>>
>> If I remove one of the subqueries and turn the query into:
>> --
>> select Column1, Column2, Column3
>> from Table1
>> where exists (select 1 from Table2 where Table2.ForeignKey =
>> Table1.PrimaryKey)
>> --
>>
>> It is nicely doing an index scan on the index that is on
>> Table2.ForeignKey.
>>
>> As Table2 and Table3 are rather large the first query takes minutes
>> while the second query takes 18ms.
>>
>> Is there a way to speed this up or an alternative way of selecting
>> records from Table1 which have related records in Table2 or Table3 which
>> is faster?
>>
>> Kindest Regards,
>>
>> Bastiaan Olij
>>
>>
>>
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Bastiaan Olij <bastiaan(at)basenlily(dot)me>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Speed of exist
Date: 2013-02-19 07:39:31
Message-ID: CAFj8pRAUp0SNc5Ur-aYaUpDUXjjRPxC0TozRPwGZ3v4fZFEEqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

2013/2/19 Bastiaan Olij <bastiaan(at)basenlily(dot)me>:
> Hi Andy,
>
> I've tried that with the same result. One subquery works beautifully,
> two subqueries with an OR and it starts to do a sequential scan...

try to rewrite OR to two SELECTs joined by UNION ALL

Pavel

>
> Thanks,
>
> Bastiaan Olij
>
> On 19/02/13 6:31 PM, Andy wrote:
>> Limit the sub-queries to 1, i.e. :
>>
>> select 1 from Table2 where Table2.ForeignKey = Table1.PrimaryKey fetch
>> first 1 rows only
>>
>> Andy.
>>
>> On 19.02.2013 07:34, Bastiaan Olij wrote:
>>> Hi All,
>>>
>>> Hope someone can help me a little bit here:
>>>
>>> I've got a query like the following:
>>> --
>>> select Column1, Column2, Column3
>>> from Table1
>>> where exists (select 1 from Table2 where Table2.ForeignKey =
>>> Table1.PrimaryKey)
>>> or exists (select 1 from Table3 where Table3.ForeignKey =
>>> Table1.PrimaryKey)
>>> --
>>>
>>> Looking at the query plan it is doing a sequential scan on both Table2
>>> and Table3.
>>>
>>> If I remove one of the subqueries and turn the query into:
>>> --
>>> select Column1, Column2, Column3
>>> from Table1
>>> where exists (select 1 from Table2 where Table2.ForeignKey =
>>> Table1.PrimaryKey)
>>> --
>>>
>>> It is nicely doing an index scan on the index that is on
>>> Table2.ForeignKey.
>>>
>>> As Table2 and Table3 are rather large the first query takes minutes
>>> while the second query takes 18ms.
>>>
>>> Is there a way to speed this up or an alternative way of selecting
>>> records from Table1 which have related records in Table2 or Table3 which
>>> is faster?
>>>
>>> Kindest Regards,
>>>
>>> Bastiaan Olij
>>>
>>>
>>>
>>
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance


From: Bastiaan Olij <bastiaan(at)basenlily(dot)me>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Speed of exist
Date: 2013-02-19 10:18:55
Message-ID: 5123518F.30206@basenlily.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi Pavel,

That is what I've done in this particular case but there are parts where
I use exist checks in this way that are very cumbersome to write out
like that so I'm hoping there is a way to make the optimizer work with
existence checks in this way.

Cheers,

Bastiaan Olij

On 19/02/13 6:39 PM, Pavel Stehule wrote:
> 2013/2/19 Bastiaan Olij <bastiaan(at)basenlily(dot)me>:
>> Hi Andy,
>>
>> I've tried that with the same result. One subquery works beautifully,
>> two subqueries with an OR and it starts to do a sequential scan...
> try to rewrite OR to two SELECTs joined by UNION ALL
>
> Pavel
>