Re: help with query

From: Jean-Luc Lachance <jllachan(at)sympatico(dot)ca>
To: pg(at)fastcrypt(dot)com
Cc: Brad Bulger <brad(at)madfish(dot)com>, perform <pgsql-performance(at)postgresql(dot)org>
Subject: Re: help with query
Date: 2004-08-19 14:22:45
Message-ID: 4124B7B5.6050409@sympatico.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

how about:

SELECT distinct main.oid,main.* FROM Tickets main
WHERE main.EffectiveId = main.id
AND main.Status != 'deleted'
AND ( main.Type = 'ticket' OR main.Type = 'subticket' )
AND ( main.Queue = '9' )
AND ( main.id = '17417'
OR main.id IN (
SELECT DISTINCT LocalTarget from Links
where Type = 'MemberOf' and LocalTarget = '17417')
OR main.id IN (
SELECT DISTINCT LocalBase from Links
where Type = 'MemberOf' and LocalTarget = '17417'))

Dave Cramer wrote:

> Brad,
>
> Thanks, that runs on the same order of magnitude as the subqueries.
>
> DAve
> On Thu, 2004-08-19 at 09:38, Brad Bulger wrote:
>
>>You're doing a join except not, is the trouble, looks like. The query is really
>>"FROM Tickets main, Links", but when Tickets.id is 17417, you've got no join
>>to the Links table. So you end up getting every row in Links for each row in
>>Tickets with id = 17417.
>>
>>I'd think this wants to be two queries or a union:
>>
>>SELECT distinct main.oid,main.* FROM Tickets main
>>WHERE (main.EffectiveId = main.id)
>>AND (main.Status != 'deleted')
>>AND ( (main.Type = 'ticket') OR (main.Type = 'subticket') )
>>AND ( (main.Queue = '9') )
>>AND ( (main.id = '17417'))
>>union
>>SELECT distinct main.oid,main.* FROM Tickets main, Links
>>WHERE (main.EffectiveId = main.id)
>>AND (main.Status != 'deleted')
>>AND ( (main.Type = 'ticket') OR (main.Type = 'subticket') )
>>AND ( (main.Queue = '9') )
>>AND ( (Links.Type = 'MemberOf') )
>>AND ( (Links.LocalTarget = '17417') )
>>AND ( (main.id = Links.LocalBase) ) OR (main.id = Links.LocalTarget) )
>>;
>>
>>or else, yah, a subquery:
>>
>>[...]
>>AND (
>> main.id = '17417'
>> or
>> exists(
>> select true from Links
>> where Type = 'MemberOf' and LocalTarget = '17417'
>> and (LocalBase = main.id or LocalTarget = main.id)
>> )
>>)
>>
>>Those are the only things I can think of to make it work, anyways.
>>
>>Dave Cramer wrote:
>>
>>
>>>RT uses a query like:
>>>
>>>SELECT distinct main.oid,main.* FROM Tickets main
>>>WHERE
>>>(main.EffectiveId = main.id)
>>>AND
>>>(main.Status != 'deleted')
>>>AND
>>> ( (main.Type = 'ticket') OR (main.Type = 'subticket') )
>>>AND
>>> ( (main.Queue = '9') )
>>>AND ((
>>> ( (Links.Type = 'MemberOf') AND (Links.LocalTarget = '17417') AND (main.id = Links.LocalBase) )
>>> OR
>>> ( (Links.Type = 'MemberOf') AND (Links.LocalBase = '17417') AND (main.id = Links.LocalTarget) )
>>> or
>>> (main.id = '17417')
>>> )
>>> );
>>>
>>>
>>>which produces a query plan:
>>>
>>>Nested Loop (cost=0.00..813.88 rows=1 width=169)
>>> Join Filter: (((("inner"."type")::text = 'MemberOf'::text) OR ("outer".id = 17417)) AND (("inner".localbase = 17417) OR (("inner"."type")::text = 'MemberOf'::text) OR ("outer".id
>>>= 17417)) AND (("outer".id = "inner".localtarget) OR (("inner"."type")::text = 'MemberOf'::text) OR ("outer".id = 17417)) AND ((("inner"."type")::text = 'MemberOf'::text) OR ("inner"
>>>.localtarget = 17417) OR ("outer".id = 17417)) AND (("inner".localbase = 17417) OR ("inner".localtarget = 17417) OR ("outer".id = 17417)) AND (("outer".id = "inner".localtarget) OR (
>>>"inner".localtarget = 17417) OR ("outer".id = 17417)) AND ((("inner"."type")::text = 'MemberOf'::text) OR ("outer".id = "inner".localbase) OR ("outer".id = 17417)) AND (("inner".loca
>>>lbase = 17417) OR ("outer".id = "inner".localbase) OR ("outer".id = 17417)) AND (("outer".id = "inner".localtarget) OR ("outer".id = "inner".localbase) OR ("outer".id = 17417)))
>>> -> Index Scan using tickets1 on tickets main (cost=0.00..657.61 rows=1 width=169)
>>> Index Cond: (queue = 9)
>>> Filter: ((effectiveid = id) AND ((status)::text <> 'deleted'::text) AND ((("type")::text = 'ticket'::text) OR (("type")::text = 'subticket'::text)))
>>> -> Seq Scan on links (cost=0.00..46.62 rows=1462 width=20)
>>>
>>>If I rewrite the query as:
>>>
>>>SELECT main.* FROM Tickets main
>>>WHERE
>>>(main.EffectiveId = main.id)
>>>AND
>>>(main.Status != 'deleted')
>>>AND
>>> ( (main.Type = 'ticket') OR (main.Type = 'subticket') )
>>>AND
>>> ( (main.Queue = '9') )
>>>AND (
>>> 17417 in (select links.localtarget from links where links.type='MemberOf' and main.id=links.localbase)
>>> or
>>> 17417 in ( select links.localbase from links where links.type='MemberOf' and main.id=links.localtarget)
>>> or
>>> main.id = '17417'
>>> )
>>> ;
>>>
>>>The time for the query goes from 1500ms to 15ms. The two OR clauses
>>>
>>> ( (Links.Type = 'MemberOf') AND (Links.LocalTarget = '17417') AND (main.id = Links.LocalBase) )
>>> OR
>>> ( (Links.Type = 'MemberOf') AND (Links.LocalBase = '17417') AND (main.id = Links.LocalTarget) )
>>>
>>>don't contribute to the result set in this particular dataset, which is why the speed increases so dramatically.
>>>
>>>Is there a way to rewrite the top query to get the same results? I have already talked to Best Practical,
>>>and subqueries are not easily embraced.
>>>
>>>Dave
>>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2004-08-19 16:18:31 Re: Help specifying new machine
Previous Message Dave Cramer 2004-08-19 14:17:51 Re: help with query