how to tune this query.

Lists: pgsql-performance
From: Luckys <plpgsql(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: how to tune this query.
Date: 2006-07-04 16:00:02
Message-ID: b80d582c0607040900i2e1aba6fy35c2acac27427202@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi all,
I got this query, I'm having indexes for PropertyId and Dates columns across
all the tables, but still it takes ages to get me the result. What indexes
would be proposed on this, or I'm helpless?

FROM STG_Property a

FULL OUTER JOIN
STG_PropConfirmedLogs b
ON (a.PropertyId = b.PropertyId AND a.p_LastModified = b.p_Modified_Date
)

FULL OUTER JOIN
STG_PropConnectionFeesLogs c
ON ((a.PropertyId = c.PropertyId AND a.p_LastModified = c.p_ChangedOn)
OR (b.PropertyId = c.PropertyId AND b.p_Modified_Date = c.p_ChangedOn))

FULL OUTER JOIN
STG_PropDeletedLogs d
ON ((a.PropertyId = d.PropertyId AND a.p_LastModified = d.p_DeletedOn)
OR (b.PropertyId = d.PropertyId AND b.p_Modified_Date = d.p_DeletedOn)
OR (c.PropertyId = d.PropertyId AND c.p_ChangedOn = d.p_DeletedOn))

FULL OUTER JOIN
STG_PropFEWALogs e
ON ((a.PropertyId = e.PropertyId AND a.p_LastModified =
e.p_Modified_Date)
OR (b.PropertyId = e.PropertyId AND b.p_Modified_Date =
e.p_Modified_Date) OR (c.PropertyId = e.PropertyId AND c.p_ChangedOn =
e.p_Modified_Date)
OR (d.PropertyId = e.PropertyId AND d.p_DeletedOn = e.p_Modified_Date))

FULL OUTER JOIN
STG_PropInSewerNetworkLogs f
ON ((a.PropertyId = f.PropertyId AND a.p_LastModified =
f.p_Modified_Date)
OR (b.PropertyId = f.PropertyId AND b.p_Modified_Date =
f.p_Modified_Date)
OR (c.PropertyId = f.PropertyId AND c.p_ChangedOn = f.p_Modified_Date)
OR (d.PropertyId = f.PropertyId AND d.p_DeletedOn = f.p_Modified_Date)
OR (e.PropertyId = f.PropertyId AND e.p_Modified_Date =
f.p_Modified_Date)) FULL OUTER JOIN
STG_PropTypeLogs g
ON ((a.PropertyId = g.PropertyId AND a.p_LastModified = g
.p_LastModified)
OR (b.PropertyId = g.PropertyId AND b.p_Modified_Date = g
.p_LastModified)
OR (c.PropertyId = g.PropertyId AND c.p_ChangedOn = g.p_LastModified)
OR (d.PropertyId = g.PropertyId AND d.p_DeletedOn = g.p_LastModified)
OR (e.PropertyId = g.PropertyId AND e.p_Modified_Date = g
.p_LastModified)
OR (f.PropertyId = g.PropertyId AND f.p_Modified_Date = g
.p_LastModified))

-- Luckys


From: Nolan Cafferky <Nolan(dot)Cafferky(at)rbsinteractive(dot)com>
To: Luckys <plpgsql(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: how to tune this query.
Date: 2006-07-04 16:18:43
Message-ID: 44AA94E3.9040301@rbsinteractive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

I don't think indexes are going to help you here - with the FULL OUTER
JOINs, the query will have to look at and include each row from each
table you query from anyway, so it's going to choose sequential scans.
In addition, some of the lower join conditions are going to take forever.

What's is your goal? The volume of data that I imagine this query would
produce can't possibly be useful. I'm guessing at the very least you'll
want to LEFT OUTER JOIN everything back against STG_Property, and leave
the other join conditions out of each ON statement.

Luckys wrote:

> Hi all,
> I got this query, I'm having indexes for PropertyId and Dates columns
> across all the tables, but still it takes ages to get me the result.
> What indexes would be proposed on this, or I'm helpless?
>
> FROM STG_Property a
>
> FULL OUTER JOIN
> STG_PropConfirmedLogs b
> ON (a.PropertyId = b.PropertyId AND a.p_LastModified =
> b.p_Modified_Date)
>
> FULL OUTER JOIN
> STG_PropConnectionFeesLogs c
> ON ((a.PropertyId = c.PropertyId AND a.p_LastModified = c.p_ChangedOn)
> OR (b.PropertyId = c.PropertyId AND b.p_Modified_Date =
> c.p_ChangedOn))
>
> FULL OUTER JOIN
> STG_PropDeletedLogs d
> ON ((a.PropertyId = d.PropertyId AND a.p_LastModified = d.p_DeletedOn)
> OR (b.PropertyId = d.PropertyId AND b.p_Modified_Date =
> d.p_DeletedOn)
> OR ( c.PropertyId = d.PropertyId AND c.p_ChangedOn = d.p_DeletedOn))
>
> FULL OUTER JOIN
> STG_PropFEWALogs e
> ON ((a.PropertyId = e.PropertyId AND a.p_LastModified =
> e.p_Modified_Date)
> OR (b.PropertyId = e.PropertyId AND b.p_Modified_Date =
> e.p_Modified_Date) OR (c.PropertyId = e.PropertyId AND c.p_ChangedOn
> = e.p_Modified_Date)
> OR (d.PropertyId = e.PropertyId AND d.p_DeletedOn =
> e.p_Modified_Date))
>
> FULL OUTER JOIN
> STG_PropInSewerNetworkLogs f
> ON ((a.PropertyId = f.PropertyId AND a.p_LastModified =
> f.p_Modified_Date)
> OR (b.PropertyId = f.PropertyId AND b.p_Modified_Date =
> f.p_Modified_Date)
> OR (c.PropertyId = f.PropertyId AND c.p_ChangedOn =
> f.p_Modified_Date)
> OR (d.PropertyId = f.PropertyId AND d.p_DeletedOn =
> f.p_Modified_Date)
> OR (e.PropertyId = f.PropertyId AND e.p_Modified_Date =
> f.p_Modified_Date)) FULL OUTER JOIN
> STG_PropTypeLogs g
> ON ((a.PropertyId = g .PropertyId AND a.p_LastModified =
> g.p_LastModified)
> OR (b.PropertyId = g.PropertyId AND b.p_Modified_Date =
> g.p_LastModified)
> OR (c.PropertyId = g.PropertyId AND c.p_ChangedOn = g.p_LastModified)
> OR (d.PropertyId = g.PropertyId AND d.p_DeletedOn = g.p_LastModified)
> OR (e.PropertyId = g.PropertyId AND e.p_Modified_Date =
> g.p_LastModified)
> OR (f.PropertyId = g.PropertyId AND f.p_Modified_Date =
> g.p_LastModified))
>
> -- Luckys

--

Nolan Cafferky
Software Developer
IT Department
RBS Interactive
nolan(dot)cafferky(at)rbsinteractive(dot)com


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: Luckys <plpgsql(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: how to tune this query.
Date: 2006-07-07 17:42:16
Message-ID: b42b73150607071042w623615a9ucde3a17c1770c5b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On 7/4/06, Luckys <plpgsql(at)gmail(dot)com> wrote:
>
> Hi all,
> I got this query, I'm having indexes for PropertyId and Dates columns across
> all the tables, but still it takes ages to get me the result. What indexes
> would be proposed on this, or I'm helpless?
>

I would suggest posting your table schemas and describe what you want
the results to look like. After years of following this list, I
regard your query as something of a classic. There simply has to be
an easier way of writing it.

merlin