Re: Performance Optimization for Dummies 2 - the SQL

From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Carlo Stonebanks" <stonec(dot)register(at)sympatico(dot)ca>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance Optimization for Dummies 2 - the SQL
Date: 2006-10-05 13:30:45
Message-ID: b42b73150610050630t76e9eedeh45a382729d35f2f6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 10/5/06, Carlo Stonebanks <stonec(dot)register(at)sympatico(dot)ca> wrote:
> > do we have an multi-column index on
> > facility_address(facility_id, address_id)? did you run analyze?
>
> There is an index on facility_address on facility_id.
>
> I didn't create an index on facility_address.address_id because I expected
> joins to go in the other direction (from facility_address to address).
> Nor did I create a multi-column index on facility_id, address_id because I
> had yet to come up with a query that required that.

right. well, since you are filtering on address, I would consider
added an index on address_id or a multi column on address_id,
facility_id (in addition to facility_id). also, I'd consider removing
all the explicit joins like this:

explain analyze select
f.facility_id,
fa.facility_address_id,
a.address_id,
f.facility_type_code,
f.name,
a.address,
a.city,
a.state_code,
a.postal_code,
a.country_code
from
mdx_core.facility f,
mdx_core.facility_address fa,
mdx_core.address a
where
fa.facility_id = f.facility_id and
a.address_id = fa.address_id and
a.country_code = 'US' and
a.state_code = 'IL' and
a.postal_code like '60640-5759'||'%'
order by facility_id;

yet another way to write that where clause is:

(fa_address_id, fa.facility_id) = (a.address_id, f.facility_id) and
a.country_code = 'US' and
a.state_code = 'IL' and
a.postal_code like '60640-5759'||'%'
order by facility_id;

I personally only use explicit joins when doing outer joins and even
them push them out as far as possible.

I like the row constructor style better because it shows the key
relationships more clearly. I don't think it makes a difference in
execution (go ahead and try it). If you do make a multi column key on
facility_address, though, make sure to put they key fields in left to
right order in the row constructor. Try adding a multi key on
address_id and facility_id and run it this way. In a proper design
you would have a primary key on these fields but with imported data
you obviously have to make compromises :).

> However, I still have a lot to learn about how SQL chooses its indexes, how
> multi-column indexes are used, and when to use them (other than the
> obvious - i.e. sort orders or relational expressions which request those
> columns in one search expression)

well, it's kind of black magic but if the database is properly laid
out the function usually follows form pretty well.

> Analyse is actually run every time a page of imported data loads into the
> client program. This is currently set at 500 rows.

ok.

merlin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Carlo Stonebanks 2006-10-05 16:29:42 Re: Performance Optimization for Dummies 2 - the SQL
Previous Message Tom Lane 2006-10-05 13:24:35 Re: Performance Optimization for Dummies 2 - the SQL