Re: Tuning/performance issue...

Lists: pgsql-performance
From: David Griffiths <dgriffiths(at)boats(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Tuning/performance issue...
Date: 2003-09-30 20:24:24
Message-ID: 057501c38790$d9b9aad0$6501a8c0@griffiths2
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

We're having a problem with a query during our investigation into
Postgres (as an Oracle replacement). This query Postgres takes 20-40
seconds (multiple runs). Tom Lan recommended I post it here, with an
explain-analyze.

Here's the query:

EXPLAIN ANALYZE SELECT company_name, address_1, address_2, address_3,
city,
address_list.state_province_id, state_province_short_desc, country_desc,
zip_code, address_list.country_id,
contact_info.email, commercial_entity.user_account_id, phone_num_1,
phone_num_fax, website, boats_website
FROM commercial_entity, country, user_account,
address_list LEFT JOIN state_province ON address_list.state_province_id
= state_province.state_province_id
LEFT JOIN contact_info ON address_list.contact_info_id =
contact_info.contact_info_id
WHERE address_list.address_type_id = 101
AND commercial_entity.commercial_entity_id=225528
AND commercial_entity.commercial_entity_id =
address_list.commercial_entity_id
AND address_list.country_id = country.country_id
AND commercial_entity.user_account_id = user_account.user_account_id
AND user_account.user_role_id IN (101, 101);

Here's the explain:

------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------
Nested Loop (cost=0.00..64570.33 rows=1 width=385) (actual
time=42141.08..42152.06 rows=1 loops=1)
-> Nested Loop (cost=0.00..64567.30 rows=1 width=361) (actual
time=42140.80..42151.77 rows=1 loops=1)
-> Nested Loop (cost=0.00..64563.97 rows=1 width=349) (actual
time=42140.31..42151.27 rows=1 loops=1)
Join Filter: ("outer".commercial_entity_id =
"inner".commercial_entity_id)
-> Index Scan using commercial_entity_pkey on
commercial_entity (cost=0.00..5.05 rows=1 width=94) (actual
time=0.57..0.58 rows=1 loops=1)
Index Cond: (commercial_entity_id =
225528::numeric)
-> Materialize (cost=63343.66..63343.66 rows=97221
width=255) (actual time=41741.96..41901.17 rows=90527 loops=1)
-> Merge Join (cost=0.00..63343.66 rows=97221
width=255) (actual time=1.44..41387.68 rows=90527 loops=1)
Merge Cond: ("outer".contact_info_id =
"inner".contact_info_id)
-> Nested Loop (cost=0.00..830457.52
rows=97221 width=222) (actual time=0.95..39178.32 rows=90527 loops=1)
Join Filter: ("outer".state_province_id
= "inner".state_province_id)
-> Index Scan using addr_list_ci_id_i
on address_list (cost=0.00..586676.65 rows=97221 width=205) (actual
time=0.49..2159.90 rows=90527 loops=1)
Filter: (address_type_id =
101::numeric)
-> Seq Scan on state_province
(cost=0.00..1.67 rows=67 width=17) (actual time=0.00..0.21 rows=67
loops=90527)
-> Index Scan using contact_info_pkey on
contact_info (cost=0.00..3366.76 rows=56435 width=33) (actual
time=0.44..395.75 rows=55916 loops=1)
-> Index Scan using user_account_pkey on user_account
(cost=0.00..3.32 rows=1 width=12) (actual time=0.46..0.46 rows=1
loops=1)
Index Cond: ("outer".user_account_id =
user_account.user_account_id)
Filter: (user_role_id = 101::numeric)
-> Index Scan using country_pkey on country (cost=0.00..3.01 rows=1
width=24) (actual time=0.25..0.25 rows=1 loops=1)
Index Cond: ("outer".country_id = country.country_id)
Total runtime: 42165.44 msec
(21 rows)


I will post the schema in a seperate email - the list has rejected one
big email 3 times now.

David


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Griffiths <dgriffiths(at)boats(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Tuning/performance issue...
Date: 2003-10-01 04:28:50
Message-ID: 25573.1064982530@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

David Griffiths <dgriffiths(at)boats(dot)com> writes:
> ... FROM commercial_entity, country, user_account,
> address_list LEFT JOIN state_province ON address_list.state_province_id
> = state_province.state_province_id
> LEFT JOIN contact_info ON address_list.contact_info_id =
> contact_info.contact_info_id
> WHERE ...

I believe what you're getting burnt by is that PG's planner interprets
this as forcing the address_list * state_province * contact_info join
to be done before it joins those tables to commercial_entity, country,
and user_account --- for discussion see
http://www.postgresql.org/docs/7.3/static/explicit-joins.html

Unfortunately your WHERE-clause restriction conditions are on
address_list, commercial_entity, and user_account; and it seems the
address_list constraint is very weak. So the plan ends up forming a
large fraction of the address_list * state_province * contact_info join,
only to throw it away again when there's no matching rows selected from
commercial_entity and user_account. The actual runtime and actual row
counts from the EXPLAIN ANALYZE output show that this is what's
happening.

The most efficient way to handle this query would probably be to join
the three tables with restrictions first, and then join the other tables
to those. You could force this with not too much rewriting using
something like (untested, but I think it's right)

... FROM commercial_entity CROSS JOIN user_account CROSS JOIN
address_list LEFT JOIN state_province ON address_list.state_province_id
= state_province.state_province_id
LEFT JOIN contact_info ON address_list.contact_info_id =
contact_info.contact_info_id
CROSS JOIN country
WHERE ...

The explicit JOINs associate left-to-right, so this gives the intended
join order. (In your original query, explicit JOIN binds more tightly
than commas do.)

The reason PG's planner doesn't discover this join order for itself
is that it's written to not attempt to re-order outer joins from the
syntactically defined ordering. In general, such reordering would
change the results. It is possible to analyze the query and prove that
certain reorderings are valid (don't change the results), but we don't
currently have code to do that.

> As a reference, our production Oracle database (exactly the same
> hardware, but RAID-mirroring) with way more load can handle the query in
> 1-2 seconds. I have MySQL 4.0.14 with InnoDB on the same machine
> (shutdown when I am testing Postgres, and visa versa) and it does the
> query in 0.20 seconds.

I'm prepared to believe that Oracle contains code that actually does the
analysis about which outer-join reorderings are valid, and is then able
to find the right join order by deduction. The last I heard about
MySQL, they have no join-order analysis at all; they unconditionally
interpret this type of query left-to-right, ie as

... FROM ((((commercial_entity CROSS JOIN country) CROSS JOIN
user_account) CROSS JOIN address_list)
LEFT JOIN state_province ON ...)
LEFT JOIN contact_info ON ...
WHERE ...

This is clearly at odds with the SQL spec's syntactically defined join
order semantics. It's possible that it always yields the same results
as the spec requires, but I'm not at all sure about that. In any case
this strategy is certainly not "better" than ours, it just performs
poorly on a different set of queries. Would I be out of line to
speculate that your query was previously tuned to work well in MySQL?

regards, tom lane


From: David Griffiths <dgriffiths(at)boats(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Tuning/performance issue...
Date: 2003-10-01 05:48:54
Message-ID: 074601c387df$b5939270$6501a8c0@griffiths2
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

> The most efficient way to handle this query would probably be to join
> the three tables with restrictions first, and then join the other tables
> to those. You could force this with not too much rewriting using
> something like (untested, but I think it's right)
>
> ... FROM commercial_entity CROSS JOIN user_account CROSS JOIN
> address_list LEFT JOIN state_province ON address_list.state_province_id
> = state_province.state_province_id
> LEFT JOIN contact_info ON address_list.contact_info_id =
> contact_info.contact_info_id
> CROSS JOIN country
> WHERE ...
>
> The explicit JOINs associate left-to-right, so this gives the intended
> join order. (In your original query, explicit JOIN binds more tightly
> than commas do.)

Ok - that's interesting - I'll have to do some reading and more testing.

> The reason PG's planner doesn't discover this join order for itself
> is that it's written to not attempt to re-order outer joins from the
> syntactically defined ordering. In general, such reordering would
> change the results. It is possible to analyze the query and prove that
> certain reorderings are valid (don't change the results), but we don't
> currently have code to do that.

Not sure I follow. Are you saying that, depending on when the outer-join is
applied to the rows found at the time, you may end up with a different set
of rows? I would have expected the optimizer to do the outer-joins last, as
the extra data received by the outer-joins is not mandatory, and won't
affect
the rows that were retreived by joining user_account, address_list, and
commercial_entity.

An outer join would *never* be the most restrictive
join in a query. I thought (from my readings on Oracle query tuning) that
finding the most restrictive table/index was the first task of an optimizer.
Reduce the result set as quickly as possible. That query has the line,

"AND commercial_entity.commercial_entity_id=225528",

which uses an index (primary key) and uses an "=". I would have expected
that to be done first, then joined with the other inner-join tables, and
finally
have the outer-joins applied to the final result set to fill in the "might
be there" data.

Anyway, if the optimizer does the outer-joins first (address_list with
state_province
and contact_info), then it's picking the table with the most rows
(address_list has
200K+ rows, where the other 3 big tables have 70K-90K). Would re-ordering
the FROM clause (and LEFT JOIN portions) help?

Could you give an example where applying an outer-join at a different time
could
result in different results? I think I can see at situation where you use
part of the results
in the outer-join in the where clause, but I am not sure.

> I'm prepared to believe that Oracle contains code that actually does the
> analysis about which outer-join reorderings are valid, and is then able
> to find the right join order by deduction.

I'm not sure about Oracle (other than what I stated above). In fact, about
half
the time, updating table stats to try to get the Oracle optimizer to do a
better
job on a query results in even worse performance.

> ... FROM ((((commercial_entity CROSS JOIN country) CROSS JOIN
> user_account) CROSS JOIN address_list)
> LEFT JOIN state_province ON ...)
> LEFT JOIN contact_info ON ...
> WHERE ...
>
> This is clearly at odds with the SQL spec's syntactically defined join
> order semantics. It's possible that it always yields the same results
> as the spec requires, but I'm not at all sure about that.

Again, I don't know. On the 3 queries based on these tables, Postgres
and MySQL return the exact same data (they use the same data set).

Do you have a link to the SQL spec's join-order requirements?

> In any case
> this strategy is certainly not "better" than ours, it just performs
> poorly on a different set of queries. Would I be out of line to
> speculate that your query was previously tuned to work well in MySQL?

The query was pulled from our codebase (written for Oracle). I added a bit
to it
to make it slower, and then ported to MySQL and tested there first (just
re-wrote
the outer-join syntax). I found that re-ordering the tables in the
from-clause on
MySQL changed the time by 45-ish% (0.36 seconds to .20 seconds), but that's
because I had forgotten to re-analyze the tables after refreshing the
dataset.
Now, table order doesn't make a difference in speed (or results).

If anything, I've done more tuning for Postgres - added some extra indexes
to try to help
(country.country_id had a composite index with another column, but not an
index for
just it), etc.

The dataset and schema is pure-Oracle. I extracted it out of the database,
removed all
Oracle-specific extensions, changed the column types, and migrated the
indexes and
foreign keys to MySQL and Postgres. Nothing more (other than an extra index
or two for Postgres - nada for MySQL).

This is all part of a "migrate away from Oracle" project. We are looking at
3 databases -
MySQL (InnoDB), Postgres and Matisse (object oriented). We have alot of
queries like this
or worse, and I'm worried that many of them would need to be re-written. The
developers
know SQL, but nothing about tuning, etc.

Thanks for the quick response - I will try explicit joining, and I'm looking
forward to
your comments on outer-joins and the optmizer (and anything else I've
written).

David.


From: Jeff <threshar(at)torgo(dot)978(dot)org>
To: David Griffiths <dgriffiths(at)boats(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Tuning/performance issue...
Date: 2003-10-01 12:23:10
Message-ID: Pine.BSF.4.44.0310010817001.22676-100000@torgo.978.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Tue, 30 Sep 2003, David Griffiths wrote:

>
> This is all part of a "migrate away from Oracle" project. We are looking at
> 3 databases -
> MySQL (InnoDB), Postgres and Matisse (object oriented). We have alot of
> queries like this
> or worse, and I'm worried that many of them would need to be re-written. The
> developers
> know SQL, but nothing about tuning, etc.
>

There's a movement at my company to ditch several commercial db's in favor
of a free one. I'm currently the big pg fan around here and I've actually
written a rather lengthy presentation about pg features, why, tuning, etc.
but another part was some comparisons to other db's..

I decided so I wouldn't be blinding flaming mysql to give it a whirl and
loaded it up with the same dataset as pg. First thing I hit was lack of
stored procedures. But I decided to code around that, giving mysql the
benefit of the doubt. What I found was interesting.

For 1-2 concurrent
'beaters' it screamed. ultra-fast. But.. If you increase the concurrent
beaters up to say, 20 Mysql comes to a grinding halt.. Mysql and the
machine itself become fairly unresponsive. And if you do cache unfriendly
queries it becomes even worse. On PG - no problems at all. Scaled fine
and dandy up. And with 40 concurrent beaters the machine was still
responsive. (The numbers for 20 client was 220 seconds (pg) and 650
seconds (mysql))

So that is another test to try out - Given your configuration I expect you
have lots of concurrent activity.

--
Jeff Trout <jeff(at)jefftrout(dot)com>
http://www.jefftrout.com/
http://www.stuarthamm.net/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Griffiths <dgriffiths(at)boats(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Tuning/performance issue...
Date: 2003-10-01 14:14:26
Message-ID: 28832.1065017666@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

David Griffiths <dgriffiths(at)boats(dot)com> writes:
>> The reason PG's planner doesn't discover this join order for itself
>> is that it's written to not attempt to re-order outer joins from the
>> syntactically defined ordering. In general, such reordering would
>> change the results. It is possible to analyze the query and prove that
>> certain reorderings are valid (don't change the results), but we don't
>> currently have code to do that.

> Not sure I follow. Are you saying that, depending on when the outer-join is
> applied to the rows found at the time, you may end up with a different set
> of rows?

Here's an example showing that it's not always safe to rearrange join
order in the presence of outer joins:

jtest=# create table a (f1 int);
CREATE TABLE
jtest=# create table b (f1 int, f2 int);
CREATE TABLE
jtest=# create table c(f1 int, f2 int);
CREATE TABLE
jtest=# insert into a values (1);
INSERT 431307 1
jtest=# insert into b values (10,10);
INSERT 431308 1
jtest=# insert into b values (11,11);
INSERT 431309 1
jtest=# insert into c values (1,10);
INSERT 431310 1
jtest=# insert into c values (2,11);
INSERT 431311 1

jtest=# SELECT * FROM a, b LEFT JOIN c ON b.f2 = c.f2 WHERE a.f1 = c.f1;
f1 | f1 | f2 | f1 | f2
----+----+----+----+----
1 | 10 | 10 | 1 | 10
(1 row)

Per spec the JOIN operator binds more tightly than comma, so this is
equivalent to:

jtest=# SELECT * FROM a JOIN (b LEFT JOIN c ON b.f2 = c.f2) ON a.f1 = c.f1;
f1 | f1 | f2 | f1 | f2
----+----+----+----+----
1 | 10 | 10 | 1 | 10
(1 row)

Now suppose we try to join A and C before joining to B:

jtest=# SELECT * FROM b LEFT JOIN (a join c ON a.f1 = c.f1) ON b.f2 = c.f2;
f1 | f2 | f1 | f1 | f2
----+----+----+----+----
10 | 10 | 1 | 1 | 10
11 | 11 | | |
(2 rows)

We get a different answer, because some C rows are eliminated before
reaching the left join, causing null-extended B rows to be added.

(I don't have a MySQL installation here to try, but if they still work
the way they used to, they get the wrong answer on the first query.)

The point of this example is just that there are cases where it'd be
incorrect for the planner to change the ordering of joins from what
is implied by the query syntax. It is always safe to change the join
order when only inner joins are involved. There are cases where outer
join order is safe to change too, but you need analysis code that checks
the query conditions to prove that a particular rearrangement is safe.
Right now, we don't have such code, and so we just follow the simple
rule "never rearrange any outer joins".

> I would have expected the optimizer to do the outer-joins last, as the
> extra data received by the outer-joins is not mandatory, and won't
> affect the rows that were retreived by joining user_account,
> address_list, and commercial_entity.

I think your example falls into the category of provably-safe
rearrangements ... but as I said, the planner doesn't know that.

> An outer join would *never* be the most restrictive
> join in a query.

Sure it can, if the restriction conditions are mainly on the outer
join's tables. But that's not really the issue here. As best I can
tell without seeing your data statistics, the most restrictive
conditions in your query are the ones on
commercial_entity.commercial_entity_id and user_account.user_role_id.
The trick is to apply those before joining any other tables.

regards, tom lane