Query performance. 7.2.3 Vs. 7.3

Lists: pgsql-hackers
From: wade <wade(at)wavefire(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Query performance. 7.2.3 Vs. 7.3
Date: 2002-11-29 01:04:37
Message-ID: 3.0.32.20021128170437.015d7b20@mail.wavefire.com

While playing with one of my DBs under 7.3 to make use of its better
explain features, I came across a query that runs significantly slower
under 7.3 than under 7.2.3. At first, I thought it might be a hardware
issue, so I installed both versions on the same box.
7.2.3 tends to run the query in 80% of the time 7.3 does.
Explain output can be found at http://arch.wavefire.com/72v73a.txt

Please don't hesitate to drop me a line if you require more info.
-Wade Klaver


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: wade <wade(at)wavefire(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query performance. 7.2.3 Vs. 7.3
Date: 2002-11-29 02:23:00
Message-ID: 18853.1038536580@sss.pgh.pa.us

wade <wade(at)wavefire(dot)com> writes:
> While playing with one of my DBs under 7.3 to make use of its better
> explain features, I came across a query that runs significantly slower
> under 7.3 than under 7.2.3. At first, I thought it might be a hardware
> issue, so I installed both versions on the same box.
> 7.2.3 tends to run the query in 80% of the time 7.3 does.
> Explain output can be found at http://arch.wavefire.com/72v73a.txt

The difference evidently is that 7.3 chooses a mergejoin where 7.2
picks a hashjoin.

AFAICT this must be a consequence of the reduction in mergejoin
estimated costs associated with this patch:

    2002-02-28 23:09  tgl

        * src/: backend/executor/nodeMergejoin.c,
        backend/optimizer/path/costsize.c, backend/utils/adt/selfuncs.c,
        backend/utils/cache/lsyscache.c, include/utils/lsyscache.h,
        include/utils/selfuncs.h: Teach planner about the idea that a
        mergejoin won't necessarily read both input streams to the end. If
        one variable's range is much less than the other, an
        indexscan-based merge can win by not scanning all of the other
        table. Per example from Reinhard Max.

since we really didn't do anything else in 7.3 that changed the behavior
of costsize.c.
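
To illustrate the idea behind that patch, here is a minimal Python sketch
of a merge join over two sorted inputs. It is not PostgreSQL's executor
code: the function name merge_join, the assumption of unique keys on both
sides, and the sample data are all made up for illustration. It only counts
how many tuples are fetched from each input before the join can stop.

```python
def merge_join(left, right):
    """Merge-join two sorted sequences of unique, non-None keys.

    Returns (matches, left_reads, right_reads), where the read counts
    are the number of tuples actually fetched from each input.
    """
    left_it, right_it = iter(left), iter(right)
    reads = {"left": 0, "right": 0}

    def fetch(it, side):
        # Pull the next tuple; None signals that the input is exhausted.
        value = next(it, None)
        if value is not None:
            reads[side] += 1
        return value

    l = fetch(left_it, "left")
    r = fetch(right_it, "right")
    matches = []
    # The join ends as soon as EITHER input runs out, so the other
    # input is never scanned past that point.
    while l is not None and r is not None:
        if l < r:
            l = fetch(left_it, "left")
        elif l > r:
            r = fetch(right_it, "right")
        else:
            matches.append(l)
            l = fetch(left_it, "left")
            r = fetch(right_it, "right")
    return matches, reads["left"], reads["right"]


# Hypothetical data: one input covers a narrow key range, the other a wide one.
small = [10, 20, 30]
big = list(range(1, 1001))
matches, small_reads, big_reads = merge_join(small, big)
print(matches, small_reads, big_reads)
```

With the small input ending at key 30, the scan of the larger input stops
after 31 of its 1000 rows. That saving is what the planner now tries to
account for, and if the key ranges are misjudged it can make a mergejoin
look cheaper than it really is.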

I can't get totally excited about a 20% estimation error (if the planner
was never off by more than that, I'd be overjoyed ;-)) ... but if you
want to dig into the statistics and try to figure out why this added
logic is misestimating in your particular case, I'd be interested to
hear. Probably the first thing to look at is why the estimated row
counts are off by almost a factor of 3 for that join.

regards, tom lane


From: Neil Conway <neilc(at)samurai(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: wade <wade(at)wavefire(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query performance. 7.2.3 Vs. 7.3
Date: 2002-11-29 02:38:59
Message-ID: 1038537539.379.61.camel@tokyo

On Thu, 2002-11-28 at 21:23, Tom Lane wrote:
> wade <wade(at)wavefire(dot)com> writes:
> > Explain output can be found at http://arch.wavefire.com/72v73a.txt
>
> The difference evidently is that 7.3 chooses a mergejoin where 7.2
> picks a hashjoin.

I was looking at this a bit in IRC, and I was more concerned by the fact
that 7.3 was 20% slower than 7.2 on the same hardware, when they both used
the same query plan (consider the data at the end of the URL above, after
the execution of 'SET enable_mergejoin = off;').

Also, is it expected that the cardinality estimates for join steps won't
be very accurate? (estimated: 19 rows, actual: 765 rows)
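
For reference, the figures quoted in this thread can be put on a common
scale with a little arithmetic. This is only a sketch: the helper names
below are hypothetical, and the inputs are the 80% runtime ratio from
Wade's message and the row counts given above.

```python
def extra_time(fast_fraction):
    """Fractional extra runtime of the slower version, given that the
    faster version needs `fast_fraction` of the slower one's time."""
    return 1.0 / fast_fraction - 1.0


def misestimate_factor(estimated_rows, actual_rows):
    """How many times off a row estimate is, regardless of direction."""
    high = max(estimated_rows, actual_rows)
    low = min(estimated_rows, actual_rows)
    return high / low


# 7.2.3 runs the query in 80% of the time 7.3 does.
print(f"7.3 takes {extra_time(0.8):.0%} longer than 7.2.3")

# Join step: estimated 19 rows, actual 765 rows.
print(f"row estimate is off by a factor of {misestimate_factor(19, 765):.1f}")
```

Note that "7.2.3 needs 80% of the time" and "7.3 takes 25% longer" describe
the same measurement from opposite directions.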

Cheers,

Neil
--
Neil Conway <neilc(at)samurai(dot)com> || PGP Key ID: DB3C29FC


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: wade <wade(at)wavefire(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Query performance. 7.2.3 Vs. 7.3
Date: 2002-11-29 02:58:14
Message-ID: 19117.1038538694@sss.pgh.pa.us

Neil Conway <neilc(at)samurai(dot)com> writes:
> I was looking at this a bit in IRC, and I was more concerned by the fact
> that 7.3 was 20% slower than 7.2 on the same hardware, when they both used
> the same query plan (consider the data at the end of the URL above, after
> the execution of 'SET enable_mergejoin = off;').

Hm. Are we sure that both versions were built with the same
optimization level, etc? (My private bet is that Wade's 7.2 didn't
have multibyte or locale support --- but that's a long shot when we
don't know the datatypes of the columns being joined on...)

> Also, is it expected that the cardinality estimates for join steps won't
> be very accurate? (estimated: 19 rows, actual: 765 rows)

Well, it'd be nice to do better --- I was hoping Wade would look into
why the row estimates were off so much.

regards, tom lane