Re: Encouraging multi-table join order

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dan Harris <fbsd(at)drivefaster(dot)net>
Cc: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Encouraging multi-table join order
Date: 2006-04-11 21:08:53
Message-ID: 10770.1144789733@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Dan Harris <fbsd(at)drivefaster(dot)net> writes:
> Tom Lane wrote:
>> What does the pg_stats entry for eventactivity.incidentid
>> contain?

> {P043190299,P051560740,P052581036,P052830218,P053100679,P053190889,P060370845,P042070391,P042690319,P043290117}
> |
> {0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00166667,0.00133333,0.00133333,0.00133333}

> How high should I set this? I read the default is 10, but I'm not sure
> if doubling this would make a difference or if I should be doing a much
> larger number. There's approx 45 million rows in the table, if that matters.

What the stats entry is saying is that the most common entries occur
about 75000 times apiece (0.00166667 * 45e6), which is what's scaring
the planner here ;-). I think those frequencies are artificially high
though. The default statistics sample size is 3000 rows (300 *
statistics target, actually), so those numbers correspond to 5 or 4
rows in the sample, which is probably just random chance.

Try increasing the stats targets for this table to 100, then re-ANALYZE
and see what you get. The most_common_freqs entries might drop as much
as a factor of 10.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Dan Harris 2006-04-11 21:21:48 Re: Encouraging multi-table join order
Previous Message Merlin Moncure 2006-04-11 21:02:00 Re: Stored Procedure Performance