Re: hash join vs nested loop join

From: Huan Ruan <huan(dot)ruan(dot)it(at)gmail(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: hash join vs nested loop join
Date: 2012-12-20 21:16:24
Message-ID: CAD1stZtH9WbvHnKtcRVD_w5OYabXh9k18vUhygvYBe8yozPk6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 21 December 2012 01:06, Kevin Grittner <kgrittn(at)mail(dot)com> wrote:

> Huan Ruan wrote:
> > Kevin Grittner wrote:
>
> >> Frankly, at 12 microseconds per matched pair of rows, I think
> >> you're doing OK.
> >
> > This plan is the good one, I want the indexscan nested loop join and this
> > is only achieved after making all these costing factors change. Before
> > that, it was hash join and was very slow.
> >
> > However, I'm worried about the config changes being too 'extreme', i.e.
> > both sequential I/O and random I/O have the same cost and being only 0.1.
> > So, I was more wondering why I have to make such dramatic changes to
> > convince the optimiser to use NL join instead of hash join. And also, I'm
> > not sure what impact will these changes have on other queries yet. e.g.
> > will a query that's fine with hash join now choose NL join and runs
> slower?
>
> I understand the concern, but PostgreSQL doesn't yet have a knob to
> turn for "cache hit ratio". You essentially need to build that into
> the page costs. Since your cache hit ratio (between shared buffers
> and the OS) is so high, the cost of page access relative to CPU
> costs has declined and there isn't any effective difference between
> sequential and random access. As the level of caching changes, you
> may need to adjust. In one production environment where there was
> significant caching, but far enough from 100% to matter, we tested
> various configurations and found the fastest plans being chosen
> with seq_page_cost = 0.3 and random_page_cost = 0.5. Tune to your
> workload.
>
>
> Thanks Kevin. I think I get some ideas now that I can try on the
production server when we switch.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2012-12-20 22:31:44 Re: Performance on Bulk Insert to Partitioned Table
Previous Message Charles Gomes 2012-12-20 20:08:33 Re: Performance on Bulk Insert to Partitioned Table