Re: PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
Date: 2014-12-11 21:16:47
Message-ID: CA+TgmoZVLCxa4OaRTh3Qa2BZq75=zA+3-awM-pUi6kzBWswt7A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Dec 11, 2014 at 2:51 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> No, it's not rescanned. It's scanned only once (for the batch #0), and
> tuples belonging to the other batches are stored in files. If the number
> of batches needs to be increased (e.g. because of incorrect estimate of
> the inner table), the tuples are moved later.

Yeah, I think I sort of knew that, but I got confused. Thanks for clarifying.

> The idea was that if we could increase the load a bit (e.g. using 2
> tuples per bucket instead of 1), we will still use a single batch in
> some cases (when we miss the work_mem threshold by just a bit). The
> lookups will be slower, but we'll save the I/O.

Yeah. That seems like a valid theory, but your test results so far
seem to indicate that it's not working out like that - which I find
quite surprising, but, I mean, it is what it is, right?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
