Re: bad estimation together with large work_mem generates terrible slow hash joins

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bad estimation together with large work_mem generates terrible slow hash joins
Date: 2014-09-08 20:44:29
Message-ID: CA+TgmobAYGMuqvEPxANphEOLKjXzOKCZU79i1YfChmVmdtEEeg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 5, 2014 at 3:23 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> as Heikki mentioned in his "commitfest status" message, this patch still
> has no reviewers :-( Is there anyone willing to pick up that, together
> with the 'dense allocation' patch (as those two are closely related)?
>
> I'm looking in Robert's direction, as he's the one who came up with the
> dense allocation idea, and also commented on the hasjoin bucket resizing
> patch ...
>
> I'd ask Pavel Stehule, but as he's sitting next to me in the office he's
> not really independent ;-) I'll do some reviews on the other patches
> over the weekend, to balance this of course.

Will any of those patches to be, heh heh, mine?

I am a bit confused by the relationship between the two patches you
posted. The "combined" patch applies, but the other one does not. If
this is a sequence of two patches, it seems like it would be more
helpful to post A and B rather than B and A+B. Perhaps the other
patch is on a different thread but there's not an obvious reference to
said thread here.

In ExecChooseHashTableSize(), if buckets_bytes is independent of
nbatch, could we just compute it before working out dbatch, and just
deduct it from hash_table_bytes? That seems like it'd do the same
thing for less work. Either way, what happens if buckets_bytes is
already bigger than hash_table_bytes?

A few more stylistic issues that I see:

+ if ((hashtable->nbatch == 1) &&
+ if (hashtable->nbuckets_optimal <= (INT_MAX/2))
+ if (size > (HASH_CHUNK_SIZE/8))

While I'm all in favor of using parentheses generously where it's
needed to add clarity, these and similar usages seem over the top to
me.

On a related note, the first of these reads like this if (stuff) { if
(more stuff) { do things } } which makes one wonder why we need two if
statements.

+
+ /* used for dense allocation of tuples (into linked chunks) */
+ HashChunk chunks_batch; /* one list for the whole batch */
+
} HashJoinTableData;

If the original coding didn't have a blank line between the last
structure member and the closing brace, it's probably not right to
make it that way now. There are similar issues at the end of some
functions you modified, and a few other places (like the new code in
ExecChooseHashTableSize and chunk_alloc) where there are extra blank
lines at the starts and ends of blocks.

+char * chunk_alloc(HashJoinTable hashtable, int size) {

Eh... no.

+ /* XXX maybe we should use MAXALIGN(size) here ... */

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-09-08 21:01:26 Re: pg_background (and more parallelism infrastructure patches)
Previous Message Alvaro Herrera 2014-09-08 20:42:14 Re: ALTER TABLESPACE MOVE command tag tweak