Re: Improving executor performance - tidbitmap

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving executor performance - tidbitmap
Date: 2016-07-17 12:32:17
Message-ID: CA+Tgmob0wCGxr4byD08HCP6Km-k-qFtUja0MPDRn1bFw9Ov+UQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 13, 2016 at 11:06 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2016-06-24 16:29:53 -0700, Andres Freund wrote:
>> 4) Various missing micro optimizations have to be performed, for more
>> architectural issues to become visible. E.g. [2] causes such bad
>> slowdowns in hash-agg workloads, that other bottlenecks are hidden.
>
> One such issue is the usage of dynahash.c in tidbitmap.c. In many
> queries, e.g. tpch q7, the bitmapscan is often the bottleneck. Profiling
> shows that this is largely due to dynahash.c being slow. Primary issues
> are: a) two level structure doubling the amount of indirect lookups b)
> indirect function calls c) using separate chaining based conflict
> resolution d) being too general.
>
> I've quickly hacked up an alternative linear addressing hashtable
> implementation. And the improvements are quite remarkable.

Nice!

> I'm wondering whether we can do 'macro based templates' or
> something. I.e. have something like the simplehash in the patch in
> simplehash.h, but the key/value widths, the function names, are all
> determined by macros (oh, this would be easier with C++ templates...).
>
> Does anybody have a better idea?

No.
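
For readers unfamiliar with the idea, here is a minimal sketch of what a
macro-based template could look like in plain C. It is not the code from
the patch: the macro name DECLARE_HASHSET, its parameters, and the
generated function names are all hypothetical, and the table is a
fixed-size linear-probing set with no growth, purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical macro "template" (all names are illustrative, not taken
 * from the patch).  DECLARE_HASHSET(NAME, KEY_T, HASH_FN) expands to a
 * fixed-size, linear-probing hash set specialised for KEY_T:
 *   NAME     prefix for the generated type and functions
 *   KEY_T    key type, compared with ==
 *   HASH_FN  function or macro mapping a KEY_T to uint32_t
 * The table never grows; one slot is always left empty so that probe
 * loops terminate.
 */
#define DECLARE_HASHSET(NAME, KEY_T, HASH_FN)                               \
    typedef struct { KEY_T key; bool used; } NAME##_entry;                  \
    typedef struct { NAME##_entry *slots; uint32_t mask, n; } NAME;         \
                                                                            \
    /* size must be a power of two, so "& mask" can replace "%" */          \
    static inline bool NAME##_init(NAME *h, uint32_t size)                  \
    {                                                                       \
        if (size < 2 || (size & (size - 1)) != 0)                           \
            return false;                                                   \
        h->slots = calloc(size, sizeof(NAME##_entry));                      \
        h->mask = size - 1;                                                 \
        h->n = 0;                                                           \
        return h->slots != NULL;                                            \
    }                                                                       \
                                                                            \
    static inline bool NAME##_contains(const NAME *h, KEY_T key)            \
    {                                                                       \
        uint32_t i = HASH_FN(key) & h->mask;                                \
        assert(h->slots != NULL);                                           \
        for (; h->slots[i].used; i = (i + 1) & h->mask)                     \
            if (h->slots[i].key == key)                                     \
                return true;                                                \
        return false;                                                       \
    }                                                                       \
                                                                            \
    /* true if newly added; false if already present or table is full */   \
    static inline bool NAME##_insert(NAME *h, KEY_T key)                    \
    {                                                                       \
        uint32_t i = HASH_FN(key) & h->mask;                                \
        assert(h->slots != NULL);                                           \
        for (; h->slots[i].used; i = (i + 1) & h->mask)                     \
            if (h->slots[i].key == key)                                     \
                return false;                                               \
        if (h->n >= h->mask)                                                \
            return false;                                                   \
        h->slots[i].key = key;                                              \
        h->slots[i].used = true;                                            \
        h->n++;                                                             \
        return true;                                                        \
    }                                                                       \
                                                                            \
    static inline void NAME##_free(NAME *h)                                 \
    {                                                                       \
        free(h->slots);                                                     \
        h->slots = NULL;                                                    \
    }

/* Example hash functions (multiplicative; chosen only for the sketch). */
static inline uint32_t hash_u32(uint32_t x) { return x * 2654435761u; }
static inline uint32_t hash_u64(uint64_t x)
{
    return (uint32_t) (x ^ (x >> 32)) * 2654435761u;
}

/* Two independent instantiations with different key widths. */
DECLARE_HASHSET(u32set, uint32_t, hash_u32)
DECLARE_HASHSET(u64set, uint64_t, hash_u64)
```

Each instantiation gets its own type and its own directly callable
functions (u32set_insert, u64set_insert, ...), so the compiler can inline
the hash and comparison instead of going through function pointers.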

> The major issue with the simplehash implementation in the patch is
> probably the deletion, which should move cells around rather than
> use tombstones. But that was too complex for a POC ;). Also, it'd
> likely need a proper iterator interface.

Do we ever need to delete from a TIDBitmap? Probably not, but I'm
guessing you have other uses for this in mind.
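
To make "move cells around" concrete, below is a rough sketch of
backward-shift deletion for a linear-probing table, which is one common
way to avoid tombstones. It is an assumption about what is meant, not
code from the patch; the Table type, the tbl_* names, the fixed size of
8, and the identity hash are all made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TBL_SIZE 8u                  /* power of two, illustrative only */
#define TBL_MASK (TBL_SIZE - 1u)

typedef struct {
    uint32_t keys[TBL_SIZE];
    bool     used[TBL_SIZE];
    uint32_t n;
} Table;

/* Identity "hash", assumed only to make collisions easy to follow. */
static uint32_t home_slot(uint32_t key) { return key & TBL_MASK; }

static bool tbl_lookup(const Table *t, uint32_t key)
{
    uint32_t i = home_slot(key);

    /* An empty slot ends the probe sequence: no tombstones to skip. */
    for (; t->used[i]; i = (i + 1) & TBL_MASK)
        if (t->keys[i] == key)
            return true;
    return false;
}

static bool tbl_insert(Table *t, uint32_t key)
{
    uint32_t i = home_slot(key);

    assert(t->n < TBL_SIZE);
    for (; t->used[i]; i = (i + 1) & TBL_MASK)
        if (t->keys[i] == key)
            return false;            /* already present */
    if (t->n + 1 >= TBL_SIZE)
        return false;                /* keep one slot empty */
    t->keys[i] = key;
    t->used[i] = true;
    t->n++;
    return true;
}

static bool tbl_delete(Table *t, uint32_t key)
{
    uint32_t i = home_slot(key);
    uint32_t j;

    while (t->used[i] && t->keys[i] != key)
        i = (i + 1) & TBL_MASK;
    if (!t->used[i])
        return false;                /* not found */

    /*
     * Slot i is now a hole.  Walk the rest of the cluster; an entry at j
     * may fill the hole only if the hole lies on its probe path, i.e. its
     * distance from home is at least the distance from the hole to j.
     */
    for (j = (i + 1) & TBL_MASK; t->used[j]; j = (j + 1) & TBL_MASK)
    {
        uint32_t k = home_slot(t->keys[j]);

        if (((j - k) & TBL_MASK) >= ((j - i) & TBL_MASK))
        {
            t->keys[i] = t->keys[j]; /* shift entry back into the hole */
            i = j;                   /* the hole moves to j */
        }
    }
    t->used[i] = false;
    t->n--;
    return true;
}
```

With this scheme lookups can stop at the first empty slot, at the cost of
a slightly more expensive delete.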

> FWIW, the dynahash usage in nodeAgg.c is a major bottleneck in a number
> of other queries.

Can we use this implementation for that as well, or are we going to
need yet another one?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
