Re: WIP: Faster Expression Processing v4

From: Andres Freund <andres(at)anarazel(dot)de>
To: Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,Heikki Linnakangas <hlinnaka(at)iki(dot)fi>,Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>,PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Faster Expression Processing v4
Date: 2017-03-26 00:23:21
Message-ID: 3CC5B4C5-3ADB-4296-A04B-64F594A4CE4F@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On March 25, 2017 4:56:11 PM PDT, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee> wrote:
>On Sun, Mar 26, 2017 at 12:22 AM, Andres Freund <andres(at)anarazel(dot)de>
>wrote:
>>> At least with current gcc (6.3.1 on Fedora 25) at -O2,
>>> what I see is multiple places jumping to the same indirect jump
>>> instruction :-(. It's not a total disaster: as best I can tell, all
>the
>>> uses of EEO_JUMP remain distinct. But gcc has chosen to implement
>about
>>> 40 of the 71 uses of EEO_NEXT by jumping to the same couple of
>>> instructions that increment the "op" register and then do an
>indirect
>>> jump :-(.
>>
>> Yea, I see some of that too - "usually" when there's more than just
>the
>> jump in common. I think there's some gcc variables that influence
>this
>> (min-crossjump-insns (5), max-goto-duplication-insns (8)). Might be
>> worthwhile experimenting with setting them locally via a pragma or
>such.
>> I think Aants wanted to experiment with that, too.
>
>I haven't had the time to research this properly, but initial tests
>show that with GCC 6.2 adding
>
>#pragma GCC optimize ("no-crossjumping")
>
>fixes merging of the op tail jumps.
>
>Some quick and dirty benchmarking suggests that the benefit for the
>interpreter is about 15% (5% speedup on a workload that spends 1/3 in
>ExecInterpExpr). My idea of prefetching op->resnull/resvalue to local
>vars before the indirect jump is somewhere between a tiny benefit and
>no effect, certainly not worth introducing extra complexity. Clang 3.8
>does the correct thing out of the box and is a couple of percent
>faster than GCC with the pragma.

That's large enough to be worth doing (although I recall you seeing all jumps commonalized). We should probably do this on a per function basis however (either using pragma push option, or function attributes).

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2017-03-26 00:33:33 Re: Re: [COMMITTERS] pgsql: Faster expression evaluation and targetlist projection.
Previous Message Andres Freund 2017-03-26 00:21:18 Re: Re: [COMMITTERS] pgsql: Faster expression evaluation and targetlist projection.