Re: Inlining comparators as a performance optimisation

From: karavelov(at)mail(dot)bg
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inlining comparators as a performance optimisation
Date: 2011-09-21 00:48:16
Message-ID: 4f1fa2fe44a6bfd798dbe5af7a5f1524.mailbg@mail.bg
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

----- Цитат от Peter Geoghegan (peter(at)2ndquadrant(dot)com), на 21.09.2011 в 02:53 -----

> On 20 September 2011 03:51, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Considering that -O2 is our standard optimization level, that
>> observation seems to translate to "this patch will be useless in
>> practice".  I think you had better investigate that aspect in some
>> detail before spending more effort.
>
> I don't think that the fact that that happens is at all significant at
> this early stage, and it never even occurred to me that you'd think
> that it might be. I was simply disclosing a quirk of this POC patch.
> The workaround is probably to use a macro instead. For the benefit of
> those that didn't follow the other threads, the macro-based qsort
> implementation, which I found to perform significantly better than
> regular qsort(), runs like this on my laptop when I built at 02 with
> GCC 4.6 just now:
>
> C stdlib quick-sort time elapsed: 2.092451 seconds
> Inline quick-sort time elapsed: 1.587651 seconds
>
> Does *that* look attractive to you? I've attached source code of the
> program that produced these figures, which has been ported to C from
> C++.
>
> When I #define LARGE_SIZE 100000000, here's what I see:
>
> [peter(at)peter inline_compar_test]$ ./a.out
> C stdlib quick-sort time elapsed: 23.659411 seconds
> Inline quick-sort time elapsed: 18.470611 seconds
>
> Here, sorting with the function pointer/stdlib version takes about
> 1.28 times as long. In the prior test (with the smaller LARGE_SIZE),
> it took about 1.32 times as long. Fairly predictable, linear, and not
> to be sniffed at.
>
> The variance I'm seeing across runs is low - a couple of hundredths of
> a second at most. This is a Fedora 15 " Intel(R) Core(TM) i5-2540M CPU
> @ 2.60GHz" machine. I'm not sure right now why the inline quick-sort
> is less of a win than on my old Fedora 14 desktop (where it was 3.24
> Vs 2.01), but it's still a significant win. Perhaps others can build
> this simple program and tell me what they come up with.
>
Run it here.

Intel(R) Core(TM)2 Duo CPU E8200 @ 2.66GHz
gcc version 4.6.1 (Debian 4.6.1-10)

g++ -O2 qsort-inline-benchmark.c
./a.out
C stdlib quick-sort time elapsed: 1.942686 seconds
Inline quick-sort time elapsed: 1.126508 seconds

With #define LARGE_SIZE 100000000

C stdlib quick-sort time elapsed: 22.158207 seconds
Inline quick-sort time elapsed: 12.861018 seconds

with g++ -O0
C stdlib quick-sort time elapsed: 2.736360 seconds
Inline quick-sort time elapsed: 2.045619 seconds

On server hardware:
Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
gcc version 4.4.5 (Debian 4.4.5-8)

/a.out
C stdlib quick-sort time elapsed: 2.610150 seconds
Inline quick-sort time elapsed: 1.494198 seconds

All -O2 version show 42% speedup with inlined qsort.
-O0 showed 25% speedup.

Best regards

--
Luben Karavelov

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2011-09-21 01:00:47 Re: WIP: Collecting statistics on CSV file data
Previous Message Tom Lane 2011-09-21 00:30:42 Isolation tests still falling over routinely