Re: [PATCH v3] Avoid manual shift-and-test logic in AllocSetFreeIndex

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeremy Kerr <jk(at)ozlabs(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH v3] Avoid manual shift-and-test logic in AllocSetFreeIndex
Date: 2009-07-19 11:22:05
Message-ID: 603c8f070907190422r13c19deaw74b1d9fb95e4602f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 30, 2009 at 3:08 AM, Jeremy Kerr<jk(at)ozlabs(dot)org> wrote:
> Move the shift-and-test login into a separate fls() function, which
> can use __builtin_clz() if it's available.
>
> This requires a new check for __builtin_clz in the configure script.
>
> Results in a ~2% performance increase on PowerPC.

There are some comments on this patch that have been posted to
commitfest.postgresql.org rather than emailed to -hackers. Note to
all: please don't do this. The purpose of commitfest.postgresql.org
is to summarize the mailing list discussion for easy reference, not to
replace the mailing lists.

That having been said, Jeremy, you probably want to take a look at
those comments and I have a few responses to them as well.

Dan Colish, the round-robin reviewer assigned to this patch, added the
following comment:
> Applied and built cleanly. Regress passes. Trying to hunt down ppc box to see if performance enhancement
> can be seen.

Question: are we only doing this because of PowerPC? What is going to
happen on x86 and other architectures? x86 is not the center of the
universe, but at a very minimum we need to make sure that things don't
go backwards on what is a very common platform. Has anyone done any
benchmarking of this?

A related question: the original email for this patch says that it
results in a performance increase of about 2% on PPC. But since it
gives no details on exactly what the test was that improved by 2%,
it's hard to judge the impact of this. If this means that
AllocSetFreeIndex() is 2% faster, I think we should reject this patch
and move on. It's not worth introducing architecture-specific code to
get a performance improvement that probably won't even move the needle
on a macro-benchmark. On the other hand, if the test was something
like a pgbench run, then this is really very impressive. But we don't
know which it is.

Zdenek Kotala added this comment:
> I have several objections:
>
> - inline is forbidden to use in PostgreSQL - you need exception or do it differently
>
> - type mismatch - Size vs unsigned int vs 32. you should use only Size and sizeof(Size)
>
> And general comment:
>
> Do we want to have this kind of code which is optimized for one compiler and platform. See openSSL or
> MySQL, they have many optimizations which work fine on one platform with specific version of compiler and
> specific version of OS. But if you select different version of compiler or different compiler you can get very
> different performance result (e.g. MySQL 30% degradation, OpenSSL RSA 3x faster and so on).
>
> I think in this case, it makes sense to have optimization here, but we should do it like spinlocks are
> implemented and put this code into /port. It should help to implemented special code for SPARC and SUN
> Studio for example.

I don't have any thoughts on this part beyond what I stated above, but
hopefully Jeremy does.

I am going to mark this "Waiting on author" pending a response to all
of the above, though hopefully Dan Colish will continue with his
reviewing efforts in the meantime.

Thanks,

...Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2009-07-19 11:28:35 Re: Review: Revise parallel pg_restore's scheduling heuristic
Previous Message Andres Freund 2009-07-19 10:54:21 Re: machine-readable explain output v2