Re: Cost of XLogInsert CRC calculations

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Cost of XLogInsert CRC calculations
Date: 2005-05-16 16:35:35
Message-ID: 18629.1116261335@sss.pgh.pa.us
Lists: pgsql-hackers

"Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk> writes:
> I didn't post the sources to the list originally as I wasn't sure if the
> topic were of enough interest to warrant a larger email. I've attached the
> two corrected programs as a .tar.gz - crctest.c uses uint32, whereas
> crctest64.c uses uint64.

I did some experimentation and concluded that gcc is screwing up
big-time on optimizing the CRC64 code for 32-bit Intel. It does much
better on every other architecture though.
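
For anyone reading without the attachment, here is a minimal sketch of the kind of table-driven inner loops being compared. It is not the attached crctest.c/crctest64.c: the function names, the choice of the reflected CRC-32 polynomial and the ECMA-182 CRC-64 polynomial, and the all-ones initial and final XOR values are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed polynomials, chosen for illustration only. */
#define CRC32_POLY UINT32_C(0xEDB88320)          /* reflected CRC-32 */
#define CRC64_POLY UINT64_C(0x42F0E1EBA9EA3693)  /* ECMA-182, MSB-first */

static uint32_t crc32_table[256];
static uint64_t crc64_table[256];

/* Build both 256-entry lookup tables; call once before computing a CRC. */
static void crc_init_tables(void)
{
    for (int i = 0; i < 256; i++) {
        uint32_t c32 = (uint32_t) i;
        uint64_t c64 = (uint64_t) i << 56;

        for (int bit = 0; bit < 8; bit++) {
            c32 = (c32 & 1) ? (c32 >> 1) ^ CRC32_POLY : c32 >> 1;
            c64 = (c64 >> 63) ? (c64 << 1) ^ CRC64_POLY : c64 << 1;
        }
        crc32_table[i] = c32;
        crc64_table[i] = c64;
    }
}

/* 32-bit loop: one table lookup, one shift, one XOR per input byte. */
static uint32_t crc32_compute(const unsigned char *p, size_t len)
{
    uint32_t crc = UINT32_C(0xFFFFFFFF);

    assert(len == 0 || p != NULL);
    while (len-- > 0)
        crc = crc32_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return crc ^ UINT32_C(0xFFFFFFFF);
}

/* 64-bit loop: same shape, but every operation is on a uint64_t. */
static uint64_t crc64_compute(const unsigned char *p, size_t len)
{
    uint64_t crc = UINT64_C(0xFFFFFFFFFFFFFFFF);

    assert(len == 0 || p != NULL);
    while (len-- > 0)
        crc = crc64_table[((crc >> 56) ^ *p++) & 0xFF] ^ (crc << 8);
    return crc ^ UINT64_C(0xFFFFFFFFFFFFFFFF);
}
```

On a 32-bit target each uint64_t shift and XOR in the second loop has to be carried out on a pair of 32-bit registers, so the quality of the generated code depends heavily on how well the compiler handles that.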

Here are some numbers with gcc 3.2.3 on an Intel Xeon machine. (I'm
showing the median of three trials in each case, but the numbers were
pretty repeatable. I also tried gcc 4.0.0 on this machine and got
similar numbers.)
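
The median-of-three measurement can be sketched roughly as follows. This is an illustration, not the harness actually used: timing with clock() is an assumption, and dummy_workload() is a hypothetical stand-in for the CRC loop being measured.

```c
#include <assert.h>
#include <time.h>

/* Return the median of three timings. */
static double median_of_three(double a, double b, double c)
{
    double t;

    if (a > b) { t = a; a = b; b = t; }  /* now a <= b */
    if (b > c) { t = b; b = c; c = t; }  /* now c is the largest */
    if (a > b) { t = a; a = b; b = t; }  /* now a <= b <= c */
    return b;
}

/* Hypothetical workload; a real test would run the CRC loop here. */
static void dummy_workload(void)
{
    volatile unsigned long sink = 0;

    for (unsigned long i = 0; i < 100000UL; i++)
        sink += i;
}

/* CPU seconds consumed by a single call of fn(), measured with clock(). */
static double time_once(void (*fn)(void))
{
    clock_t start;

    assert(fn != NULL);
    start = clock();
    fn();
    return (double) (clock() - start) / CLOCKS_PER_SEC;
}

/* Run fn() three times and report the median trial. */
static double median_time(void (*fn)(void))
{
    double t1 = time_once(fn);
    double t2 = time_once(fn);
    double t3 = time_once(fn);

    return median_of_three(t1, t2, t3);
}
```

Taking the median rather than the mean keeps a single trial disturbed by other activity on the machine from skewing the reported figure.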

gcc -O1 crctest.c 0.328571 s
gcc -O2 crctest.c 0.297978 s
gcc -O3 crctest.c 0.306894 s

gcc -O1 crctest64.c 0.358263 s
gcc -O2 crctest64.c 0.773544 s
gcc -O3 crctest64.c 0.770945 s

When -O2 is slower than -O1, you know the compiler is blowing it :-(.
I fooled around with non-default -march settings but didn't see much
change.

Similar tests on a several-year-old Pentium 4 machine, this time with
gcc version 3.4.3:

gcc -O1 -march=pentium4 crctest.c 0.486266 s
gcc -O2 -march=pentium4 crctest.c 0.520237 s
gcc -O3 -march=pentium4 crctest.c 0.520299 s

gcc -O1 -march=pentium4 crctest64.c 0.928107 s
gcc -O2 -march=pentium4 crctest64.c 1.247673 s
gcc -O3 -march=pentium4 crctest64.c 1.654102 s

Here are some comparisons showing that the performance difference is
not inherent:

IA64 (Itanium 2), gcc 3.2.3:

gcc -O1 crctest.c 0.898595 s
gcc -O2 crctest.c 0.599005 s
gcc -O3 crctest.c 0.598824 s

gcc -O1 crctest64.c 0.524257 s
gcc -O2 crctest64.c 0.524168 s
gcc -O3 crctest64.c 0.524140 s

X86_64 (Opteron), gcc 3.2.3:

gcc -O1 crctest.c 0.460000 s
gcc -O2 crctest.c 0.460000 s
gcc -O3 crctest.c 0.460000 s

gcc -O1 crctest64.c 0.410000 s
gcc -O2 crctest64.c 0.410000 s
gcc -O3 crctest64.c 0.410000 s

PPC64 (IBM POWER4+), gcc 3.2.3:

gcc -O1 crctest.c 0.819492 s
gcc -O2 crctest.c 0.819427 s
gcc -O3 crctest.c 0.820616 s

gcc -O1 crctest64.c 0.751639 s
gcc -O2 crctest64.c 0.894250 s
gcc -O3 crctest64.c 0.888959 s

PPC (Mac G4), gcc 3.3:

gcc -O1 crctest.c 0.949094 s
gcc -O2 crctest.c 1.011220 s
gcc -O3 crctest.c 1.013847 s

gcc -O1 crctest64.c 1.314093 s
gcc -O2 crctest64.c 1.015367 s
gcc -O3 crctest64.c 1.011468 s

HPPA, gcc 2.95.3:

gcc -O1 crctest.c 1.796604 s
gcc -O2 crctest.c 1.676023 s
gcc -O3 crctest.c 1.676476 s

gcc -O1 crctest64.c 2.022798 s
gcc -O2 crctest64.c 1.916185 s
gcc -O3 crctest64.c 1.904094 s

Given the lack of impressive advantage to the 64-bit code even on 64-bit
architectures, it might be best to go with the 32-bit code everywhere,
but I also think we have grounds to file a gcc bug report.

Anyone want to try it with non-gcc compilers? I attach a slightly
cleaned-up version of Mark's original (doesn't draw compiler warnings
or errors on what I tried it on).

regards, tom lane

Attachment Content-Type Size
crctest.tar.gz application/octet-stream 6.8 KB
