[RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

Lists: pgsql-hackers
From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-20 20:27:48
Message-ID: 201005202227.49990.andres@anarazel.de
Lists: pgsql-hackers

Hi,

I started to analyze XLogInsert because it was the major bottleneck when
creating some materialized views/cached tables/whatever.
Analyzing it, I could see that the content of the COMP_CRC32 macro was taking
most of the time, which isn't immediately obvious when you profile because a
macro doesn't show up as a separate function.
I first moved it into functions to make it easier to profile. I couldn't
measure any difference between the macro and the function version for COPY,
CTAS and a simple pgbench run on 3 kinds of hardware (Core2, older Xeon, older
Sparc systems).

I looked a bit around for faster implementations of CRC32 and found one in
zlib. After adapting it (pg uses slightly different computation (non-
inverted)) I found that it increases the speed of the CRC32 calculation itself
3 fold.
It does that by not only using one lookup table but four (one for each byte of
a word). Those four calculations are independent and thus are considerably
faster on somewhat recent hardware.
It also reads memory in 4-byte steps instead of 1 byte at a time as the pg
version does (that alone is only about an 8% benefit).
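
To illustrate the four-table idea described above, here is a minimal sketch in
C. It is neither the attached patch nor zlib's code: it assumes the standard
reflected CRC-32 (polynomial 0xEDB88320 with the usual pre- and
post-inversion), whereas the pg computation differs as noted above. The names
crc_tab, crc32_init_tables, crc32_bytewise and crc32_slice4 are made up for
this example.

```c
#include <stddef.h>
#include <stdint.h>

/* crc_tab[0] is the classic byte-wise table; crc_tab[1..3] are the same
 * table advanced by one, two and three additional zero bytes. */
static uint32_t crc_tab[4][256];

/* Build all four tables (assumed polynomial: reflected 0xEDB88320). */
static void crc32_init_tables(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_tab[0][i] = c;
    }
    for (uint32_t i = 0; i < 256; i++)
        for (int t = 1; t < 4; t++)
            crc_tab[t][i] = (crc_tab[t - 1][i] >> 8)
                          ^ crc_tab[0][crc_tab[t - 1][i] & 0xFF];
}

/* One table, one byte per step: every lookup needs the previous result. */
static uint32_t crc32_bytewise(uint32_t crc, const unsigned char *p, size_t len)
{
    crc = ~crc;
    while (len--)
        crc = (crc >> 8) ^ crc_tab[0][(crc ^ *p++) & 0xFF];
    return ~crc;
}

/* Four tables, four bytes per step: the four lookups inside one step do
 * not depend on each other, only on the CRC from the previous step. */
static uint32_t crc32_slice4(uint32_t crc, const unsigned char *p, size_t len)
{
    crc = ~crc;
    while (len >= 4) {
        /* Assemble the word byte by byte so the result does not depend
         * on host endianness or on the alignment of p. */
        uint32_t w = crc ^ ((uint32_t) p[0]
                          | (uint32_t) p[1] << 8
                          | (uint32_t) p[2] << 16
                          | (uint32_t) p[3] << 24);
        crc = crc_tab[3][w & 0xFF]
            ^ crc_tab[2][(w >> 8) & 0xFF]
            ^ crc_tab[1][(w >> 16) & 0xFF]
            ^ crc_tab[0][w >> 24];
        p += 4;
        len -= 4;
    }
    /* Remaining 0-3 bytes fall back to the byte-wise step. */
    while (len--)
        crc = (crc >> 8) ^ crc_tab[0][(crc ^ *p++) & 0xFF];
    return ~crc;
}
```

After one call to crc32_init_tables(), both functions should return the same
value for any buffer; for the nine ASCII bytes "123456789" and an initial crc
of 0 that value is 0xCBF43926, the usual CRC-32 check value.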

I wrote a preliminary patch which includes both the original implementation
and the new one, switchable via a #define.

I tested performance differences in a small number of scenarios:
- CTAS/INSERT ... SELECT (8-30%)
- COPY (3-20%)
- pgbench (no real difference unless directly after a checkpoint)

Setup:

CREATE TABLE blub (ai int, bi int, aibi int);
CREATE TABLE speedtest (ai int, bi int, aibi int);

INSERT ... SELECT:

Statement:
INSERT INTO blub SELECT a.i, b.i, a.i *b.i FROM generate_series(1, 10000)
a(i), generate_series(1, 1000) b(i);

legacy crc:

11526.588
11406.518
11412.182
11430.245

zlib:
9977.394
9945.408
9840.907
9842.875

COPY:
Statement:
('blub' enlarged here 4 times, as otherwise the variance was too large)

COPY blub TO '/tmp/b' BINARY;
...
CHECKPOINT;TRUNCATE speedtest; COPY speedtest FROM '/tmp/b' BINARY;

legacy:
44835.840
44832.876

zlib:
39530.549
39365.109
39295.167

The performance differences are bigger if the table rows are significantly
bigger.
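
As a cross-check of the numbers above, the following small C sketch averages
the listed runs and computes the relative saving. The helper names (mean,
pct_saved, insert_saving, copy_saving) are made up for this example, and the
timings are used in whatever unit they were reported in; only the ratio
matters here.

```c
#include <stddef.h>

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double) n;
}

/* Percentage by which `after` is lower than `before`. */
static double pct_saved(double before, double after)
{
    return 100.0 * (before - after) / before;
}

/* Timings copied verbatim from the runs listed above. */
static const double insert_legacy[] = {11526.588, 11406.518, 11412.182, 11430.245};
static const double insert_zlib[]   = {9977.394, 9945.408, 9840.907, 9842.875};
static const double copy_legacy[]   = {44835.840, 44832.876};
static const double copy_zlib[]     = {39530.549, 39365.109, 39295.167};

/* Saving for the INSERT ... SELECT test. */
static double insert_saving(void)
{
    return pct_saved(mean(insert_legacy, 4), mean(insert_zlib, 4));
}

/* Saving for the COPY test. */
static double copy_saving(void)
{
    return pct_saved(mean(copy_legacy, 2), mean(copy_zlib, 3));
}
```

insert_saving() comes out at about 13.5% and copy_saving() at about 12.1%,
both inside the ranges given in the scenario list.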

Do you think something like that is sensible? If yes, I will make it into a
proper patch and such.

Thanks,

Andres

INSERT ... SELECT profile before patch:

20.22% postgres postgres [.] comp_crc32
5.77% postgres postgres [.] XLogInsert
5.55% postgres postgres [.] LWLockAcquire
5.21% postgres [kernel. [k] copy_user_generic_string
4.64% postgres postgres [.] LWLockRelease
4.39% postgres postgres [.] ReadBuffer_common
2.75% postgres postgres [.] heap_insert
2.22% postgres libc-2.1 [.] memcpy
2.09% postgres postgres [.] UnlockReleaseBuffer
1.85% postgres postgres [.] hash_any
1.77% postgres [kernel. [k] clear_page_c
1.69% postgres postgres [.] hash_search_with_hash_value
1.61% postgres postgres [.] heapgettup_pagemode
1.50% postgres postgres [.] PageAddItem
1.42% postgres postgres [.] MarkBufferDirty
1.28% postgres postgres [.] RelationGetBufferForTuple
1.15% postgres postgres [.] ExecModifyTable
1.06% postgres postgres [.] RelationPutHeapTuple

After:

9.97% postgres postgres [.] comp_crc32
5.95% postgres [kernel. [k] copy_user_generic_string
5.94% postgres postgres [.] LWLockAcquire
5.64% postgres postgres [.] XLogInsert
5.11% postgres postgres [.] LWLockRelease
4.63% postgres postgres [.] ReadBuffer_common
3.45% postgres postgres [.] heap_insert
2.54% postgres libc-2.1 [.] memcpy
2.03% postgres postgres [.] UnlockReleaseBuffer
1.94% postgres postgres [.] hash_search_with_hash_value
1.84% postgres postgres [.] hash_any
1.73% postgres [kernel. [k] clear_page_c
1.68% postgres postgres [.] PageAddItem
1.62% postgres postgres [.] heapgettup_pagemode
1.52% postgres postgres [.] RelationGetBufferForTuple
1.47% postgres postgres [.] MarkBufferDirty
1.30% postgres postgres [.] ExecModifyTable
1.23% postgres postgres [.] RelationPutHeapTuple
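
The two profiles can be related to the timings with a little arithmetic. The
sketch below assumes that all non-CRC work takes the same absolute time
before and after the patch and that the profile percentages are shares of
total run time; both are simplifying assumptions, and the function names are
made up for this example.

```c
/* CRC time after the patch, as a fraction of the OLD total run time.
 * With the non-CRC work fixed at (1 - share_before), the new CRC time x
 * must satisfy x / ((1 - share_before) + x) = share_after. */
static double crc_time_after(double share_before, double share_after)
{
    double other = 1.0 - share_before;
    return share_after * other / (1.0 - share_after);
}

/* How much faster the CRC code appears to run inside the workload. */
static double implied_crc_speedup(double share_before, double share_after)
{
    return share_before / crc_time_after(share_before, share_after);
}

/* Implied reduction of the total run time, in percent. */
static double implied_total_saving_pct(double share_before, double share_after)
{
    return 100.0 * (share_before - crc_time_after(share_before, share_after));
}
```

With the comp_crc32 shares from above (0.2022 before, 0.0997 after) this
gives an implied CRC speedup of about 2.3x and a total saving of about 11.4%,
which is roughly in line with the measured INSERT ... SELECT timings and
somewhat below the threefold figure for the isolated calculation.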

Attachment Content-Type Size
0001-Preliminary-patch-using-an-improved-out-of-line-crc3.patch text/x-patch 12.4 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-20 20:39:26
Message-ID: 20100520203926.GN21875@tamriel.snowman.net
Lists: pgsql-hackers

Andres,

* Andres Freund (andres(at)anarazel(dot)de) wrote:
> Statement:
> INSERT INTO blub SELECT a.i, b.i, a.i *b.i FROM generate_series(1, 10000)
> a(i), generate_series(1, 1000) b(i);
>
> legacy crc:
>
> zlib:

Is this legacy crc using the function-based calls, or the macro? Do you
have statistics for the zlib approach vs unmodified PG?

> Do you think something like that is sensible? If yes, I will make it into a
> proper patch and such.

I think that in general we're typically looking for ways to improve
performance, yes.. :)

Thanks,

Stephen


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-20 20:49:04
Message-ID: 201005202249.04962.andres@anarazel.de
Lists: pgsql-hackers

Hi Stephen,

On Thursday 20 May 2010 22:39:26 Stephen Frost wrote:
> * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > Statement:
> > INSERT INTO blub SELECT a.i, b.i, a.i *b.i FROM generate_series(1, 10000)
> > a(i), generate_series(1, 1000) b(i);
> >
> > legacy crc:
> Is this legacy crc using the function-based calls, or the macro? Do you
> have statistics for the zlib approach vs unmodified PG?
'legacy' is out of line as well. I couldn't find a real performance difference
above noise between out of line (function) and inline (macro). If anything out
of line was a bit faster (instruction cache usage could cause that).

So vanilla<->zlib should be the same as legacy<->zlib

Greetings,
Andres


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-21 03:40:03
Message-ID: AANLkTikqy8DmqU2WNlDRuFpiKOr_BLEAxkJbvKO-sqsE@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 20, 2010 at 4:27 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> I looked a bit around for faster implementations of CRC32 and found one in
> zlib. After adapting it (pg uses slightly different computation (non-
> inverted)) I found that it increases the speed of the CRC32 calculation itself
> 3 fold.

But zlib is not under the PostgreSQL license.

...Robert


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-21 05:11:34
Message-ID: 201005210711.35247.andres@anarazel.de
Lists: pgsql-hackers

On Friday 21 May 2010 05:40:03 Robert Haas wrote:
> On Thu, May 20, 2010 at 4:27 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > I looked a bit around for faster implementations of CRC32 and found one
> > in zlib. After adapting it (pg uses slightly different computation (non-
> > inverted)) I found that it increases the speed of the CRC32 calculation
> > itself 3 fold.
>
> But zlib is not under the PostgreSQL license.
Yes. But:
1. The zlib license shouldn't be a problem in itself - pg_dump already links
to zlib.
2. I planned to ask Mark Adler whether he would support relicensing those bits.
I have read some other discussions where he was supportive of doing such a
thing.
3. Given that the idea was posted publicly on Usenet, it is not hard to
produce an independent implementation.

So I do not see any big problems there... Or am I missing something?

Greetings,

Andres

/* zlib.h -- interface of the 'zlib' general purpose compression library
version 1.2.2, October 3rd, 2004

Copyright (C) 1995-2004 Jean-loup Gailly and Mark Adler

This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any damages
arising from the use of this software.

Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.

Jean-loup Gailly jloup(at)gzip(dot)org
Mark Adler madler(at)alumni(dot)caltech(dot)edu

*/


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 02:33:16
Message-ID: 201005300233.o4U2XGp14143@momjian.us
Lists: pgsql-hackers


Added to TODO:

Consider a faster CRC32 algorithm

* http://archives.postgresql.org/pgsql-hackers/2010-05/msg01112.php


--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 02:56:09
Message-ID: AANLkTilatqmH4FoyP88K6sTBD2Bx42bd7gq1dd4k8rQH@mail.gmail.com
Lists: pgsql-hackers

This sounds familiar. If you search back in the archives around 2004
or so I think you'll find a similar discussion when we replaced the
crc32 implementation with what we have now. We put a fair amount of
effort into searching for faster implementations so if you've found
one 3x faster I'm pretty startled. Are you sure it's faster on all
architectures and not a win sometimes and a loss other times? And are
you sure it's faster in our use case where we're crcing small
sequences of data often and not crcing a large block?


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 03:01:13
Message-ID: AANLkTinhppRHAv6TbS2L4LYsK0bkaYF7G75qhKOajioa@mail.gmail.com
Lists: pgsql-hackers

On Sun, May 30, 2010 at 3:56 AM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> This sounds familiar. If you search back in the archives around 2004
> or so I think you'll find a similar discussion when we replaced the
> crc32 implementation with what we have now.

Fwiw here's the thread (from 2005):

http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/43811
--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 03:54:52
Message-ID: 28755.1275191692@sss.pgh.pa.us
Lists: pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Sun, May 30, 2010 at 3:56 AM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
>> This sounds familiar. If you search back in the archives around 2004
>> or so I think you'll find a similar discussion when we replaced the
>> crc32 implementation with what we have now.

> Fwiw here's the thread (from 2005):
> http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/43811

I read through that thread and couldn't find much discussion of
alternative CRC implementations --- we spent all our time on arguing
about whether we needed 64-bit CRC or not.

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 09:56:16
Message-ID: 201005301156.16956.andres@anarazel.de
Lists: pgsql-hackers

On Sunday 30 May 2010 04:56:09 Greg Stark wrote:
> This sounds familiar. If you search back in the archives around 2004
> or so I think you'll find a similar discussion when we replaced the
> crc32 implementation with what we have now. We put a fair amount of
> effort into searching for faster implementations so if you've found
> one 3x faster I'm pretty startled.
None of those thought of computing more than one byte at the same time.
Most if not all current architectures are more or less superscalar (explicitly
by the compiler or implicitly by somewhat intelligent silicon) - the current
algorithm has ordering restrictions that prevent any benefit from that.
Basically it needs the CRC of the last byte for the next one - the zlib/my
version computes 4 bytes independently and then squashes them together, which
results in much better overall usage of the CPU.

> Are you sure it's faster on all
> architectures and not a win sometimes and a loss other times? And are
> you sure it's faster in our use case where we're crcing small
> sequences of data often and not crcing a large block?
I tried on several and it was never a loss at 16+ bytes, never worse at 8, and
most of the time equal if not better at 4. Sizes of 1-4 are somewhat slower as
they use the same algorithm as the old version but have an additional jump.
That's a difference of about 3-4 cycles.

I will try to implement an updated patch sometime these days.

Andres


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 16:29:31
Message-ID: AANLkTimbsXVc9SUXl6YtbCtHe-ck2sJZmEY4b4aAjjWv@mail.gmail.com
Lists: pgsql-hackers

On Sun, May 30, 2010 at 4:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I read through that thread and couldn't find much discussion of
> alternative CRC implementations --- we spent all our time on arguing
> about whether we needed 64-bit CRC or not.

Alright, how about this thread?

http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/71741

This actually sounds like precisely the same algorithm. Perhaps this
implementation is much better but your tests on the old one showed a
big difference between smaller and larger data sequences.

--
greg


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 16:43:12
Message-ID: AANLkTimcH8spYd8qhrpG-ROF17iPYByVqNROGLc-fGUB@mail.gmail.com
Lists: pgsql-hackers

On Sun, May 30, 2010 at 5:29 PM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> On Sun, May 30, 2010 at 4:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I read through that thread and couldn't find much discussion of
>> alternative CRC implementations --- we spent all our time on arguing
>> about whether we needed 64-bit CRC or not.
>
> Alright, how about this thread?
>
> http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/71741

Huh, actually apparently this is right about on schedule for
reconsidering this topic:

http://article.gmane.org/gmane.comp.db.postgresql.devel.general/71903

:)

--
greg


From: Andres Freund <andres(at)anarazel(dot)de>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, singh(dot)gurjeet(at)gmail(dot)com
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 19:43:43
Message-ID: 201005302143.43734.andres@anarazel.de
Lists: pgsql-hackers

On Sunday 30 May 2010 18:29:31 Greg Stark wrote:
> On Sun, May 30, 2010 at 4:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > I read through that thread and couldn't find much discussion of
> > alternative CRC implementations --- we spent all our time on arguing
> > about whether we needed 64-bit CRC or not.
>
> Alright, how about this thread?
>
> http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/71741
>
> This actually sounds like precisely the same algorithm. Perhaps this
> implementation is much better but your tests on the old one showed a
> big difference between smaller and larger data sequences.
I haven't yet had a chance to read the Intel paper (I am on the train, the
latency is 30s+ and the original link is dead), but I got the sf.net
implementation.

Seeing it I think I might know the reason why it wasn't as much faster as
promised - it introduces ordering constraints by avoiding shifts by using
term2. Not sure though.

Anybody got the implementation by Gurjeet? I couldn't find it online (within
the constraints of the connection).

Greetings,

Andres


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-05-30 20:48:37
Message-ID: 201005302248.37614.andres@anarazel.de
Lists: pgsql-hackers

On Sunday 30 May 2010 18:43:12 Greg Stark wrote:
> On Sun, May 30, 2010 at 5:29 PM, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> > On Sun, May 30, 2010 at 4:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> I read through that thread and couldn't find much discussion of
> >> alternative CRC implementations --- we spent all our time on arguing
> >> about whether we needed 64-bit CRC or not.
> >
> > Alright, how about this thread?
> >
> > http://thread.gmane.org/gmane.comp.db.postgresql.devel.general/71741
>
> Huh, actually apparently this is right about on schedule for
> reconsidering this topic:
>
> http://article.gmane.org/gmane.comp.db.postgresql.devel.general/71903
Oh, and the first zlib version sporting the 4 separate shifted tables approach
was 1.2.0 (9 March 2003) ;-)

Andres


From: "Pierre C" <lists(at)peufeu(dot)com>
To: "Andres Freund" <andres(at)anarazel(dot)de>, "Greg Stark" <gsstark(at)mit(dot)edu>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, singh(dot)gurjeet(at)gmail(dot)com
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-06-07 10:37:13
Message-ID: op.vdxegbnreorkce@apollo13
Lists: pgsql-hackers

> On Sunday 30 May 2010 18:29:31 Greg Stark wrote:
>> On Sun, May 30, 2010 at 4:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > I read through that thread and couldn't find much discussion of
>> > alternative CRC implementations --- we spent all our time on arguing
>> > about whether we needed 64-bit CRC or not.

SSE4.2 has a hardware CRC32 instruction, this might be interesting to
use...


From: Andres Freund <andres(at)anarazel(dot)de>
To: "Pierre C" <lists(at)peufeu(dot)com>
Cc: "Greg Stark" <gsstark(at)mit(dot)edu>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, singh(dot)gurjeet(at)gmail(dot)com
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-06-07 10:45:58
Message-ID: 201006071245.58506.andres@anarazel.de
Lists: pgsql-hackers

On Monday 07 June 2010 12:37:13 Pierre C wrote:
> > On Sunday 30 May 2010 18:29:31 Greg Stark wrote:
> >> On Sun, May 30, 2010 at 4:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> > I read through that thread and couldn't find much discussion of
> >> > alternative CRC implementations --- we spent all our time on arguing
> >> > about whether we needed 64-bit CRC or not.
>
> SSE4.2 has a hardware CRC32 instruction, this might be interesting to
> use...
Different polynomial, unfortunately...

Andres


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "Pierre C" <lists(at)peufeu(dot)com>, "Greg Stark" <gsstark(at)mit(dot)edu>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, singh(dot)gurjeet(at)gmail(dot)com
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-06-07 12:10:30
Message-ID: B0CDD01A-6928-4A92-80D3-E7A728480F7B@phlo.org
Lists: pgsql-hackers

On Jun 7, 2010, at 12:45 , Andres Freund wrote:
> On Monday 07 June 2010 12:37:13 Pierre C wrote:
>>> On Sunday 30 May 2010 18:29:31 Greg Stark wrote:
>>>> On Sun, May 30, 2010 at 4:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>> I read through that thread and couldn't find much discussion of
>>>>> alternative CRC implementations --- we spent all our time on arguing
>>>>> about whether we needed 64-bit CRC or not.
>>
>> SSE4.2 has a hardware CRC32 instruction, this might be interesting to
>> use...
> Different polynomial, unfortunately...

Since only the WAL uses CRC, I guess the polynomial could be changed though. pg_upgrade for example shouldn't care.

RFC3385 compares different checksumming methods for use in iSCSI, and CRC32c (which uses the same polynomial as the SSE4.2 instruction) wins. Here's
a link: http://www.faqs.org/rfcs/rfc3385.html
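
To make the "different polynomial" point concrete, here is a minimal sketch in
C. It assumes the standard reflected polynomial constants 0xEDB88320 (CRC-32
as used by zlib) and 0x82F63B78 (CRC-32C, Castagnoli), each with the usual
pre- and post-inversion. crc_bitwise is a hypothetical helper, not code from
pg or the kernel, and the bit-at-a-time loop is deliberately simple and slow;
it only shows that the two polynomials produce unrelated checksums for the
same input, so values computed with one cannot be verified with the other.

```c
#include <stddef.h>
#include <stdint.h>

#define POLY_CRC32   0xEDB88320u  /* reflected form, as used by zlib */
#define POLY_CRC32C  0x82F63B78u  /* reflected Castagnoli polynomial */

/* Bit-at-a-time reflected CRC over len bytes for the given polynomial. */
static uint32_t crc_bitwise(uint32_t poly_reflected,
                            const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;          /* pre-inversion */
    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc & 1) ? (crc >> 1) ^ poly_reflected : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;            /* post-inversion */
}
```

For the nine ASCII bytes "123456789", crc_bitwise(POLY_CRC32, ...) yields
0xCBF43926 while crc_bitwise(POLY_CRC32C, ...) yields 0xE3069283.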

best regards,
Florian Pflug


From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pierre C <lists(at)peufeu(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, singh(dot)gurjeet(at)gmail(dot)com
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-06-07 19:20:55
Message-ID: 4C0D4697.2030501@gmail.com
Lists: pgsql-hackers

Florian Pflug wrote:
> On Jun 7, 2010, at 12:45 , Andres Freund wrote:
>
>> On Monday 07 June 2010 12:37:13 Pierre C wrote:
>>
>>>> On Sunday 30 May 2010 18:29:31 Greg Stark wrote:
>>>>
>>>>> On Sun, May 30, 2010 at 4:54 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>>
>>>>>> I read through that thread and couldn't find much discussion of
>>>>>> alternative CRC implementations --- we spent all our time on arguing
>>>>>> about whether we needed 64-bit CRC or not.
>>>>>>
>>> SSE4.2 has a hardware CRC32 instruction, this might be interesting to
>>> use...
>>>
>> Different polynomial, unfortunately...
>>
>
> Since only the WAL uses CRC, I guess the polynomial could be changed though. pg_upgrade for example shouldn't care.
>
> RFC3385 compares different checksumming methods for use in iSCSI, and CRC32c (which uses the same polynomial as the SSE4.2 instruction) wins. Here's
> a link: http://www.faqs.org/rfcs/rfc3385.html
>
The Linux kernel also uses it when it's available, see e.g.
http://tomoyo.sourceforge.jp/cgi-bin/lxr/source/arch/x86/crypto/crc32c-intel.c

regards,
Yeb Havinga


From: "Pierre C" <lists(at)peufeu(dot)com>
To: "Yeb Havinga" <yebhavinga(at)gmail(dot)com>, "Florian Pflug" <fgp(at)phlo(dot)org>
Cc: "Andres Freund" <andres(at)anarazel(dot)de>, "Greg Stark" <gsstark(at)mit(dot)edu>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Bruce Momjian" <bruce(at)momjian(dot)us>, pgsql-hackers(at)postgresql(dot)org, singh(dot)gurjeet(at)gmail(dot)com
Subject: Re: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-06-08 10:10:20
Message-ID: op.vdy7vioxeorkce@apollo13
Lists: pgsql-hackers


> The Linux kernel also uses it when it's available, see e.g.
> http://tomoyo.sourceforge.jp/cgi-bin/lxr/source/arch/x86/crypto/crc32c-intel.c

If you guys are interested I have a Core i7 here, could run a little
benchmark.


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org, Greg Stark <gsstark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-10-30 09:05:17
Message-ID: 201010301105.17697.andres@anarazel.de
Lists: pgsql-hackers

Hi,

This thread died after I didn't implement a new version and after some
potential license problems came up.

I still think it's worthwhile (and I used it in production for some time), so I
would like to implement a version fit for the next commitfest.

The code I started out from is under the zlib license - which is to my
knowledge compatible with PG's license. What's the position of HACKERS there?
There already is some separately licensed code around and we're already linking
to zlib-licensed code...

For simplicity I asked Mark Adler (the original copyright owner) if he would
be willing to relicense - he is not.

For anybody not hoarding all old mail like me, here is a link to the archives
about my old patch:

http://archives.postgresql.org/message-id/201005202227.49990.andres@anarazel.de

Andres


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Greg Stark <gsstark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-10-31 19:58:58
Message-ID: AANLkTi=sOV_Y+7GxorMLNjD66u4=pY5f5L_rT5iAjHBJ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Oct 30, 2010 at 5:05 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> This thread died after I didn't implement a new version and after some
> potential license problems came up.
>
> I still think it's worthwhile (and I used it in production for some time), so I
> would like to implement a version fit for the next commitfest.
>
> The code I started out from is under the zlib license - which is to my
> knowledge compatible with PG's license. What's the position of HACKERS there?
> There already is some separately licensed code around and we're already linking
> to zlib-licensed code...
>
> For simplicity I asked Mark Adler (the original copyright owner) if he would
> be willing to relicense - he is not.
>
> For anybody not hoarding all old mail like me, here is a link to the archives
> about my old patch:
>
> http://archives.postgresql.org/message-id/201005202227.49990.andres@anarazel.de

IANAL, but the license doesn't appear incompatible to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Marc Cousin <cousinmarc(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up
Date: 2010-11-03 08:03:21
Message-ID: 201011030903.21524.cousinmarc@gmail.com
Lists: pgsql-hackers

On Saturday 30 October 2010 11:05:17, Andres Freund wrote:
> Hi,
>
> This thread died after I didn't implement a new version and after some
> potential license problems came up.
>
> I still think it's worthwhile (and I used it in production for some time), so
> I would like to implement a version fit for the next commitfest.
>
> The code I started out from is under the zlib license - which is to
> my knowledge compatible with PG's license. What's the position of HACKERS
> there? There already is some separately licensed code around and we're
> already linking to zlib-licensed code...
>
> For simplicity I asked Mark Adler (the original copyright owner) if he
> would be willing to relicense - he is not.
>
> For anybody not hoarding all old mail like me, here is a link to the archives
> about my old patch:
>
> http://archives.postgresql.org/message-id/201005202227.49990.andres@anarazel.de
>
>
> Andres

I forgot to report this a few months ago:

I had a very intensive COPY load, and this patch helped. The context was a
server that was CPU-bound while loading data (8 parallel COPYs into the same
unindexed table). This patch gave me a 10% boost in load time. I don't
have the figures right now, but I could run this test again if that would
help. At the time I just tried it out of curiosity, and the load time was
sufficient without it, so I didn't spend more time on it.