Re: Optimizing pglz compressor

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Optimizing pglz compressor
Date: 2013-03-01 15:28:20
Message-ID: 5130C914.8080106@vmware.com
Lists: pgsql-hackers

I spotted some low-hanging fruit in the pglz compression routine. It
uses a hash table to keep track of string prefixes it has seen:

#define PGLZ_HISTORY_LISTS 8192 /* must be power of 2 */
#define PGLZ_HISTORY_SIZE 4096

/* ----------
* Statically allocated work arrays for history
* ----------
*/
static PGLZ_HistEntry *hist_start[PGLZ_HISTORY_LISTS];
static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE];

The hist_start array has to be zeroed at every invocation of
pglz_compress(). That's relatively expensive when compressing small
values; on a 64-bit machine, 8192 * 8 = 64 kB.

The pointers in hist_start point to entries in hist_entries. If we
replace the pointers with int16 indexes into hist_entries, we can shrink
the hist_start array to 8192 * 2 = 16 kB.

Also, in PGLZ_HistEntry:

typedef struct PGLZ_HistEntry
{
struct PGLZ_HistEntry *next; /* links for my hash key's list */
struct PGLZ_HistEntry *prev;
int hindex; /* my current hash key */
const char *pos; /* my input position */
} PGLZ_HistEntry;

we can replace next and prev with indexes into the hist_entries array,
halving the size of that struct, and thus shrinking the hist_entries
array from 32*4096 = 128 kB to 64 kB. I'm not sure how significant that
is - the hist_entries array doesn't need to be zeroed out like
hist_start does - but it might buy us something in CPU cache efficiency.
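
For illustration, the combined change might look something like the
following sketch. The names and layout here are assumptions, not the
actual patch; it also reserves entry 0 as the "invalid" entry, a
convention that comes up again later in this thread:

#include <string.h>

typedef short int16;            /* stands in for PostgreSQL's int16 */

#define PGLZ_HISTORY_LISTS 8192 /* must be power of 2 */
#define PGLZ_HISTORY_SIZE 4096
#define INVALID_ENTRY 0         /* entry 0 is reserved to mean "none" */

typedef struct PGLZ_HistEntry
{
    int16       next;           /* index of next entry in my hash key's list */
    int16       prev;           /* index of previous entry, or INVALID_ENTRY */
    int         hindex;         /* my current hash key */
    const char *pos;            /* my input position */
} PGLZ_HistEntry;               /* 16 bytes instead of 32 on a 64-bit machine */

/* 8192 * 2 = 16 kB of indexes instead of 64 kB of pointers */
static int16 hist_start[PGLZ_HISTORY_LISTS];
static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE];

/* called at the start of pglz_compress(); zeroes 16 kB rather than 64 kB */
static void
pglz_hist_reset(void)
{
    memset(hist_start, 0, sizeof(hist_start));
}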

I ran some performance tests with this:

testname | unpatched | patched
-------------------+-----------+---------
5k text | 1.55933 | 1.57337
512b random | 6.70911 | 5.41420
100k of same byte | 1.92319 | 1.94880
2k random | 4.29494 | 3.88782
100k random | 1.10400 | 1.07982
512b text | 9.26736 | 7.55828
(6 rows)

The datum used in the "5k text" test was the first 5k from the
"doc/src/sgml/func.sgml" file in the source tree, and "512b text" was
the first 512 bytes from the same file. The data in the random tests was
completely random, and in the "100k of same byte" test, a single byte
was repeated 100000 times (i.e. it compresses really well).

Each test inserts rows into a temporary table. The table has 10 bytea
columns, and the same value is inserted in every column. The number of
rows inserted was scaled so that each test inserts roughly 500 MB of
data. Each test was repeated 20 times, and the values listed above are
the minimum runtimes in seconds.

I spotted this while looking at Amit's WAL update delta encoding patch.
My earlier suggestion to just use the pglz compressor for the delta
encoding didn't work too well because the pglz compressor was too
expensive, especially for small values. This patch might help with that..

In summary, this seems like a pretty clear win for short values, and a
wash for long values. Not surprising, as this greatly lowers the startup
cost of pglz_compress(). We're past feature freeze, but how would people
feel about committing this?

- Heikki


Attachment Content-Type Size
pglz-replace-pointers-with-indexes.patch text/x-diff 4.0 KB
compress-tests.sql text/x-sql 1.1 KB
compress-tests-init.sql text/x-sql 11.0 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-01 15:37:43
Message-ID: 20130301153743.GU9507@alvh.no-ip.org
Lists: pgsql-hackers

Heikki Linnakangas wrote:

> I spotted this while looking at Amit's WAL update delta encoding
> patch. My earlier suggestion to just use the pglz compressor for the
> delta encoding didn't work too well because the pglz compressor was
> too expensive, especially for small values. This patch might help
> with that..
>
> In summary, this seems like a pretty clear win for short values, and
> a wash for long values. Not surprising, as this greatly lowers the
> startup cost of pglz_compress(). We're past feature freeze, but how
> would people feel about committing this?

Surely we're not past feature freeze. If we were, we'd have to reject
all remaining patches from the commitfest, which is not what we want to
do at this stage, is it?

My take on this is that if this patch is necessary to get Amit's patch
to a committable state, it's fair game.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-01 15:42:30
Message-ID: 5130CC66.6020802@vmware.com
Lists: pgsql-hackers

On 01.03.2013 17:37, Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
>
>> In summary, this seems like a pretty clear win for short values, and
>> a wash for long values. Not surprising, as this greatly lowers the
>> startup cost of pglz_compress(). We're past feature freeze, but how
>> would people feel about committing this?
>
> Surely we're not past feature freeze. If we were, we'd have to reject
> all remaining patches from the commitfest, which is not what we want to
> do at this stage, is it?

Right, no, that's not what I meant by feature freeze. I don't know what
the correct term here would be, but what I meant is that we're not
considering new features for 9.3 anymore, except those that are in the
commitfest.

> My take on this is that if this patch is necessary to get Amit's patch
> to a committable state, it's fair game.

I don't think it's necessary for that, but let's see..

- Heikki


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-01 15:44:49
Message-ID: 20130301154449.GV16142@tamriel.snowman.net
Lists: pgsql-hackers

* Alvaro Herrera (alvherre(at)2ndquadrant(dot)com) wrote:
> Surely we're not past feature freeze. If we were, we'd have to reject
> all remaining patches from the commitfest, which is not what we want to
> do at this stage, is it?

Actually, I think we're getting very close to exactly that point: we're
not making progress through the CommitFest, and if we don't triage and
close it out, we're never going to get a release out.

Thanks,

Stephen


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-05 13:32:43
Message-ID: 5135F3FB.7040706@vmware.com
Lists: pgsql-hackers

I spent some more time on this, and came up with the attached patch. It
includes the changes I posted earlier, to use indexes instead of
pointers in the hash table. In addition, it makes the hash table size
variable, depending on the length of the input. This further reduces the
startup cost on small inputs. I changed the hash method slightly,
because the old method would not use any bits from the 3rd byte with a
small hash table size, but fortunately that didn't seem to have a
negative impact with larger hash table sizes either.
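
To illustrate the idea, here is a sketch with made-up size thresholds,
reusing the declarations from the earlier sketch (the attached patch is
the authoritative version):

/*
 * Choose the hash table size based on the input length, so that short
 * inputs only pay for zeroing a small hist_start array. Returns the
 * mask for the hash macro below. The thresholds are illustrative.
 */
static int
pglz_hist_init(int slen)
{
    int         hashsz;

    if (slen < 1024)
        hashsz = 1024;
    else if (slen < 4096)
        hashsz = 4096;
    else
        hashsz = 8192;

    memset(hist_start, 0, hashsz * sizeof(int16));
    return hashsz - 1;
}

/*
 * With a variable mask, the hash has to mix bits from all four input
 * bytes into the low bits, so that even a small mask still sees the
 * 3rd and 4th bytes; something along these lines:
 */
#define pglz_hist_idx(_s, _mask) \
    ((((_s)[0] << 6) ^ ((_s)[1] << 4) ^ ((_s)[2] << 2) ^ (_s)[3]) & (_mask))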

I wrote a little C extension to test this. It contains a function that
runs pglz_compress() on a bytea input N times. I ran that with
different kinds of inputs, and got the following results:

unpatched:

testname | auto
-------------------+-----------
5k text | 1425.750
512b text | 1287.413
2k random | -1053.160
100k random | -1056.379
512b random | -1018.474
64b of text | -2547.106
64b random | -3731.700
100k of same byte | 1093.885
100k text | 849.430
(9 rows)

pglz-replace-pointers-with-indexes.patch (the patch I posted earlier):

testname | auto
-------------------+-----------
5k text | 1251.576
512b text | 946.348
2k random | -815.095
100k random | -808.356
512b random | -614.418
64b of text | -744.382
64b random | -1060.758
100k of same byte | 1192.397
100k text | 796.530
(9 rows)

pglz-variable-size-hash-table.patch:

testname | auto
-------------------+----------
5k text | 1276.905
512b text | 823.796
2k random | -844.484
100k random | -848.080
512b random | -615.039
64b of text | -393.448
64b random | -568.685
100k of same byte | 1186.759
100k text | 799.978
(9 rows)

These values are from a single run of the test, but I did repeat the
runs several times to make sure there isn't so much variability that the
results are meaningless. The negative values mean that pglz_compress
gave up on the compression in the test, i.e. they show how long it took
for pglz_compress to realize that it can't compress the input. Compare
the abs() values.
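
For reference, the test function could look roughly like the sketch
below. It assumes the 9.3-era interface, bool pglz_compress(const char
*, int32, PGLZ_Header *, const PGLZ_Strategy *), and the body is made
up; the real extension is in the attached compresstest.tar.gz, and it
evidently returns the elapsed time rather than void:

#include "postgres.h"
#include "fmgr.h"
#include "utils/pg_lzcompress.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(compresstest);

/* compresstest(data bytea, n int4): run pglz_compress() on data n times */
Datum
compresstest(PG_FUNCTION_ARGS)
{
    bytea      *data = PG_GETARG_BYTEA_PP(0);
    int32       n = PG_GETARG_INT32(1);
    int32       slen = VARSIZE_ANY_EXHDR(data);
    char       *src = VARDATA_ANY(data);
    PGLZ_Header *dest;
    int32       i;

    /* PGLZ_MAX_OUTPUT() covers the worst (incompressible) case */
    dest = (PGLZ_Header *) palloc(PGLZ_MAX_OUTPUT(slen));

    /* the default strategy gives up early on incompressible input */
    for (i = 0; i < n; i++)
        (void) pglz_compress(src, slen, dest, PGLZ_strategy_default);

    pfree(dest);
    PG_RETURN_VOID();
}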

With the variable-size hash table, I'm not sure how significant the
earlier patch to use indexes instead of pointers is. But I don't think
it makes things any worse, so it's included in this.

On 01.03.2013 17:42, Heikki Linnakangas wrote:
> On 01.03.2013 17:37, Alvaro Herrera wrote:
>> My take on this is that if this patch is necessary to get Amit's patch
>> to a committable state, it's fair game.
>
> I don't think it's necessary for that, but let's see..

With these tweaks, I was able to make pglz-based delta encoding perform
roughly as well as Amit's patch. So I think that's the approach we
should take, as it's a bit simpler and more versatile. I'll follow up on
that in the other thread.

- Heikki

Attachment Content-Type Size
pglz-variable-size-hash-table.patch text/x-diff 8.3 KB
compresstest.tar.gz application/x-gzip 2.0 KB

From: Joachim Wieland <joe(at)mcknight(dot)de>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-06 14:32:10
Message-ID: CACw0+13=ojScDLObdfCO1gi8kZ1x3_5ybqtcOs4n2MZ75=ph0g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 5, 2013 at 8:32 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> With these tweaks, I was able to make pglz-based delta encoding perform
> roughly as well as Amit's patch.

Out of curiosity, do we know how pglz compares with other algorithms, e.g. lz4 ?


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Joachim Wieland <joe(at)mcknight(dot)de>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-06 15:36:19
Message-ID: CAHyXU0ws0escPwTHejbPrjP1RhUmRqAwOBFqQBSUZTsOJz6Nzg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 6, 2013 at 8:32 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> On Tue, Mar 5, 2013 at 8:32 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> With these tweaks, I was able to make pglz-based delta encoding perform
>> roughly as well as Amit's patch.
>
> Out of curiosity, do we know how pglz compares with other algorithms, e.g. lz4 ?

This has been a subject of much recent discussion. It compares very
poorly, but installing a new compressor tends to be problematic due to
patent concerns (something which I disagree with but it's there). All
that said, Heikki's proposed changes seem to be low risk and quite
fast.

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-06 16:53:29
Message-ID: 20130306165329.GC4970@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-03-06 09:36:19 -0600, Merlin Moncure wrote:
> On Wed, Mar 6, 2013 at 8:32 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> > On Tue, Mar 5, 2013 at 8:32 AM, Heikki Linnakangas
> > <hlinnakangas(at)vmware(dot)com> wrote:
> >> With these tweaks, I was able to make pglz-based delta encoding perform
> >> roughly as well as Amit's patch.
> >
> > Out of curiosity, do we know how pglz compares with other algorithms, e.g. lz4 ?
>
> This has been a subject of much recent discussion. It compares very
> poorly, but installing a new compressor tends to be problematic due to
> patent concerns (something which I disagree with but it's there). All
> that said, Heikki's proposed changes seem to be low risk and quite
> fast.

Imo the licensing part is by far the smaller one. The interesting part
is making a compatible change to the way toast compression works that
supports multiple compression schemes. Afaics nobody has done that work.
After that the choice of to-be-integrated compression schemes needs to
be discussed, sure.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Joachim Wieland <joe(at)mcknight(dot)de>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-06 17:08:10
Message-ID: CAMkU=1ypB4oBxEhVMjs_dSSgHAfdjgMiq0LEQH8TRx1QVEOQfw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 6, 2013 at 8:53 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On 2013-03-06 09:36:19 -0600, Merlin Moncure wrote:
> > On Wed, Mar 6, 2013 at 8:32 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> > > On Tue, Mar 5, 2013 at 8:32 AM, Heikki Linnakangas
> > > <hlinnakangas(at)vmware(dot)com> wrote:
> > >> With these tweaks, I was able to make pglz-based delta encoding perform
> > >> roughly as well as Amit's patch.
> > >
> > > Out of curiosity, do we know how pglz compares with other algorithms,
> > > e.g. lz4 ?
> >
> > This has been a subject of much recent discussion. It compares very
> > poorly, but installing a new compressor tends to be problematic due to
> > patent concerns (something which I disagree with but it's there). All
> > that said, Heikki's proposed changes seem to be low risk and quite
> > fast.
>
> Imo the licensing part is by far the smaller one. The interesting part
> is making a compatible change to the way toast compression works that
> supports multiple compression schemes. Afaics nobody has done that work.
> After that the choice of to-be-integrated compression schemes needs to
> be discussed, sure.
>

Another thing to consider would be some way of recording an exemplar value
for each column, which is used to seed whatever compression algorithm is
used. I think there is often a lot of redundancy that does not appear
within any given value, but does appear when viewing all the values of a
given column. Finding some way to take advantage of that could give a big
improvement in compression ratio.
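
For what it's worth, zlib's preset-dictionary feature already implements
this idea; here is a sketch of the concept using plain zlib (an
illustration only, not a proposal for the pglz API):

#include <string.h>
#include <zlib.h>

/*
 * Compress src after seeding the compressor's window with an exemplar
 * value from the same column, so cross-row redundancy can be exploited
 * even when each individual value is short. Returns the compressed
 * size, or -1 on failure.
 */
static int
compress_with_exemplar(const char *exemplar, unsigned exlen,
                       const char *src, unsigned srclen,
                       char *dst, unsigned dstlen)
{
    z_stream    zs;
    int         result = -1;

    memset(&zs, 0, sizeof(zs));
    if (deflateInit(&zs, Z_DEFAULT_COMPRESSION) != Z_OK)
        return -1;

    /* seed the LZ77 window with the exemplar value */
    deflateSetDictionary(&zs, (const Bytef *) exemplar, exlen);

    zs.next_in = (Bytef *) src;
    zs.avail_in = srclen;
    zs.next_out = (Bytef *) dst;
    zs.avail_out = dstlen;
    if (deflate(&zs, Z_FINISH) == Z_STREAM_END)
        result = (int) zs.total_out;

    deflateEnd(&zs);
    return result;
}

Note that decompression then needs the exact same exemplar bytes (zlib's
inflate() reports Z_NEED_DICT and expects inflateSetDictionary()), which
ties the stored values to the dictionary for their whole lifetime.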

Cheers,

Jeff


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Joachim Wieland <joe(at)mcknight(dot)de>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-06 17:12:26
Message-ID: 20130306171226.GE4970@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-03-06 09:08:10 -0800, Jeff Janes wrote:
> On Wed, Mar 6, 2013 at 8:53 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>
> > On 2013-03-06 09:36:19 -0600, Merlin Moncure wrote:
> > > On Wed, Mar 6, 2013 at 8:32 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> > > > On Tue, Mar 5, 2013 at 8:32 AM, Heikki Linnakangas
> > > > <hlinnakangas(at)vmware(dot)com> wrote:
> > > >> With these tweaks, I was able to make pglz-based delta encoding perform
> > > >> roughly as well as Amit's patch.
> > > >
> > > > Out of curiosity, do we know how pglz compares with other algorithms,
> > > > e.g. lz4 ?
> > >
> > > This has been a subject of much recent discussion. It compares very
> > > poorly, but installing a new compressor tends to be problematic due to
> > > patent concerns (something which I disagree with but it's there). All
> > > that said, Heikki's proposed changes seem to be low risk and quite
> > > fast.
> >
> > Imo the licensing part is by far the smaller one. The interesting part
> > is making a compatible change to the way toast compression works that
> > supports multiple compression schemes. Afaics nobody has done that work.
> > After that the choice of to-be-integrated compression schemes needs to
> > be discussed, sure.
> >
>
>
> Another thing to consider would be some way of recording an exemplar value
> for each column, which is used to seed whatever compression algorithm is
> used. I think there is often a lot of redundancy that does not appear
> within any given value, but does appear when viewing all the values of a
> given column. Finding some way to take advantage of that could give a big
> improvement in compression ratio.

But then that value cannot be changed/removed because existing values
depend on it. So either it's a one-off thing - which means you can only
compute it after the table is filled to some extent - or you need to
have a growing list of such values somewhere (refcounting it would be
hard).
I think the more reasonable route for such a thing would be to try and
get page-level compression working. Which is a pretty hard project, I
admit.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-06 17:31:06
Message-ID: CAHyXU0yEEJP2J=GOi2wSAF5xEScoxyfXYj-9uu4A=KOTXELvTw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 6, 2013 at 10:53 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2013-03-06 09:36:19 -0600, Merlin Moncure wrote:
>> On Wed, Mar 6, 2013 at 8:32 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
>> > On Tue, Mar 5, 2013 at 8:32 AM, Heikki Linnakangas
>> > <hlinnakangas(at)vmware(dot)com> wrote:
>> >> With these tweaks, I was able to make pglz-based delta encoding perform
>> >> roughly as well as Amit's patch.
>> >
>> > Out of curiosity, do we know how pglz compares with other algorithms, e.g. lz4 ?
>>
>> This has been a subject of much recent discussion. It compares very
>> poorly, but installing a new compressor tends to be problematic due to
>> patent concerns (something which I disagree with but it's there). All
>> that said, Heikki's proposed changes seem to be low risk and quite
>> fast.
>
> Imo the licensing part is by far the smaller one. The interesting part
> is making a compatible change to the way toast compression works that
> supports multiple compression schemes. Afaics nobody has done that work.
> After that the choice of to-be-integrated compression schemes needs to
> be discussed, sure.

That wasn't the conversation I remember. lz4 completely smokes pglz
(Heikki's changes notwithstanding) and is BSD licensed, so AFAICS there
is no reason to support multiple compression technologies except for
migration purposes (and even if you did want to, a plausible way to do
that was identified).

...but that's a separate topic for another day. Heikki's proposal
seems like a win either way.

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Joachim Wieland <joe(at)mcknight(dot)de>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-06 17:42:17
Message-ID: 20130306174217.GG4970@alap2.anarazel.de
Lists: pgsql-hackers

On 2013-03-06 11:31:06 -0600, Merlin Moncure wrote:
> On Wed, Mar 6, 2013 at 10:53 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2013-03-06 09:36:19 -0600, Merlin Moncure wrote:
> >> On Wed, Mar 6, 2013 at 8:32 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> >> > On Tue, Mar 5, 2013 at 8:32 AM, Heikki Linnakangas
> >> > <hlinnakangas(at)vmware(dot)com> wrote:
> >> >> With these tweaks, I was able to make pglz-based delta encoding perform
> >> >> roughly as well as Amit's patch.
> >> >
> >> > Out of curiosity, do we know how pglz compares with other algorithms, e.g. lz4 ?
> >>
> >> This has been a subject of much recent discussion. It compares very
> >> poorly, but installing a new compressor tends to be problematic due to
> >> patent concerns (something which I disagree with but it's there). All
> >> that said, Heikki's proposed changes seem to be low risk and quite
> >> fast.
> >
> > Imo the licensing part is by far the smaller one. The interesting part
> > is making a compatible change to the way toast compression works that
> > supports multiple compression schemes. Afaics nobody has done that work.
> > After that the choice of to-be-integrated compression schemes needs to
> > be discussed, sure.
>
> That wasn't the conversation I remember. lz4 completely smokes pglz
> (Heikki's changes notwithstanding) and is BSD licensed, so AFAICS there
> is no reason to support multiple compression technologies except for
> migration purposes (and even if you did want to, a plausible way to do
> that was identified).

Well, we need to permanently support at least two technologies -
otherwise we will break pg_upgrade.
And there is absolutely no reason to believe nothing better than lz4
will come along in the future, so just supporting two seems like a bad
idea. And sure, there are rough ideas on how to support different
compression schemes in toast, but nobody has implemented anything
afaics.
It would be possible to reuse what I proposed for indirect toast tuples
for that purpose, although I am not sure whether I like using
varatt_external's in multiple types, as indicated by
varatt1_be.va_len_1be (renamed to va_type or such) pointing to
different types. That would mean an uncompressible datum could
potentially have multiple representations.

> ...but that's a separate topic for another day. Heikki's proposal
> seems like a win either way.

Yes.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Daniel Farina <daniel(at)heroku(dot)com>
To:
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-03-19 05:27:23
Message-ID: CAAZKuFZCOCHsswQM60ioDO_hk12tA7OG3YcJA8v=4YebMOA-wA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 6, 2013 at 6:32 AM, Joachim Wieland <joe(at)mcknight(dot)de> wrote:
> On Tue, Mar 5, 2013 at 8:32 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> With these tweaks, I was able to make pglz-based delta encoding perform
>> roughly as well as Amit's patch.
>
> Out of curiosity, do we know how pglz compares with other algorithms, e.g. lz4 ?

This one is for the archives, as I thought it surprising: the magnitude
of the performance difference between these algorithms can vary hugely
depending on architecture. Here's a table reproduced from:
http://www.reddit.com/r/programming/comments/1aim6s/lz4_extremely_fast_compression_algorithm/c8y0ew9

"""
testdata/alice29.txt :
ZLIB: [b 1M] bytes 152089 -> 54404 35.8% comp 0.8 MB/s uncomp 8.1 MB/s
LZO: [b 1M] bytes 152089 -> 82721 54.4% comp 14.5 MB/s uncomp 43.0 MB/s
CSNAPPY: [b 1M] bytes 152089 -> 90965 59.8% comp 2.1 MB/s uncomp 4.4 MB/s
SNAPPY: [b 4M] bytes 152089 -> 90965 59.8% comp 1.8 MB/s uncomp 2.8 MB/s
testdata/asyoulik.txt :
ZLIB: [b 1M] bytes 125179 -> 48897 39.1% comp 0.8 MB/s uncomp 7.7 MB/s
LZO: [b 1M] bytes 125179 -> 73224 58.5% comp 15.3 MB/s uncomp 42.4 MB/s
CSNAPPY: [b 1M] bytes 125179 -> 80207 64.1% comp 2.0 MB/s uncomp 4.2 MB/s
SNAPPY: [b 4M] bytes 125179 -> 80207 64.1% comp 1.7 MB/s uncomp 2.7 MB/s

LZO was ~8x faster compressing and ~16x faster decompressing. Only on
uncompressible data was Snappy faster:

testdata/house.jpg :
ZLIB: [b 1M] bytes 126958 -> 126513 99.6% comp 1.2 MB/s uncomp 9.6 MB/s
LZO: [b 1M] bytes 126958 -> 127173 100.2% comp 4.2 MB/s uncomp 74.9 MB/s
CSNAPPY: [b 1M] bytes 126958 -> 126803 99.9% comp 24.6 MB/s uncomp 381.2 MB/s
SNAPPY: [b 4M] bytes 126958 -> 126803 99.9% comp 22.8 MB/s uncomp 354.4 MB/s
"""

So that's one more gotcha to worry about, since I surmise most numbers
are being taken on x86. Apparently this has something to do with
alignment of accesses. Some of it may be fixable by tweaking the
implementation rather than the compression encoding, although I am no
expert in the matter.

--
fdr


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Heikki Linnakangas'" <hlinnakangas(at)vmware(dot)com>, "'Alvaro Herrera'" <alvherre(at)2ndquadrant(dot)com>
Cc: "'PostgreSQL-development'" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-06-19 11:01:13
Message-ID: 008501ce6cdc$51981790$f4c846b0$@kapila@huawei.com
Lists: pgsql-hackers

On Tuesday, March 05, 2013 7:03 PM Heikki Linnakangas wrote:

> I spent some more time on this, and came up with the attached patch. It
> includes the changes I posted earlier, to use indexes instead of
> pointers in the hash table. In addition, it makes the hash table size
> variable, depending on the length of the input. This further reduces
> the startup cost on small inputs. I changed the hash method slightly,
> because the old method would not use any bits from the 3rd byte with a
> small hash table size, but fortunately that didn't seem to have a
> negative impact with larger hash table sizes either.
>
> I wrote a little C extension to test this. It contains a function,
> which runs pglz_compress() on a bytea input, N times. I ran that with
> different kinds of inputs, and got the following results:
>

The purpose of this patch is to improve LZ compression speed by reducing the
startup cost of initializing the hist_start array.
To achieve this, it uses a variable-size hash table and reduces the size of
each history entry by replacing pointers with int16 indexes.
It achieves its purpose for small data, but for large data performance is
degraded in some cases; refer to the second set of performance data.

1. The patch compiles cleanly and all regression tests pass.
2. The change in the pglz_hist_idx macro is not very clear to me, nor is it
mentioned in the comments.
3. Why is the first entry kept as INVALID_ENTRY? It appears to me that it
is for cleaner checks in the code.

Performance Data
------------------
I have used pglz-variable-size-hash-table.patch to collect all performance
data:

Results of compress-tests.sql -- inserting large data into tmp table
------------------------------

testname | unpatched | patched
-------------------+----------+------------
5k text | 4.8932 | 4.9014
512b text | 22.6209 | 18.6849
256b text | 13.9784 | 8.9342
1K text | 20.4969 | 20.5988
2k random | 10.5826 | 10.0758
100k random | 3.9056 | 3.8200
500k random | 22.4078 | 22.1971
512b random | 15.7788 | 12.9575
256b random | 18.9213 | 12.5209
1K random | 11.3933 | 9.8853
100k of same byte | 5.5877 | 5.5960
500k of same byte | 2.6853 | 2.6500

Observation
-------------
1. This clearly shows that the patch improves performance for small data
without any impact on large data.

Performance data for directly calling the pglz_compress function (tests.sql)
---------------------------------------------------------------------------
select testname,
(compresstest(data, nrows, 8192)::numeric / 1000)::numeric(10,3) as
auto
from tests;

Head
testname | auto
-------------------+-----------
5k text | 3511.879
512b text | 1430.990
256b text | 1768.796
1K text | 1390.134
3K text | 4099.304
2k random | -402.916
100k random | -10.311
500k random | -2.019
512b random | -753.317
256b random | -1096.999
1K random | -559.931
10k of same byte | 3548.651
100k of same byte | 36037.280
500k of same byte | 25565.195
(14 rows)

Patch(pglz-variable-size-hash-table.patch)

testname | auto
-------------------+-----------
5k text | 3840.207
512b text | 1088.897
256b text | 982.172
1K text | 1402.488
3K text | 4334.802
2k random | -333.100
100k random | -8.390
500k random | -1.672
512b random | -499.251
256b random | -524.889
1K random | -453.177
10k of same byte | 4754.367
100k of same byte | 50427.735
500k of same byte | 36163.265
(14 rows)

Observations
--------------
1. For small data, performance is always good with the patch.
2. For random small/large data, performance is good.
3. For medium and large text and same-byte data (3K, 5K text; 10K, 100K,
500K same byte), performance is degraded.

I have used the attached compress-tests-init.sql to generate data.
I am really not sure why the data you reported and what I measured differ
in a few cases. I tried multiple times, but the result is the same.
Kindly let me know if you think I am doing something wrong.

Note - To generate the data in randomhex, I used COPY FROM a file. I used
the same command you provided to generate the file.

With Regards,
Amit Kapila.

Attachment Content-Type Size
compress-tests-init.sql application/octet-stream 12.2 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: 'Alvaro Herrera' <alvherre(at)2ndquadrant(dot)com>, 'PostgreSQL-development' <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-06-25 20:45:09
Message-ID: 51CA0155.9030903@vmware.com
Lists: pgsql-hackers

On 19.06.2013 14:01, Amit Kapila wrote:
> Observations
> --------------
> 1. For small data, performance is always good with the patch.
> 2. For random small/large data, performance is good.
> 3. For medium and large text and same-byte data (3K, 5K text; 10K, 100K,
> 500K same byte), performance is degraded.

Wow, that's strange. What platform and CPU did you test on? Are you sure
you used the same compiler flags with and without the patch?

Can you also try the attached patch, please? It's the same as before,
but in this version, I didn't replace the prev and next pointers in
PGLZ_HistEntry struct with int16s. That avoids some table lookups, at
the expense of using more memory. It's closer to what we have without
the patch, so maybe that helps on your system.

- Heikki

Attachment Content-Type Size
pglz-variable-size-hash-table-2.patch text/x-diff 7.7 KB

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Heikki Linnakangas'" <hlinnakangas(at)vmware(dot)com>
Cc: "'Alvaro Herrera'" <alvherre(at)2ndquadrant(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-06-26 13:37:50
Message-ID: 01fa01ce7272$5b359700$11a0c500$@kapila@huawei.com
Lists: pgsql-hackers

On Wednesday, June 26, 2013 2:15 AM Heikki Linnakangas wrote:
> On 19.06.2013 14:01, Amit Kapila wrote:
> > Observations
> > --------------
> > 1. For small data, performance is always good with the patch.
> > 2. For random small/large data, performance is good.
> > 3. For medium and large text and same-byte data (3K, 5K text;
> > 10K, 100K, 500K same byte), performance is degraded.
>
> Wow, that's strange. What platform and CPU did you test on?

CPU - 4 core
RAM - 24GB
OS - SUSE 11 SP2
Kernel version - 3.0.13

> Are you
> sure you used the same compiler flags with and without the patch?

Yes.

> Can you also try the attached patch, please? It's the same as before,
> but in this version, I didn't replace the prev and next pointers in
> PGLZ_HistEntry struct with int16s. That avoids some table lookups, at
> the expense of using more memory. It's closer to what we have without
> the patch, so maybe that helps on your system.

Yes it helped a lot on my system.

Head:
testname | auto
-------------------+-----------
5k text | 3499.888
512b text | 1425.106
256b text | 1769.126
1K text | 1378.151
3K text | 4081.254
2k random | -410.928
100k random | -10.235
500k random | -2.094
512b random | -770.665
256b random | -1120.173
1K random | -570.351
10k of same byte | 3602.610
100k of same byte | 36187.863
500k of same byte | 26055.472

After your Patch (pglz-variable-size-hash-table-2.patch)

testname | auto
-------------------+-----------
5k text | 3524.306
512b text | 954.962
256b text | 832.527
1K text | 1273.970
3K text | 3963.329
2k random | -300.516
100k random | -7.538
500k random | -1.525
512b random | -439.726
256b random | -440.154
1K random | -391.070
10k of same byte | 3570.921
100k of same byte | 37498.502
500k of same byte | 26904.426

There was a minor problem in your patch: in one of the experiments it
crashed. The fix is to not access the 0th history entry in the function
pglz_find_match(); a modified patch is attached.
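
For the archives, here is the shape of the fix as I understand it,
reusing the declarations from upthread (illustrative only; the attached
patch is the real change). In this patch version next/prev are pointers
again, entry 0 of hist_entries is reserved as the invalid entry, and the
match loop must stop when a link leads back to it:

#define INVALID_ENTRY     0
#define INVALID_ENTRY_PTR (&hist_entries[INVALID_ENTRY])

    /* inside pglz_find_match(): follow the list for this hash slot */
    PGLZ_HistEntry *hent = &hist_entries[hist_start[hindex]];

    while (hent != INVALID_ENTRY_PTR)   /* stop at entry 0, not at NULL */
    {
        /* ... compare input against hent->pos, remember best match ... */
        hent = hent->next;
    }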

After the fix, the readings are almost the same:

testname | auto
-------------------+-----------
5k text | 3347.961
512b text | 938.442
256b text | 834.496
1K text | 1243.618
3K text | 3790.835
2k random | -306.470
100k random | -7.589
500k random | -1.517
512b random | -442.649
256b random | -438.781
1K random | -392.106
10k of same byte | 3565.449
100k of same byte | 37355.595
500k of same byte | 26776.076

I guess some difference might be due to the different way of accessing
entries in pglz_hist_add().

With Regards,
Amit Kapila.

Attachment Content-Type Size
pglz-variable-size-hash-table-3.patch application/octet-stream 7.8 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: 'Alvaro Herrera' <alvherre(at)2ndquadrant(dot)com>, 'PostgreSQL-development' <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-07-01 08:05:37
Message-ID: 51D13851.9000100@vmware.com
Lists: pgsql-hackers

On 26.06.2013 16:37, Amit Kapila wrote:
> On Wednesday, June 26, 2013 2:15 AM Heikki Linnakangas wrote:
>> Can you also try the attached patch, please? It's the same as before,
>> but in this version, I didn't replace the prev and next pointers in
>> PGLZ_HistEntry struct with int16s. That avoids some table lookups, at
>> the expense of using more memory. It's closer to what we have without
>> the patch, so maybe that helps on your system.
>
> Yes it helped a lot on my system.

Ok, good. Strange, I did not expect such a big difference.

> There was a minor problem in your patch: in one of the experiments it
> crashed. The fix is to not access the 0th history entry in the function
> pglz_find_match(); a modified patch is attached.

Thanks, good catch! I thought that a pointer to the 0th entry would
never make it into the prev/next fields, but it does. In fact, we never
store a NULL there anymore; a pointer to the 0th entry is now always
used to mean 'invalid'. I adjusted the patch to remove the NULL check,
and only check for the 0th entry.

Committed.

- Heikki


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, 'Alvaro Herrera' <alvherre(at)2ndquadrant(dot)com>, 'PostgreSQL-development' <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-07-01 16:51:38
Message-ID: 20130701165138.GC16348@momjian.us
Lists: pgsql-hackers

On Mon, Jul 1, 2013 at 11:05:37AM +0300, Heikki Linnakangas wrote:
> On 26.06.2013 16:37, Amit Kapila wrote:
> >On Wednesday, June 26, 2013 2:15 AM Heikki Linnakangas wrote:
> >>Can you also try the attached patch, please? It's the same as before,
> >>but in this version, I didn't replace the prev and next pointers in
> >>PGLZ_HistEntry struct with int16s. That avoids some table lookups, at
> >>the expense of using more memory. It's closer to what we have without
> >>the patch, so maybe that helps on your system.
> >
> >Yes it helped a lot on my system.
>
> Ok, good. Strange, I did not expect such a big difference.
>
> >There was a minor problem in your patch: in one of the experiments it crashed.
> >The fix is to not access the 0th history entry in pglz_find_match();
> >a modified patch is attached.
>
> Thanks, good catch! I thought that a pointer to the 0th entry would
> never make it into the prev/next fields, but it does. In fact, we
> never store a NULL there anymore; a pointer to the 0th entry is now
> always used to mean 'invalid'. I adjusted the patch to remove the
> NULL check, and only check for the 0th entry.
>
> Committed.

Does someone have new tests comparing this patch with the new
compression methods being considered?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Heikki Linnakangas'" <hlinnakangas(at)vmware(dot)com>
Cc: "'Alvaro Herrera'" <alvherre(at)2ndquadrant(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Optimizing pglz compressor
Date: 2013-07-02 06:41:35
Message-ID: 01d401ce76ef$32cefb90$986cf2b0$@kapila@huawei.com
Lists: pgsql-hackers

On Monday, July 01, 2013 1:36 PM Heikki Linnakangas wrote:
> On 26.06.2013 16:37, Amit Kapila wrote:
> > On Wednesday, June 26, 2013 2:15 AM Heikki Linnakangas wrote:
> >> Can you also try the attached patch, please? It's the same as before,
> >> but in this version, I didn't replace the prev and next pointers in
> >> PGLZ_HistEntry struct with int16s. That avoids some table lookups, at
> >> the expense of using more memory. It's closer to what we have without
> >> the patch, so maybe that helps on your system.
> >
> > Yes it helped a lot on my system.
>
> Ok, good. Strange, I did not expect such a big difference.
>
> > There was a minor problem in your patch: in one of the experiments it
> > crashed. The fix is to not access the 0th history entry in the function
> > pglz_find_match(); a modified patch is attached.
>
> Thanks, good catch! I thought that a pointer to the 0th entry would
> never make it into the prev/next fields, but it does. In fact, we never
> store a NULL there anymore, a pointer to the 0th entry is now always
> used to mean 'invalid'. I adjusted the patch to remove the NULL check,
> and only check for the 0th entry.
>
> Committed.

Thanks, will update the WAL Optimization patch based on this and post the
new patch and data on the corresponding thread.

With Regards,
Amit Kapila.