Re: jsonb format is pessimal for toast compression

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, "Jan Wieck" <jan(at)wi3ck(dot)info>
Subject: Re: jsonb format is pessimal for toast compression
Date: 2014-09-12 20:30:51
Message-ID: 541357FB.4080507@vmware.com
Lists: pgsql-hackers

On 09/12/2014 08:52 PM, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Sep 12, 2014 at 1:11 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> It's certainly possible that there is a test case for which Heikki's
>>> approach is superior, but if so we haven't seen it. And since its
>>> approach is also more complicated, sticking with the simpler
>>> lengths-only approach seems like the way to go.
>
>> Huh, OK. I'm slightly surprised, but that's why we benchmark these things.
>
> The argument for Heikki's patch was never that it would offer better
> performance; it's obvious (at least to me) that it won't.

Performance was one argument for sure. It's not hard to come up with a
case where the all-lengths approach is much slower: take a huge array
with, say, a million elements, and fetch the last element in a tight loop.
And do that in a PL/pgSQL function without storing the datum to disk, so
that it doesn't get toasted. That's not a very common thing to do in real
life, although something like it might come up if you do a lot of JSON
processing in PL/pgSQL. Storing offsets makes that case faster, because
an element's position can be read directly instead of being computed by
adding up the lengths of all the elements before it.

IOW, something like this:

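do $$
declare
  ja jsonb;
  i int4;
begin
  -- build a 100000-element array; it is only held in a variable, never
  -- stored to disk, so it doesn't get toasted
  select json_agg(g) into ja from generate_series(1, 100000) g;
  -- fetch an element near the end of the array, over and over
  for i in 1..100000 loop
    perform ja ->> 90000;
  end loop;
end;
$$;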

should perform much better with current git master or "my patch" than
with the all-lengths patch.
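
To illustrate where the difference comes from, here is a rough sketch in
Python. It is a simplified model, not the actual jsonb on-disk format or
any PostgreSQL code: the function names and the flat byte-string layout
are made up for the example. It only shows that finding element i needs
a sum over i lengths in a lengths-only layout, but just two array reads
when end offsets are stored.

```python
# Simplified model only: NOT the real jsonb format or PostgreSQL code.
# All names here are hypothetical and exist only for illustration.
from itertools import accumulate


def build_lengths(elements):
    """Lengths-only layout: keep just the byte length of each element."""
    return [len(e) for e in elements]


def build_end_offsets(elements):
    """Offsets layout: keep the cumulative end offset of each element."""
    return list(accumulate(len(e) for e in elements))


def locate_with_lengths(lengths, i):
    """Return (start, end) of element i by summing the i preceding lengths."""
    start = sum(lengths[:i])  # work grows linearly with i
    return start, start + lengths[i]


def locate_with_offsets(end_offsets, i):
    """Return (start, end) of element i using two direct array reads."""
    start = end_offsets[i - 1] if i > 0 else 0  # constant work
    return start, end_offsets[i]


# Same shape as the test case above: the numbers 1..100000, with the
# element at zero-based index 90000 being looked up.
elements = [str(n).encode() for n in range(1, 100001)]
data = b"".join(elements)
lengths = build_lengths(elements)
end_offsets = build_end_offsets(elements)

i = 90000
s1, e1 = locate_with_lengths(lengths, i)
s2, e2 = locate_with_offsets(end_offsets, i)

# Both layouts find the same bytes; only the amount of work differs.
assert (s1, e1) == (s2, e2)
print(data[s1:e1])
```

Both lookups return the same slice. The lengths-only version just has to
walk over everything in front of the element first, and the loop in the
test case repeats that walk on every iteration.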

I'm OK with going for the all-lengths approach anyway; it's simpler, and
working with huge arrays is hopefully not that common. But it's not a
completely open-and-shut case.

- Heikki
