Re: jsonb format is pessimal for toast compression

From: Larry White <ljw1001(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb format is pessimal for toast compression
Date: 2014-08-08 23:19:02
Message-ID: CAMdbzVi_KfSfyUHBt9Q4LNonmtJ47dVWdHDYwNx1vXcftLt_bQ@mail.gmail.com
Lists: pgsql-hackers

I was not complaining; I think JSONB is awesome.

But I am one of those people who would like to put 100's of GB (or more)
of JSON files into Postgres, and I am concerned about file size and
possible future changes to the format.

On Fri, Aug 8, 2014 at 7:10 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> On Fri, Aug 8, 2014 at 12:06 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> > Once we ship 9.4, many users are going to load 100's of GB into JSONB
> > fields. Even if we fix the compressibility issue in 9.5, those users
> > won't be able to fix the compression without rewriting all their data,
> > which could be prohibitive. And we'll be in a position where we have
> > to support the 9.4 JSONB format/compression technique for years so that
> > users aren't blocked from upgrading.
>
> FWIW, if we take the delicious JSON data as representative, a table
> storing that data as jsonb is 1374 MB in size, whereas an equivalent
> table storing it with the original json datatype (with white space
> differences more or less ignored, because it was created using a
> jsonb -> json cast) is 1352 MB.
>
> Larry's complaint is valid; this is a real problem, and I'd like to
> fix it before 9.4 is out. However, let us not lose sight of the fact
> that JSON data is usually a poor target for TOAST compression. With
> idiomatic usage, redundancy is very much more likely to appear across
> rows, and not within individual Datums. Frankly, we aren't doing a
> very good job there, and doing better requires an alternative
> strategy.
>
> --
> Peter Geoghegan
>
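
A rough illustration of the quoted point that redundancy tends to sit
across rows rather than inside a single Datum. This is only a sketch
under stated assumptions: it uses Python's zlib as a stand-in compressor
(TOAST uses pglz, not zlib), and the documents are made-up records with
hypothetical keys, not the delicious data set. It compares compressing
each document on its own against compressing all of them together.

```python
import json
import zlib


def make_rows(n=200):
    """Build n hypothetical JSON documents that share keys but differ in values."""
    rows = []
    for i in range(n):
        doc = {
            "id": i,
            "author": "user%d" % i,
            "link": "http://example.com/item/%d" % i,
            "tags": ["postgres", "json"],
            "updated": "2014-08-%02d" % ((i % 28) + 1),
        }
        rows.append(json.dumps(doc).encode("utf-8"))
    return rows


def per_row_compressed_size(rows):
    """Total size when every document is compressed independently,
    loosely analogous to compressing one Datum at a time."""
    return sum(len(zlib.compress(r)) for r in rows)


def batch_compressed_size(rows):
    """Size when all documents are compressed as one stream, so the
    compressor can exploit key names repeated across documents."""
    return len(zlib.compress(b"\n".join(rows)))


rows = make_rows()
raw_size = sum(len(r) for r in rows)
per_row = per_row_compressed_size(rows)
batch = batch_compressed_size(rows)

print("raw bytes:              ", raw_size)
print("compressed per document:", per_row)
print("compressed as one batch:", batch)
```

With small documents like these, the per-document total should stay
close to the raw size while the batch result is much smaller, because
the repeated key names only become visible to the compressor across
documents. The exact numbers depend on the data and the compressor.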
