Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Geoghegan <pg(at)heroku(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)
Date: 2014-05-08 13:47:01
Message-ID: 20140508134701.GO30817@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

On Tue, May 6, 2014 at 06:20:53PM -0400, Tom Lane wrote:
> "David E. Wheeler" <david(at)justatheory(dot)com> writes:
> > On May 6, 2014, at 2:20 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >> Well, then, we only have a few days to come up with a name.
>
> > What are the options?
>
> We have no proposals as yet.
>
> I've been looking at the source code to try to understand the difference
> between the two opclasses (and BTW I concur with the opinions expressed
> recently about the poor state of the internal documentation for jsonb).
> If I've got it straight:
>
> jsonb_ops indexes keys and values separately, so for instance "{xyz: 2}"
> would give rise to GIN entries that are effectively the strings "Kxyz"
> and "V2". If you're looking for tuples containing "{xyz: 2}" then you
> would be looking for the AND of those independent index entries, which
> fortunately GIN is pretty good at computing. But you could also look
> for just keys or just values.
>
> jsonb_hash_ops creates an index entry only for values, but what it
> stores is a hash of both the value and the key it's stored under.
> So in this example you'd get a hash combining "xyz" and "2". This
> means the only type of query you can perform is like "find JSON tuples
> containing {xyz: 2}".

Good summary, thanks. This is the information I was hoping we had in
our docs. How does hstore deal with these issues?

> Because jsonb_ops stores the *whole* value, you can do lossless index
> searches (no recheck needed on the heap tuple), but you also run the
> risk of long strings failing to fit into an index entry. Since jsonb_ops
> reduces everything to a hash, there's no possibility of index failure,
> but all queries are lossy and require recheck.
>
> TBH, at this point I'm sort of agreeing with the thought expressed
> upthread that maybe neither of these should be the default as-is.
> They seem like rather arbitrary combinations of choices. In particular
> I wonder why there's not an option to store keys and values separately,
> but as hashes not as the original strings, so that indexability of
> everything could be guaranteed. Or a variant of that might be to hash
> only strings that are too large to fit in an index entry, and force
> recheck only when searching for a string that needed hashing.
>
> I wonder whether the most effective use of time at this point
> wouldn't be to fix jsonb_ops to do that, rather than arguing about
> what to rename it to. If it didn't have the failure-for-long-strings
> problem I doubt anybody would be unhappy about making it the default.

Can we hash just the values, not the keys, in jsonb_ops, and hash the
combo in jsonb_hash_ops. That would give us key-only lookups without a
recheck.

How do we index long strings now? Is it he combination of GIN and long
strings that is the problem?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message Tom Lane 2014-05-08 14:16:54 Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)
Previous Message Peter Eisentraut 2014-05-07 01:00:58 dead ISBN link

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2014-05-08 13:49:04 Re: [v9.5] Custom Plan API
Previous Message Simon Riggs 2014-05-08 13:41:39 Re: [v9.5] Custom Plan API