Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Geoghegan <pg(at)heroku(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: default opclass for jsonb (was Re: Call for GIST/GIN/SP-GIST opclass documentation)
Date: 2014-05-10 20:42:34
Message-ID: 536E8F3A.40706@vmware.com
Lists: pgsql-docs pgsql-hackers

On 05/09/2014 11:44 PM, Tom Lane wrote:
> Greg Stark <stark(at)mit(dot)edu> writes:
>> Well the question seems to me to be that if we're always doing recheck
>> then what advantage is there to not hashing everything?
>
> Right now, there's not much. But it seems likely to me that there will be
> more JSON operators in future, and some of them might be able to make use
> of the additional specificity of unhashed entries. For example, it's only
> a very arbitrary definitional choice for the exists operator (ie, not
> looking into sub-objects) that makes jsonb_ops lossy for it. We might
> eventually build a recursive-exists-check operator for which the index
> could be lossless, at least up to the string length where we start to
> hash.

Back to the naming:

The main difference between the two opclasses from a user's standpoint
is not whether they hash. The big difference is that one indexes
complete paths from the root, and the other indexes each key and value
on its own, without its position in the document. For example, if you
have an object like '{"foo": {"bar": 123}}', one will index "foo",
"foo->bar", and "foo->bar->123" while the other will index "foo", "bar"
and "123".
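
To make the difference concrete, here is a rough Python sketch of the two
extraction strategies as described above. It is only an illustration: the
function names path_keys and leaf_keys are made up for this mail, the "->"
joining is just the notation used in the example, and the real GIN support
functions differ in detail (hashing is left out entirely).

```python
import json


def path_keys(value, prefix=()):
    """Illustrative only: one key per path from the root (path-style opclass)."""
    keys = []
    if isinstance(value, dict):
        for name, child in value.items():
            path = prefix + (name,)
            keys.append("->".join(path))
            keys.extend(path_keys(child, path))
    elif isinstance(value, list):
        # Assumption: array elements share the path of the array itself.
        for child in value:
            keys.extend(path_keys(child, prefix))
    else:
        # Scalars are rendered as JSON text, so 123 becomes "123".
        keys.append("->".join(prefix + (json.dumps(value),)))
    return keys


def leaf_keys(value):
    """Illustrative only: every key and value indexed on its own, no path."""
    keys = []
    if isinstance(value, dict):
        for name, child in value.items():
            keys.append(name)
            keys.extend(leaf_keys(child))
    elif isinstance(value, list):
        for child in value:
            keys.extend(leaf_keys(child))
    else:
        keys.append(json.dumps(value))
    return keys


doc = json.loads('{"foo": {"bar": 123}}')
print(path_keys(doc))  # ['foo', 'foo->bar', 'foo->bar->123']
print(leaf_keys(doc))  # ['foo', 'bar', '123']
```

Note that neither function says anything about hashing. Either set of keys
could be stored as-is or hashed down to a fixed width, which is why I
consider that a separate property.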

Whether the opclasses use hashing to shorten the key is an orthogonal
property, and IMHO not as important. To reflect that, I suggest that we
name the opclasses:

json_path_ops
json_value_ops

or something along those lines.

- Heikki
