Re: jsonb status

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-17 17:48:13
Message-ID: 5327355D.4020509@dunslane.net
Lists: pgsql-hackers


On 03/16/2014 04:10 AM, Peter Geoghegan wrote:
> On Thu, Mar 13, 2014 at 2:00 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> I'll be travelling a good bit of tomorrow (Friday), but I hope Peter has
>> finished by the time I am back on deck late tomorrow and that I am able to
>> commit this on Saturday.
> I asked Andrew to hold off on committing this today. It was agreed
> that we weren't quite ready, because there were one or two remaining
> bugs (since fixed), but also because I felt that it would be useful to
> first hear the opinions of more people before proceeding. I think that
> we're not that far from having something committed. Obviously I hope
> to get this into 9.4, and attach a lot of strategic importance to
> having the feature, which is why I made a large effort to help land
> it.
>
> Attached patch has a number of notable revisions. Throughout, it has
> been possible for anyone to follow our progress here:
> https://github.com/feodor/postgres/commits/jsonb_and_hstore
>
> * In general, the file jsonb_support.c (renamed to jsonb_util.c) is
> vastly better commented, and has a much clearer structure. This was
> not something I did much with in the previous revision, and so it has
> been a definite focus of this one.
>
> * Hashing is refactored to not use CRC32 anymore. I felt this was a
> questionable method of hashing, both within jsonb_hash(), as well as
> the jsonb_hash_ops GIN operator class.
>
> * Dead code elimination.
>
> * I got around to fixing the memory leaks in B-Tree support function one.
>
> * Andrew added hstore_to_jsonb, hstore_to_jsonb_loose functions and a
> cast. One goal of this effort is to preserve a parallel set of
> facilities for the json and jsonb types, and that includes
> hstore-related features.
>
> * A fix from Alexander for the jsonb_hash_ops @> operator issue I
> complained about during the last submission was merged.
>
> * There is no longer any GiST opclass. That just leaves B-Tree, hash,
> GIN (default) and GIN jsonb_hash_ops opclasses.
>
> My outstanding concerns are:
>
> * Have we got things right with GIN indexing, containment semantics,
> etc? See my remarks in the patch, by grepping "contain" within
> jsonb_util.c. Is the GIN text storage serialization format appropriate
> and correct?
>
> * General design concerns. By far the largest source of these is the
> file jsonb_util.c.
>
> * Is the on-disk format that we propose to tie Postgres to as good as
> it could be?
>

I've been working through all the changes and fixes that Peter and
others have made, and they look pretty good to me. There are a few
mostly cosmetic changes I want to make, but nothing that would be worth
holding up committing this for. I'm fairly keen to get this committed,
get some buildfarm coverage and get more people playing with it and
testing it.

Like Peter, I would like to see more comments from people on the GIN
support, especially.

The one outstanding significant question of substance I have is this:
given the commit 5 days ago of provision for triConsistent functions for
GIN opclasses, should we be adding these to the two GIN opclasses we are
providing, and what should they look like? Again, this isn't an issue
that I think needs to hold up committing what we have now.
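
To make the question a bit more concrete: a ternary consistent function is
handed a true/false/maybe state for each query key and reports whether the
indexed item can match. What follows is only a rough sketch in Python, not
the GIN C API and not a proposal for the actual code. The names Tri and
tri_consistent_all_keys are made up for illustration, and it assumes a
query that needs every extracted key to be present and that normally
requires a recheck of the heap tuple.

```python
from enum import Enum


class Tri(Enum):
    """Three-valued result, loosely mirroring GIN's true/false/maybe."""
    FALSE = 0
    TRUE = 1
    MAYBE = 2


def tri_consistent_all_keys(check, needs_recheck=True):
    """Hypothetical ternary consistent check for an 'all keys required' query.

    check: one Tri state per query key.
    needs_recheck: assumption that the index entries are lossy, so a
    definite TRUE cannot be reported from the index alone.
    """
    # A single key known to be absent rules the item out.
    if any(state is Tri.FALSE for state in check):
        return Tri.FALSE
    # Unknown keys, or a lossy index, leave the answer undecided.
    if needs_recheck or any(state is Tri.MAYBE for state in check):
        return Tri.MAYBE
    return Tri.TRUE
```

The only point of the sketch is that a definite "false" can be returned as
soon as one required key is known to be absent, even when other keys are
still "maybe".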

Regarding Peter's last question, if we're not satisfied with the on-disk
format proposed it would mean throwing the whole effort out and starting
again. The only thing I have thought of as an alternative would be to
store the structure and values separately rather than with values inline
with the structure. That way you could have a hash of values more or
less, which would eliminate redundancy of storage of things like object
field names. But such a structure might well involve at least as much
computational overhead as the current structure. And nobody's been
saying all along "hold on, we can do better than this." So I'm pretty
inclined to go with what we have.
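
For anyone who wants to picture that alternative, here is a toy sketch in
Python. It is not the proposed on-disk format or anything in the patch.
It only covers the field-name part of the idea: each distinct object
field name is stored once in a table, and the structure refers to names
by index. The function names split_keys and rebuild and the "o"/"a"
wrappers are invented for this illustration.

```python
def split_keys(doc):
    """Return (keys, structure); field names in structure become indexes."""
    keys = []    # each distinct field name, stored once
    index = {}   # field name -> position in keys

    def intern_key(name):
        if name not in index:
            index[name] = len(keys)
            keys.append(name)
        return index[name]

    def walk(node):
        if isinstance(node, dict):
            # Object: list of (key index, converted value) pairs.
            return {"o": [(intern_key(k), walk(v)) for k, v in node.items()]}
        if isinstance(node, list):
            # Array: list of converted elements.
            return {"a": [walk(v) for v in node]}
        return node  # scalars are kept inline in this sketch

    return keys, walk(doc)


def rebuild(keys, node):
    """Inverse of split_keys: resolve key indexes back to field names."""
    if isinstance(node, dict):
        if "o" in node:
            return {keys[i]: rebuild(keys, v) for i, v in node["o"]}
        return [rebuild(keys, v) for v in node["a"]]
    return node
```

Every lookup by field name now goes through the extra table, which is the
sort of added computational overhead I mean.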

cheers

andrew
