Re: jsonb status

Lists: pgsql-hackers
From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: jsonb status
Date: 2014-03-13 21:00:33
Message-ID: 53221C71.3060303@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Peter Geoghegan has been doing a lot of great cleanup of the jsonb code,
after moving in the bits we wanted from nested hstore. You can see the
current state of the code at
<https://github.com/feodor/postgres/tree/jsonb_and_hstore>

I've been working through some of his changes, I will probably make a
couple of minor tweaks, but basically they look pretty good.

I'll be travelling a good bit of tomorrow (Friday), but I hope Peter has
finished by the time I am back on deck late tomorrow and that I am able
to commit this on Saturday.

cheers

andrew


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-16 08:10:32
Message-ID: CAM3SWZRUjV28_vKUe17gzZJVG-11Dv5e7mSyGVtgPwv_q5Ytvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 13, 2014 at 2:00 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> I'll be travelling a good bit of tomorrow (Friday), but I hope Peter has
> finished by the time I am back on deck late tomorrow and that I am able to
> commit this on Saturday.

I asked Andrew to hold off on committing this today. It was agreed
that we weren't quite ready, because there were one or two remaining
bugs (since fixed), but also because I felt that it would be useful to
first hear the opinions of more people before proceeding. I think that
we're not that far from having something committed. Obviously I hope
to get this into 9.4, and attach a lot of strategic importance to
having the feature, which is why I made a large effort to help land
it.

Attached patch has a number of notable revisions. Throughout, it has
been possible for anyone to follow our progress here:
https://github.com/feodor/postgres/commits/jsonb_and_hstore

* In general, the file jsonb_support.c (renamed to jsonb_utils.c) is
vastly better commented, and has a much clearer structure. This was
not something I did much with in the previous revision, and so it has
been a definite focus of this one.

* Hashing is refactored to not use CRC32 anymore. I felt this was a
questionable method of hashing, both within jsonb_hash(), as well as
the jsonb_hash_ops GIN operator class.

* Dead code elimination.

* I got around to fixing the memory leaks in B-Tree support function one.

* Andrew added hstore_to_jsonb, hstore_to_jsonb_loose functions and a
cast. One goal of this effort is to preserve a parallel set of
facilities for the json and jsonb types, and that includes
hstore-related features.

* A fix from Alexander for the jsonb_hash_ops @>operator issue I
complained about during the last submission was merged.

* There is no longer any GiST opclass. That just leaves B-Tree, hash,
GIN (default) and GIN jsonb_hash_ops opclasses.

My outstanding concerns are:

* Have we got things right with GIN indexing, containment semantics,
etc? See my remarks in the patch, by grepping "contain" within
jsonb_util.c. Is the GIN text storage serialization format appropriate
and correct?

* General design concerns. By far the largest source of these is the
file jsonb_util.c.

* Is the on-disk format that we propose to tie Postgres to as good as
it could be?

--
Peter Geoghegan

Attachment Content-Type Size
jsonb-11.patch.gz application/x-gzip 80.1 KB

From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Peter Geoghegan" <pg(at)heroku(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-16 08:28:46
Message-ID: ce3e561c55c913849a060e5cc0829086.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, March 16, 2014 09:10, Peter Geoghegan wrote:

> [ jsonb-11.patch.gz ]

This doesn't quite compile:

[...]
patching file src/include/catalog/pg_amop.h
patching file src/include/catalog/pg_amproc.h
Hunk #3 FAILED at 358.
1 out of 3 hunks FAILED -- saving rejects to file src/include/catalog/pg_amproc.h.rej
patching file src/include/catalog/pg_cast.h
patching file src/include/catalog/pg_opclass.h
[...]

cat src/include/catalog/pg_amproc.h.rej
*** src/include/catalog/pg_amproc.h
--- src/include/catalog/pg_amproc.h
*************** DATA(insert ( 3659 3614 3614 2 3656 ))
*** 358,364 ****
DATA(insert ( 3659 3614 3614 3 3657 ));
DATA(insert ( 3659 3614 3614 4 3658 ));
DATA(insert ( 3659 3614 3614 5 2700 ));
!

/* sp-gist */
DATA(insert ( 3474 3831 3831 1 3469 ));
--- 360,373 ----
DATA(insert ( 3659 3614 3614 3 3657 ));
DATA(insert ( 3659 3614 3614 4 3658 ));
DATA(insert ( 3659 3614 3614 5 2700 ));
! DATA(insert ( 4036 3802 3802 1 3480 ));
! DATA(insert ( 4036 3802 3802 2 3482 ));
! DATA(insert ( 4036 3802 3802 3 3483 ));
! DATA(insert ( 4036 3802 3802 4 3484 ));
! DATA(insert ( 4037 3802 3802 1 351 ));
! DATA(insert ( 4037 3802 3802 2 3485 ));
! DATA(insert ( 4037 3802 3802 3 3486 ));
! DATA(insert ( 4037 3802 3802 4 3487 ));

/* sp-gist */
DATA(insert ( 3474 3831 3831 1 3469 ));


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-16 08:50:02
Message-ID: CAM3SWZQJH0K1GV2hb4sfKRN2pCTRQ4hDfZECkm=shHo8Spv-hA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 16, 2014 at 1:28 AM, Erik Rijkers <er(at)xs4all(dot)nl> wrote:
>> [ jsonb-11.patch.gz ]
>
> This doesn't quite compile:

Sorry. I guess Andrew's earlier merging of master was insufficient.

Attached revision fixes bitrot.

--
Peter Geoghegan

Attachment Content-Type Size
jsonb-11-1.patch.gz application/x-gzip 80.1 KB

From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Peter Geoghegan" <pg(at)heroku(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status - ‘JsonbValue’ has no member named ‘size’
Date: 2014-03-16 11:37:02
Message-ID: 92d4fda05b98370724d9f0440ab61699.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, March 16, 2014 09:50, Peter Geoghegan wrote:
> On Sun, Mar 16, 2014 at 1:28 AM, Erik Rijkers <er(at)xs4all(dot)nl> wrote:
>>> [ jsonb-11.patch.gz ]
>>
>> This doesn't quite compile:
>
> Sorry. I guess Andrew's earlier merging of master was insufficient.
>
> Attached revision fixes bitrot.
>

Patch applies, but there is still something wrong; now during the actual compile (of contrib):
(this is actually the same error I get when pulling straight from the git repo (since yesterday somewhere, I think, or
maybe a bit earlier still))

(git repo being: on https://github.com/feodor/postgres.git, branch jsonb_and_hstore)

-- [2014.03.16 12:29:27 jsonb/0] make contrib
hstore_io.c: In function ‘hstore_to_jsonb’:
hstore_io.c:1398:6: error: ‘JsonbValue’ has no member named ‘size’
key.size = sizeof(JEntry);
^
hstore_io.c:1402:6: error: ‘JsonbValue’ has no member named ‘size’
key.size += key.string.len;
^
hstore_io.c:1408:7: error: ‘JsonbValue’ has no member named ‘size’
val.size = sizeof(JEntry);
^
hstore_io.c:1413:7: error: ‘JsonbValue’ has no member named ‘size’
val.size = sizeof(JEntry);
^
hstore_io.c:1417:7: error: ‘JsonbValue’ has no member named ‘size’
val.size += val.string.len;
^
hstore_io.c: In function ‘hstore_to_jsonb_loose’:
hstore_io.c:1450:6: error: ‘JsonbValue’ has no member named ‘size’
key.size = sizeof(JEntry);
^
hstore_io.c:1454:6: error: ‘JsonbValue’ has no member named ‘size’
key.size += key.string.len;
^
hstore_io.c:1458:6: error: ‘JsonbValue’ has no member named ‘size’
val.size = sizeof(JEntry);
^
hstore_io.c:1524:8: error: ‘JsonbValue’ has no member named ‘size’
val.size += VARSIZE_ANY(val.numeric) +sizeof(JEntry);
^
hstore_io.c:1528:8: error: ‘JsonbValue’ has no member named ‘size’
val.size = sizeof(JEntry);
^
hstore_io.c:1532:8: error: ‘JsonbValue’ has no member named ‘size’
val.size += val.string.len;
^
make[1]: *** [hstore_io.o] Error 1
make: *** [all-hstore-recurse] Error 2
-- make returned 2 - abort

thanks,

Erik Rijkers


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status - 'JsonbValue' has no member named 'size'
Date: 2014-03-16 12:23:40
Message-ID: 532597CC.2020603@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/16/2014 07:37 AM, Erik Rijkers wrote:
> On Sun, March 16, 2014 09:50, Peter Geoghegan wrote:
>> On Sun, Mar 16, 2014 at 1:28 AM, Erik Rijkers <er(at)xs4all(dot)nl> wrote:
>>>> [ jsonb-11.patch.gz ]
>>> This doesn't quite compile:
>> Sorry. I guess Andrew's earlier merging of master was insufficient.
>>
>> Attached revision fixes bitrot.
>>
> Patch applies, but there is still something wrong; now during the actual compile (of contrib):
> (this is actually the same error I get when pulling straight from the git repo (since yesterday somewhere, I think, or
> maybe a bit earlier still))
>
> (git repo being: on https://github.com/feodor/postgres.git, branch jsonb_and_hstore)
>
>
>
> -- [2014.03.16 12:29:27 jsonb/0] make contrib
> hstore_io.c: In function ‘hstore_to_jsonb’:
> hstore_io.c:1398:6: error: ‘JsonbValue’ has no member named ‘size’
> key.size = sizeof(JEntry);

Yeah, the field name was changed but this wasn't followed through to
contrib.

New patch attached.

cheers

andrew

Attachment Content-Type Size
jsonb-12.patch text/x-patch 423.8 KB

From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Peter Geoghegan" <pg(at)heroku(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status - 'JsonbValue' has no member named 'size'
Date: 2014-03-16 14:10:31
Message-ID: 639d7c7551d2c04f79003f406c8158a4.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, March 16, 2014 13:23, Andrew Dunstan wrote:
>
> [ jsonb-12.patch ]

patch applies; compiles, and builds, but contrib installs with this error:

make[1]: *** No rule to make target `hstore--1.2.sql', needed by `installdata'. Stop.
make: *** [install-hstore-recurse] Error 2

After that an instance can be started but hstore is not available:

create extension hstore;
ERROR: could not stat file "/home/aardvark/pg_stuff/pg_installations/pgsql.jsonb/share/extension/hstore--1.3.sql": No such
file or directory


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status - 'JsonbValue' has no member named 'size'
Date: 2014-03-16 14:32:48
Message-ID: 5325B610.1030402@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/16/2014 10:10 AM, Erik Rijkers wrote:
> On Sun, March 16, 2014 13:23, Andrew Dunstan wrote:
>> [ jsonb-12.patch ]
> patch applies; compiles, and builds, but contrib installs with this error:
>
> make[1]: *** No rule to make target `hstore--1.2.sql', needed by `installdata'. Stop.
> make: *** [install-hstore-recurse] Error 2
>
> After that an instance can be started but hstore is not available:
>
> create extension hstore;
> ERROR: could not stat file "/home/aardvark/pg_stuff/pg_installations/pgsql.jsonb/share/extension/hstore--1.3.sql": No such
> file or directory
>
>

argh

Too many cooks might be spoiling the broth.

try attached

cheers

andrew

Attachment Content-Type Size
jsonb-12-a.patch text/x-patch 423.8 KB

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-17 17:48:13
Message-ID: 5327355D.4020509@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/16/2014 04:10 AM, Peter Geoghegan wrote:
> On Thu, Mar 13, 2014 at 2:00 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> I'll be travelling a good bit of tomorrow (Friday), but I hope Peter has
>> finished by the time I am back on deck late tomorrow and that I am able to
>> commit this on Saturday.
> I asked Andrew to hold off on committing this today. It was agreed
> that we weren't quite ready, because there were one or two remaining
> bugs (since fixed), but also because I felt that it would be useful to
> first hear the opinions of more people before proceeding. I think that
> we're not that far from having something committed. Obviously I hope
> to get this into 9.4, and attach a lot of strategic importance to
> having the feature, which is why I made a large effort to help land
> it.
>
> Attached patch has a number of notable revisions. Throughout, it has
> been possible for anyone to follow our progress here:
> https://github.com/feodor/postgres/commits/jsonb_and_hstore
>
> * In general, the file jsonb_support.c (renamed to jsonb_utils.c) is
> vastly better commented, and has a much clearer structure. This was
> not something I did much with in the previous revision, and so it has
> been a definite focus of this one.
>
> * Hashing is refactored to not use CRC32 anymore. I felt this was a
> questionable method of hashing, both within jsonb_hash(), as well as
> the jsonb_hash_ops GIN operator class.
>
> * Dead code elimination.
>
> * I got around to fixing the memory leaks in B-Tree support function one.
>
> * Andrew added hstore_to_jsonb, hstore_to_jsonb_loose functions and a
> cast. One goal of this effort is to preserve a parallel set of
> facilities for the json and jsonb types, and that includes
> hstore-related features.
>
> * A fix from Alexander for the jsonb_hash_ops @>operator issue I
> complained about during the last submission was merged.
>
> * There is no longer any GiST opclass. That just leaves B-Tree, hash,
> GIN (default) and GIN jsonb_hash_ops opclasses.
>
> My outstanding concerns are:
>
> * Have we got things right with GIN indexing, containment semantics,
> etc? See my remarks in the patch, by grepping "contain" within
> jsonb_util.c. Is the GIN text storage serialization format appropriate
> and correct?
>
> * General design concerns. By far the largest source of these is the
> file jsonb_util.c.
>
> * Is the on-disk format that we propose to tie Postgres to as good as
> it could be?
>

I've been working through all the changes and fixes that Peter and
others have made, and they look pretty good to me. There are a few
mostly cosmetic changes I want to make, but nothing that would be worth
holding up committing this for. I'm fairly keen to get this committed,
get some buildfarm coverage and get more people playing with it and
testing it.

Like Peter, I would like to see more comments from people on the GIN
support, especially.

The one outstanding significant question of substance I have is this:
given the commit 5 days ago of provision for triConsistent functions for
GIN opclasses, should be be adding these to the two GIN opclasses we are
providing, and what should they look like? Again, this isn't an issue
that I think needs to hold up committing what we have now.

Regarding Peter's last question, if we're not satisfied with the on-disk
format proposed it would mean throwing the whole effort out and starting
again. The only thing I have thought of as an alternative would be to
store the structure and values separately rather than with values inline
with the structure. That way you could have a hash of values more or
less, which would eliminate redundancy of storage of things like object
field names. But such a structure might well involve at least as much
computational overhead as the current structure. And nobody's been
saying all along "hold on, we can do better than this." So I'm pretty
inclined to go with what we have.

cheers

andrew


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-17 18:20:30
Message-ID: CAF4Au4yGmyEyt+pi2oFmAD=kQuCMBdEjgCjpST615OrknJix3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alexander will take a look on TriConsistent function.

On Mon, Mar 17, 2014 at 9:48 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 03/16/2014 04:10 AM, Peter Geoghegan wrote:
>>
>> On Thu, Mar 13, 2014 at 2:00 PM, Andrew Dunstan <andrew(at)dunslane(dot)net>
>> wrote:
>>>
>>> I'll be travelling a good bit of tomorrow (Friday), but I hope Peter has
>>> finished by the time I am back on deck late tomorrow and that I am able
>>> to
>>> commit this on Saturday.
>>
>> I asked Andrew to hold off on committing this today. It was agreed
>> that we weren't quite ready, because there were one or two remaining
>> bugs (since fixed), but also because I felt that it would be useful to
>> first hear the opinions of more people before proceeding. I think that
>> we're not that far from having something committed. Obviously I hope
>> to get this into 9.4, and attach a lot of strategic importance to
>> having the feature, which is why I made a large effort to help land
>> it.
>>
>> Attached patch has a number of notable revisions. Throughout, it has
>> been possible for anyone to follow our progress here:
>> https://github.com/feodor/postgres/commits/jsonb_and_hstore
>>
>> * In general, the file jsonb_support.c (renamed to jsonb_utils.c) is
>> vastly better commented, and has a much clearer structure. This was
>> not something I did much with in the previous revision, and so it has
>> been a definite focus of this one.
>>
>> * Hashing is refactored to not use CRC32 anymore. I felt this was a
>> questionable method of hashing, both within jsonb_hash(), as well as
>> the jsonb_hash_ops GIN operator class.
>>
>> * Dead code elimination.
>>
>> * I got around to fixing the memory leaks in B-Tree support function one.
>>
>> * Andrew added hstore_to_jsonb, hstore_to_jsonb_loose functions and a
>> cast. One goal of this effort is to preserve a parallel set of
>> facilities for the json and jsonb types, and that includes
>> hstore-related features.
>>
>> * A fix from Alexander for the jsonb_hash_ops @>operator issue I
>> complained about during the last submission was merged.
>>
>> * There is no longer any GiST opclass. That just leaves B-Tree, hash,
>> GIN (default) and GIN jsonb_hash_ops opclasses.
>>
>> My outstanding concerns are:
>>
>> * Have we got things right with GIN indexing, containment semantics,
>> etc? See my remarks in the patch, by grepping "contain" within
>> jsonb_util.c. Is the GIN text storage serialization format appropriate
>> and correct?
>>
>> * General design concerns. By far the largest source of these is the
>> file jsonb_util.c.
>>
>> * Is the on-disk format that we propose to tie Postgres to as good as
>> it could be?
>>
>
>
>
> I've been working through all the changes and fixes that Peter and others
> have made, and they look pretty good to me. There are a few mostly cosmetic
> changes I want to make, but nothing that would be worth holding up
> committing this for. I'm fairly keen to get this committed, get some
> buildfarm coverage and get more people playing with it and testing it.
>
> Like Peter, I would like to see more comments from people on the GIN
> support, especially.
>
> The one outstanding significant question of substance I have is this: given
> the commit 5 days ago of provision for triConsistent functions for GIN
> opclasses, should be be adding these to the two GIN opclasses we are
> providing, and what should they look like? Again, this isn't an issue that I
> think needs to hold up committing what we have now.
>
> Regarding Peter's last question, if we're not satisfied with the on-disk
> format proposed it would mean throwing the whole effort out and starting
> again. The only thing I have thought of as an alternative would be to store
> the structure and values separately rather than with values inline with the
> structure. That way you could have a hash of values more or less, which
> would eliminate redundancy of storage of things like object field names. But
> such a structure might well involve at least as much computational overhead
> as the current structure. And nobody's been saying all along "hold on, we
> can do better than this." So I'm pretty inclined to go with what we have.
>
>
> cheers
>
> andrew
>
>
>
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-19 16:28:28
Message-ID: 20140319162828.GC5459@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2014-03-13 17:00:33 -0400, Andrew Dunstan wrote:
> Peter Geoghegan has been doing a lot of great cleanup of the jsonb code,
> after moving in the bits we wanted from nested hstore. You can see the
> current state of the code at
> <https://github.com/feodor/postgres/tree/jsonb_and_hstore>

I've pushed one commit with minor fixes, and one with several FIXMEs to
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jsonb_and_hstore

The most imortant things are:

* Jsonb vs JsonbValue is just bad, the latter needs to be renamed, and
there needs to be a very clear explanation about why two forms exist
and what each is used for.
* The whole datastructure doesn't have any sensible highlevel
documentation.
* I don't really like the JsonbSuperHeader name/abstraction. What does
the "Super" mean? The only interpetation I see is that it's the
toplevel header (why not Top then?), but JEntry has the comment: "/*
May be accessed as superheader */"...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: jsonb status
Date: 2014-03-19 16:55:03
Message-ID: 5329CBE7.10606@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/19/2014 09:28 AM, Andres Freund wrote:
> * The whole datastructure doesn't have any sensible highlevel
> documentation.

Explain ... ? I'm planning on improving the docs through the beta
period for this, so can you explain what kind of docs we're missing here?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: jsonb status
Date: 2014-03-19 16:58:59
Message-ID: 20140319165858.GF5459@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-19 09:55:03 -0700, Josh Berkus wrote:
> On 03/19/2014 09:28 AM, Andres Freund wrote:
> > * The whole datastructure doesn't have any sensible highlevel
> > documentation.
>
> Explain ... ? I'm planning on improving the docs through the beta
> period for this, so can you explain what kind of docs we're missing here?

Ah, sorry, that was easy to misunderstand. I was talking about the
on-disk datastructure on the code level, not the user/SQL exposed
part. I've added more verbose FIXMEs to the relevant places where I
think such documentation should be added.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-19 19:58:40
Message-ID: CAM3SWZQMOkZbnUUsSZb7hcN_sco5mo06MbsS8ZHi8U8hUySPhg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 19, 2014 at 9:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I've pushed one commit with minor fixes, and one with several FIXMEs to
> http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jsonb_and_hstore

Cool.

> * Jsonb vs JsonbValue is just bad, the latter needs to be renamed, and
> there needs to be a very clear explanation about why two forms exist
> and what each is used for.

Agreed. We should probably emphasize the distinction.

> * The whole datastructure doesn't have any sensible highlevel
> documentation.

I should probably go make some ascii art, per your comment.

> * I don't really like the JsonbSuperHeader name/abstraction. What does
> the "Super" mean? The only interpetation I see is that it's the
> toplevel header (why not Top then?), but JEntry has the comment: "/*
> May be accessed as superheader */"...

What it means is that you can pass a pointer to just past the varlena
Jsonb pointer in some contexts (a VARDATA() pointer), and a pointer
into a jbvBinary buffer in others. The only reason that functions
like findJsonbValueFromSuperHeader() take a super header rather than a
Jsonb pointer is that if they did take a Jsonb*, you'd be making a
representation that the varlena header wasn't garbage, which could not
be ensured in all circumstances, as when the would-be varlena bytes
are something else entirely, or perhaps are even at an address that is
undefined to access. Sites with jbvBinary pointers (e.g. the iterator
code) would have to worry about "faking varlena". That comment is
misleading, and the general idea does need to be explained better.

What you propose for jsonb_typeof() looks incorrect to me. Both of
those JsonbIteratorNext() calls are required, to "get past" the
container pseudo array to get to the underlying single scalar element.

--
Peter Geoghegan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb status
Date: 2014-03-19 20:13:43
Message-ID: 5329FA77.3070602@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/19/2014 03:58 PM, Peter Geoghegan wrote:
> On Wed, Mar 19, 2014 at 9:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> * Jsonb vs JsonbValue is just bad, the latter needs to be renamed, and
>> there needs to be a very clear explanation about why two forms exist
>> and what each is used for.
> Agreed. We should probably emphasize the distinction.

Perhaps Oleg and Teodor might like to chime in on this.

>
>> * The whole datastructure doesn't have any sensible highlevel
>> documentation.

And this.

cheers

andrew


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-19 22:57:26
Message-ID: CAM3SWZR9gJYC=mDE7hxKb8qoiJfFqM7E3cTgbDtxvD8Y6JhZcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 19, 2014 at 9:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> * Jsonb vs JsonbValue is just bad, the latter needs to be renamed, and
> there needs to be a very clear explanation about why two forms exist
> and what each is used for.

I've pushed some comments to Github that further clarify the
distinction: https://github.com/feodor/postgres/commit/5835de0b55bdfc02ee59e2affb6ce25995587dc4

I did not change the name of JsonbValue, because I sincerely could not
think of a better one. The distinction is subtle.

--
Peter Geoghegan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-20 00:10:48
Message-ID: 532A3208.6060201@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/19/2014 06:57 PM, Peter Geoghegan wrote:
> On Wed, Mar 19, 2014 at 9:28 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> * Jsonb vs JsonbValue is just bad, the latter needs to be renamed, and
>> there needs to be a very clear explanation about why two forms exist
>> and what each is used for.
> I've pushed some comments to Github that further clarify the
> distinction: https://github.com/feodor/postgres/commit/5835de0b55bdfc02ee59e2affb6ce25995587dc4
>
> I did not change the name of JsonbValue, because I sincerely could not
> think of a better one. The distinction is subtle.
>

I didn't like the _state->state stuff either, but I think you changed
the wrong name - it's the field name in the struct that needs changing.
What you've done is inconsistent with the common idiom in jsonfuncs.c.

cheers

andrew


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-20 00:33:30
Message-ID: CAM3SWZQjVHnpPtJUmaN-tst=4j6HMRRwxv9TWqfG-u+Au4h0qA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 19, 2014 at 5:10 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> I didn't like the _state->state stuff either, but I think you changed the
> wrong name - it's the field name in the struct that needs changing. What
> you've done is inconsistent with the common idiom in jsonfuncs.c.

Okay. I've changed the variable name back, while changing the struct
field as suggested.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-22 20:53:06
Message-ID: CAM3SWZSdJz1O+NTwfB+yCkh7R2FouHRF9x2E72DeSbxvs_WvSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Attached is v12. I think I've brought this as far as I can.

This is mostly just bug fixes, and some additional refactoring. I've
incorporated Andres' feedback. The only points that I think worth
noting are:

* The documentation has been significantly expanded to discuss
"containment" further, since it's a significant part of the feature.
This could probably use some polishing and general scrutiny, which is
something that Andrew may consider. I didn't have time to go over it
to the extent that I'd prefer.

* I altered containment semantics slightly. Now, it is not possible
for a "raw scalar" to contain a single-element array of the same
scalar, while the inverse is still possible (raw scalars still contain
themselves too). This made sense to me, and is consistent with the
behavior of the B-Tree operator class, where a raw scalar is not equal
to a single-element array of the same scalar. Rather, array is greater
than the scalar on the basis of its type alone, as at every other
nesting level. The fact that an array can contain a raw scalar is
explicitly presented as an exception to the general principle that
containment needs to be at the same nesting level.

I'm not going to go into the details of the bugs fixed, since no one
reported them here, and since they're trivial in nature. For full
details, you can review the log of our publicly accessible feature
branch.

--
Peter Geoghegan

Attachment Content-Type Size
jsonb-12.patch.gz application/x-gzip 87.2 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-23 00:22:12
Message-ID: 20140323002212.GB5606@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 22, 2014 at 01:53:06PM -0700, Peter Geoghegan wrote:
> Attached is v12. I think I've brought this as far as I can.
>
> This is mostly just bug fixes, and some additional refactoring. I've
> incorporated Andres' feedback. The only points that I think worth
> noting are:
>
> * The documentation has been significantly expanded to discuss
> "containment" further, since it's a significant part of the feature.
> This could probably use some polishing and general scrutiny, which is
> something that Andrew may consider. I didn't have time to go over it
> to the extent that I'd prefer.
>
> * I altered containment semantics slightly. Now, it is not possible
> for a "raw scalar" to contain a single-element array of the same
> scalar, while the inverse is still possible (raw scalars still contain
> themselves too). This made sense to me, and is consistent with the
> behavior of the B-Tree operator class, where a raw scalar is not equal
> to a single-element array of the same scalar. Rather, array is greater
> than the scalar on the basis of its type alone, as at every other
> nesting level. The fact that an array can contain a raw scalar is
> explicitly presented as an exception to the general principle that
> containment needs to be at the same nesting level.
>
> I'm not going to go into the details of the bugs fixed, since no one
> reported them here, and since they're trivial in nature. For full
> details, you can review the log of our publicly accessible feature
> branch.

What did you decide about hashing values in indexes vs. putting them in
literally?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-23 00:27:32
Message-ID: CAM3SWZRS_2mJZzCYhsrTahrw8EFK10sj-KVO+hOoS5GKY-SqVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 22, 2014 at 5:22 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> What did you decide about hashing values in indexes vs. putting them in
> literally?

There are two GIN opclasses supplied. There is a default, which
supports more operators (various "existence" operators - see the
documentation). There is an alternative called jsonb_hash_ops that
only supports containment, and performs considerably better than the
default. Containment *is* the compelling operator to support, though -
you can do rather a lot with it. This must be what you're referring
to, since I recall you blogged about the response it got at pgConf.EU.
Both are available.

--
Peter Geoghegan


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-23 08:32:45
Message-ID: CAF4Au4wXW6XsehaTq50eP0Vaq1qVKjGu2ff_C-MBX3aoPCicmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

It's easy to add support of other operations to hash_ops, so it will
be on par with default GIN opclass, at the price of bigger size. We
can add it later to contrib/jsonbext.

I'm mostly worrying about changing semantics of scalar.

On Sun, Mar 23, 2014 at 4:27 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Sat, Mar 22, 2014 at 5:22 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> What did you decide about hashing values in indexes vs. putting them in
>> literally?
>
> There are two GIN opclasses supplied. There is a default, which
> supports more operators (various "existence" operators - see the
> documentation). There is an alternative called jsonb_hash_ops that
> only supports containment, and performs considerably better than the
> default. Containment *is* the compelling operator to support, though -
> you can do rather a lot with it. This must be what you're referring
> to, since I recall you blogged about the response it got at pgConf.EU.
> Both are available.
>
>
> --
> Peter Geoghegan
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb status
Date: 2014-03-23 12:20:02
Message-ID: 20140323122002.GD5606@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 23, 2014 at 12:32:45PM +0400, Oleg Bartunov wrote:
> It's easy to add support of other operations to hash_ops, so it will
> be on par with default GIN opclass, at the price of bigger size. We
> can add it later to contrib/jsonbext.
>
> I'm mostly worrying about changing semantics of scalar.
>
>
> On Sun, Mar 23, 2014 at 4:27 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> > On Sat, Mar 22, 2014 at 5:22 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >> What did you decide about hashing values in indexes vs. putting them in
> >> literally?
> >
> > There are two GIN opclasses supplied. There is a default, which
> > supports more operators (various "existence" operators - see the
> > documentation). There is an alternative called jsonb_hash_ops that
> > only supports containment, and performs considerably better than the
> > default. Containment *is* the compelling operator to support, though -
> > you can do rather a lot with it. This must be what you're referring
> > to, since I recall you blogged about the response it got at pgConf.EU.
> > Both are available.

My question was about whether we decided to abandon the GiST support
entirely as there is no method for indexing long values:

http://www.postgresql.org/message-id/CAM3SWZSbsedz_YwsQm-chT6B6KX0rh-vke=5Nb2GBLSEm0E8AQ@mail.gmail.com

In reading your reply, I now understand that GIN supports hash and
non-hash indexing types, which is great!

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +