Re: jsonb and nested hstore

Lists: pgsql-hackers
From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: jsonb and nested hstore
Date: 2014-01-26 22:42:20
Message-ID: 52E58F4C.90600@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Here is the latest set of patches for nested hstore and jsonb.

Because it's so large I've broken this into two patches and compressed
them. The jsonb patch should work standalone. The nested hstore patch
depends on it.

All the jsonb functions now use the jsonb API - there is no more turning
jsonb into text and reparsing it.

At this stage I'm going to be starting cleanup on the jsonb code
(indentation, error messages, comments etc.) as well get getting up some
jsonb docs.

cheers

andrew

Attachment Content-Type Size
jsonb-5.patch.gz application/x-gzip 26.8 KB
nested-hstore-5.patch.gz application/x-gzip 65.7 KB

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 03:43:01
Message-ID: 52E72745.90307@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>
> Here is the latest set of patches for nested hstore and jsonb.
>
> Because it's so large I've broken this into two patches and compressed
> them. The jsonb patch should work standalone. The nested hstore patch
> depends on it.
>
> All the jsonb functions now use the jsonb API - there is no more
> turning jsonb into text and reparsing it.
>
> At this stage I'm going to be starting cleanup on the jsonb code
> (indentation, error messages, comments etc.) as well get getting up
> some jsonb docs.
>
>
>

Here is an update of the jsonb part of this. Charges:

* there is now documentation for jsonb
* most uses of elog() in json_funcs.c are replaced by ereport().
* indentation fixes and other tidying.

No changes in functionality.

cheers

andrew

Attachment Content-Type Size
jsonb-6.patch text/x-patch 175.9 KB

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 14:38:34
Message-ID: CAHyXU0wZ-a4s5GhkgJkr_Vb0FrECcLpDk4u+3M98P94b5srswg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 27, 2014 at 9:43 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>>
>>
>> Here is the latest set of patches for nested hstore and jsonb.
>>
>> Because it's so large I've broken this into two patches and compressed
>> them. The jsonb patch should work standalone. The nested hstore patch
>> depends on it.
>>
>> All the jsonb functions now use the jsonb API - there is no more turning
>> jsonb into text and reparsing it.
>>
>> At this stage I'm going to be starting cleanup on the jsonb code
>> (indentation, error messages, comments etc.) as well get getting up some
>> jsonb docs.
>>
>>
>>
>
>
> Here is an update of the jsonb part of this. Charges:
>
> * there is now documentation for jsonb
> * most uses of elog() in json_funcs.c are replaced by ereport().
> * indentation fixes and other tidying.
>
> No changes in functionality.

Don't have time to fire it up this morning, but a quick scan of the
patch turned up a few minor things:

* see a comment typo, line 290 'jsonn':
* line 332: 'bogus input' -- is this up to error reporting standards?
How about "value 'x' must be one of array, object, numeric, string,
bool"?
* line 357: "jsonb's key could be only a string" prefer non
possessive: jsonb keys must be a string
* line 374, 389: ditto 332
* line 513: is panic appropriate here?
* line 599: ditto
* line 730: odd phrasing in comment, also commenting on this function
is a little light
* line 807: slightly prefer 'with respect to'
* line 888: another PANIC: these maybe correct, seems odd to halt
server on corrupted datum though*
* line 1150: hm, is the jsonb internal hash structure documented?
Aside: why didn't we use standard hash table (performance maybe)?
* line 1805-6: poor phrasing. How about: "it will order and make
unique the hash keys. Otherwise we believe that pushed keys are
ordered and unique. (Don't like verbed 'unqiue').
* line 1860: "no break here: "

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 15:01:07
Message-ID: 52E7C633.3090704@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/28/2014 09:38 AM, Merlin Moncure wrote:
> On Mon, Jan 27, 2014 at 9:43 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>>>
>>> Here is the latest set of patches for nested hstore and jsonb.
>>>
>>> Because it's so large I've broken this into two patches and compressed
>>> them. The jsonb patch should work standalone. The nested hstore patch
>>> depends on it.
>>>
>>> All the jsonb functions now use the jsonb API - there is no more turning
>>> jsonb into text and reparsing it.
>>>
>>> At this stage I'm going to be starting cleanup on the jsonb code
>>> (indentation, error messages, comments etc.) as well get getting up some
>>> jsonb docs.
>>>
>>>
>>>
>>
>> Here is an update of the jsonb part of this. Charges:
>>
>> * there is now documentation for jsonb
>> * most uses of elog() in json_funcs.c are replaced by ereport().
>> * indentation fixes and other tidying.
>>
>> No changes in functionality.
> Don't have time to fire it up this morning, but a quick scan of the
> patch turned up a few minor things:
>
> * see a comment typo, line 290 'jsonn':
> * line 332: 'bogus input' -- is this up to error reporting standards?
> How about "value 'x' must be one of array, object, numeric, string,
> bool"?
> * line 357: "jsonb's key could be only a string" prefer non
> possessive: jsonb keys must be a string
> * line 374, 389: ditto 332
> * line 513: is panic appropriate here?
> * line 599: ditto
> * line 730: odd phrasing in comment, also commenting on this function
> is a little light
> * line 807: slightly prefer 'with respect to'
> * line 888: another PANIC: these maybe correct, seems odd to halt
> server on corrupted datum though*
> * line 1150: hm, is the jsonb internal hash structure documented?
> Aside: why didn't we use standard hash table (performance maybe)?
> * line 1805-6: poor phrasing. How about: "it will order and make
> unique the hash keys. Otherwise we believe that pushed keys are
> ordered and unique. (Don't like verbed 'unqiue').
> * line 1860: "no break here: "
>

Looks like this review is against jsonb-5, not jsonb-6.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 15:02:23
Message-ID: CAHyXU0yNSkKY+P4tLPn-LnRw8dJ0LCartD-6PWQp1=bp1dEbWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Looks like this review is against jsonb-5, not jsonb-6.

oh yep -- shoot, sorry for the noise.

merlin


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 15:50:04
Message-ID: 20140128155004.GN10723@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan wrote:

> <para>
> + There are two JSON data types: <type>json</type> and <type>jsonb</type>.
> + Both accept identical sets of values as input. The difference is primarily
> + a matter of efficiency. The <type>json</type> data type stores an exact
> + copy of the the input text, and the processing functions have to reparse
> + it to precess it, while the <type>jsonb</type> is stored in a decomposed
> + form that makes it slightly less efficient to input but very much faster
> + to process, since it never needs reparsing.
> + </para>

typo "precess"
duplicated word "of the the input"

> + </indexterm><indexterm>
> + <primary>jsonb_each</primary>
> + </indexterm><para><literal>json_each(json)</literal>
> + </para><para><literal>jsonb_each(jsonb)</literal>
> + </para></entry>

This SGML nesting is odd and hard to read. Please place opening tags in
separate lines (or at least not immediately following a closing tag). I
am not sure whether the mentions of jsonb_each vs. json_each there are
correct or typos. This also occurs in other places.

> Expands the object in <replaceable>from_json</replaceable> to a row whose columns match
> the record type defined by base. Conversion will be best
> effort; columns in base with no corresponding key in <replaceable>from_json</replaceable>
> - will be left null. If a column is specified more than once, the last value is used.
> + will be left null. When processing <type>json</type>, if a column is
> + specified more than once, the last value is used.

Maybe you also need to specify what happens with jsonb?

> diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
> new file mode 100644
> index 0000000..107ebf0
> --- /dev/null
> +++ b/src/backend/utils/adt/jsonb.c
> @@ -0,0 +1,544 @@
> +/*-------------------------------------------------------------------------
> + *
> + * jsonb.c
> + * I/O for jsonb type
> + *
> + * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group

2014. Why "Portions", if we don't attribute any portion to UCB?

> + * NOTE. JSONB type is designed to be binary compatible with hstore.
> + *
> + * src/backend/utils/adt/jsonb_support.c

Typo'ed name here.

> +#include "postgres.h"
+
> +#include "libpq/pqformat.h"
> +#include "utils/builtins.h"
> +#include "utils/json.h"
> +#include "utils/jsonapi.h"
> +#include "utils/jsonb.h"

Misplaced prototype?

> +static void recvJsonb(StringInfo buf, JsonbValue *v, uint32 level, uint32 header);

Not sure about the jsonb_1.out file. Is that only due to encoding
differences? What happens if you run it in a completely different
encoding than whatever you tested with? (I would assume Latin-9 and
UTF8) If it fails, then I think you'll end up ripping those tests out,
so probably the _1.out file will have no value at all.

I also wonder if it'd be better to have one large .sql file that
produces the same output in all platforms that tests most of the common
stuff, so that tests that changes output in different platforms can have
smaller alternative expected files.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 16:00:54
Message-ID: 52E7D436.8080002@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/28/2014 10:50 AM, Alvaro Herrera wrote:
>> + </indexterm><indexterm>
>> + <primary>jsonb_each</primary>
>> + </indexterm><para><literal>json_each(json)</literal>
>> + </para><para><literal>jsonb_each(jsonb)</literal>
>> + </para></entry>
> This SGML nesting is odd and hard to read. Please place opening tags in
> separate lines (or at least not immediately following a closing tag). I
> am not sure whether the mentions of jsonb_each vs. json_each there are
> correct or typos. This also occurs in other places.
>
>

As I understand it, an <entry> tag can only contain block-level elements
like <para> if there are no inline elements (including white space).

If that's not correct I'll change it, but that's what I read here:
<http://oreilly.com/openbook/docbook/book/entry.html>

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 16:09:34
Message-ID: 23256.1390925374@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 01/28/2014 10:50 AM, Alvaro Herrera wrote:
> + </indexterm><indexterm>
> + <primary>jsonb_each</primary>
> + </indexterm><para><literal>json_each(json)</literal>
> + </para><para><literal>jsonb_each(jsonb)</literal>
> + </para></entry>
>> This SGML nesting is odd and hard to read. Please place opening tags in
>> separate lines (or at least not immediately following a closing tag). I
>> am not sure whether the mentions of jsonb_each vs. json_each there are
>> correct or typos. This also occurs in other places.

> As I understand it, an <entry> tag can only contain block-level elements
> like <para> if there are no inline elements (including white space).

Practically every existing use of <indexterm> is freer than this in its
use of whitespace. It sounds to me like maybe you are trying to put the
<indexterm> inside something it shouldn't go inside of.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 16:22:23
Message-ID: 20140128162223.GO10723@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > On 01/28/2014 10:50 AM, Alvaro Herrera wrote:
> > + </indexterm><indexterm>
> > + <primary>jsonb_each</primary>
> > + </indexterm><para><literal>json_each(json)</literal>
> > + </para><para><literal>jsonb_each(jsonb)</literal>
> > + </para></entry>
> >> This SGML nesting is odd and hard to read. Please place opening tags in
> >> separate lines (or at least not immediately following a closing tag). I
> >> am not sure whether the mentions of jsonb_each vs. json_each there are
> >> correct or typos. This also occurs in other places.
>
> > As I understand it, an <entry> tag can only contain block-level elements
> > like <para> if there are no inline elements (including white space).
>
> Practically every existing use of <indexterm> is freer than this in its
> use of whitespace. It sounds to me like maybe you are trying to put the
> <indexterm> inside something it shouldn't go inside of.

FWIW I was just talking about formatting of the SGML source so that it
is easier to read.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 16:26:16
Message-ID: 52E7DA28.9000406@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/28/2014 11:09 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 01/28/2014 10:50 AM, Alvaro Herrera wrote:
>> + </indexterm><indexterm>
>> + <primary>jsonb_each</primary>
>> + </indexterm><para><literal>json_each(json)</literal>
>> + </para><para><literal>jsonb_each(jsonb)</literal>
>> + </para></entry>
>>> This SGML nesting is odd and hard to read. Please place opening tags in
>>> separate lines (or at least not immediately following a closing tag). I
>>> am not sure whether the mentions of jsonb_each vs. json_each there are
>>> correct or typos. This also occurs in other places.
>> As I understand it, an <entry> tag can only contain block-level elements
>> like <para> if there are no inline elements (including white space).
> Practically every existing use of <indexterm> is freer than this in its
> use of whitespace. It sounds to me like maybe you are trying to put the
> <indexterm> inside something it shouldn't go inside of.

The problem is not the indexterm element, it's the space that might
exist outside it. Are we using block level elements like <para> inside
entry elements anywhere else? If not, then your observation is not
relevant. If there are no block level elements then AIUI we can space
things out how we like inside the entry element.

If you can show me how else legally to get a line break inside an entry
element I'm very interested. I tried several things before I found this
way of making it work.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 16:27:33
Message-ID: 23739.1390926453@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> Practically every existing use of <indexterm> is freer than this in its
>> use of whitespace. It sounds to me like maybe you are trying to put the
>> <indexterm> inside something it shouldn't go inside of.

> FWIW I was just talking about formatting of the SGML source so that it
> is easier to read.

Yeah, me too. I'm just suggesting that maybe Andrew needs to move the
indexterm so that he can format it more readably.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 16:29:36
Message-ID: 23799.1390926576@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> The problem is not the indexterm element, it's the space that might
> exist outside it. Are we using block level elements like <para> inside
> entry elements anywhere else?

Probably not, and I wonder why you're trying to. Whole paras inside
a table entry (this is a table no?) don't sound like they are going
to lead to nice-looking results.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 16:33:05
Message-ID: 52E7DBC1.10007@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/28/2014 11:27 AM, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> Tom Lane wrote:
>>> Practically every existing use of <indexterm> is freer than this in its
>>> use of whitespace. It sounds to me like maybe you are trying to put the
>>> <indexterm> inside something it shouldn't go inside of.
>> FWIW I was just talking about formatting of the SGML source so that it
>> is easier to read.
> Yeah, me too. I'm just suggesting that maybe Andrew needs to move the
> indexterm so that he can format it more readably.
>
>

Hmm. Maybe I could put them inside the para elements. So we'd have:

<entry><para>
<indexterm>
</indexterm>
para text
</para><para>
<indexterm>
</indexterm>
para text
</para></entry>

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 16:46:29
Message-ID: 52E7DEE5.5020207@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/28/2014 11:29 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> The problem is not the indexterm element, it's the space that might
>> exist outside it. Are we using block level elements like <para> inside
>> entry elements anywhere else?
> Probably not, and I wonder why you're trying to. Whole paras inside
> a table entry (this is a table no?) don't sound like they are going
> to lead to nice-looking results.

See <http://developer.postgresql.org/~adunstan/functions-json.html>

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 17:58:20
Message-ID: CAHyXU0yeoVZ8yMOVAn8y-LbqQ4syrkARYX1y9_vgMAZzT1mf6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 28, 2014 at 10:46 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 01/28/2014 11:29 AM, Tom Lane wrote:
>>
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>>
>>> The problem is not the indexterm element, it's the space that might
>>> exist outside it. Are we using block level elements like <para> inside
>>> entry elements anywhere else?
>>
>> Probably not, and I wonder why you're trying to. Whole paras inside
>> a table entry (this is a table no?) don't sound like they are going
>> to lead to nice-looking results.
>
> See <http://developer.postgresql.org/~adunstan/functions-json.html>

yeah. note: I think the json documentation needs *major* overhaul. too
much is going in inside the function listings where there really
should be a big breakout discussing the "big picture" of json/jsonb
with examples of various use cases. I want to give it a shot but
unfortunately can not commit to do that by the end of the 'fest.

merlin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 18:09:13
Message-ID: 52E7F249.8030506@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/28/2014 09:58 AM, Merlin Moncure wrote:
> yeah. note: I think the json documentation needs *major* overhaul. too
> much is going in inside the function listings where there really
> should be a big breakout discussing the "big picture" of json/jsonb
> with examples of various use cases. I want to give it a shot but
> unfortunately can not commit to do that by the end of the 'fest.

FWIW, I've promised Andrew that I'll overhaul this by the end of beta.
Given that we have all of beta for doc refinements.

In addition to this, the JSON vs JSONB datatype page really needs
expansion and clarification.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 18:29:33
Message-ID: CAHyXU0ymmLMVPOUkyTzPo-VLE1uyxTi=MtUraEfFF6ce5YfxPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jan 28, 2014 at 12:09 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 01/28/2014 09:58 AM, Merlin Moncure wrote:
>> yeah. note: I think the json documentation needs *major* overhaul. too
>> much is going in inside the function listings where there really
>> should be a big breakout discussing the "big picture" of json/jsonb
>> with examples of various use cases. I want to give it a shot but
>> unfortunately can not commit to do that by the end of the 'fest.
>
> FWIW, I've promised Andrew that I'll overhaul this by the end of beta.
> Given that we have all of beta for doc refinements.
>
> In addition to this, the JSON vs JSONB datatype page really needs
> expansion and clarification.

right: exactly. I'd be happy to help (such as I can) ...I wanted to
see if jsonb to make it in on this 'fest (doc issues notwithstanding);
it hasn't been formally reviewed yet AFAICT. So my thinking here is
to get docs to minimum acceptable standards in the short term and
focus on the structural code issues for the 'fest (if jsonb slips then
it's moot obviously).

merlin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 18:37:30
Message-ID: 52E7F8EA.1020001@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/28/2014 10:29 AM, Merlin Moncure wrote:
>> In addition to this, the JSON vs JSONB datatype page really needs
>> expansion and clarification.
>
> right: exactly. I'd be happy to help (such as I can) ...I wanted to
> see if jsonb to make it in on this 'fest (doc issues notwithstanding);
> it hasn't been formally reviewed yet AFAICT. So my thinking here is
> to get docs to minimum acceptable standards in the short term and
> focus on the structural code issues for the 'fest (if jsonb slips then
> it's moot obviously).

Well, having reviewed the docs before Andrew sent them in, I felt they
already *were* "minimum acceptable". Certainly they're as complete as
the original JSON docs were.

Or is this just about whitespace and line breaks?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 18:56:52
Message-ID: 20140128185652.GW10723@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus escribió:

> Or is this just about whitespace and line breaks?

If the docs are going to be rehauled, please ignore my whitespace
comments.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-28 19:09:18
Message-ID: 52E8005E.2000400@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/28/2014 10:56 AM, Alvaro Herrera wrote:
> Josh Berkus escribió:
>
>> Or is this just about whitespace and line breaks?
>
> If the docs are going to be rehauled, please ignore my whitespace
> comments.

I'm sure you'll find plenty to criticize in my version. ;-)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-29 18:03:16
Message-ID: 52E94264.9070206@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/27/2014 10:43 PM, Andrew Dunstan wrote:
>
> On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>>
>> Here is the latest set of patches for nested hstore and jsonb.
>>
>> Because it's so large I've broken this into two patches and
>> compressed them. The jsonb patch should work standalone. The nested
>> hstore patch depends on it.
>>
>> All the jsonb functions now use the jsonb API - there is no more
>> turning jsonb into text and reparsing it.
>>
>> At this stage I'm going to be starting cleanup on the jsonb code
>> (indentation, error messages, comments etc.) as well get getting up
>> some jsonb docs.
>>
>>
>>
>
>
> Here is an update of the jsonb part of this. Charges:
>
> * there is now documentation for jsonb
> * most uses of elog() in json_funcs.c are replaced by ereport().
> * indentation fixes and other tidying.
>
> No changes in functionality.
>

Further update of jsonb portion.

Only change in functionality is the addition of casts between jsonb and
json.

The other changes are the merge with the new json functions code, and
rearrangement of the docs changes to make them less ugly. Essentially I
moved the indexterm tags right out of the table as is done in some other
parts pf the docs. That makes the entry tags much clearer to read.

cheers

andrew

Attachment Content-Type Size
jsonb-7.patch text/x-patch 188.2 KB

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-29 20:46:42
Message-ID: CAHyXU0z=m--i_vPUe13t5Pzr5W5K6545wOQWGFvVxKx9aLVeiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 29, 2014 at 12:03 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Only change in functionality is the addition of casts between jsonb and
> json.
>
> The other changes are the merge with the new json functions code, and
> rearrangement of the docs changes to make them less ugly. Essentially I
> moved the indexterm tags right out of the table as is done in some other
> parts pf the docs. That makes the entry tags much clearer to read.

I think the opening paragraphs contrasting json/jsonb be needs
refinement. json is going to be slightly faster than jsonb for input
*and* output. For example, in one application I store fairly large
json objects containing pre-compiled static polygon data that is
simply flipped up to google maps. This case will likely be pessimal
for jsonb. For the next paragaph, I'd like to expand it a bit on
'specialized needs' and boil it down to specific uses cases.
Basically, json will likely be more compact in most cases and slightly
faster for input/output; jsonb would be preferred in any context
where processing, or searching or extensive server side parsing is
employed.

If you agree, I'd be happy to do that...

merlin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: jsonb and nested hstore
Date: 2014-01-29 20:50:57
Message-ID: 52E969B1.6090905@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/29/2014 12:46 PM, Merlin Moncure wrote:
> I think the opening paragraphs contrasting json/jsonb be needs
> refinement. json is going to be slightly faster than jsonb for input
> *and* output. For example, in one application I store fairly large
> json objects containing pre-compiled static polygon data that is
> simply flipped up to google maps. This case will likely be pessimal
> for jsonb. For the next paragaph, I'd like to expand it a bit on
> 'specialized needs' and boil it down to specific uses cases.
> Basically, json will likely be more compact in most cases and slightly
> faster for input/output; jsonb would be preferred in any context
> where processing, or searching or extensive server side parsing is
> employed.
>
> If you agree, I'd be happy to do that...

Please take a stab at it, I'll be happy to revise it.

I was working on doing a two-column table comparison chart; I still
think that's the best way to go.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-29 21:56:38
Message-ID: 52E97916.3010703@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/29/2014 01:03 PM, Andrew Dunstan wrote:
>
> On 01/27/2014 10:43 PM, Andrew Dunstan wrote:
>>
>> On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>>>
>>> Here is the latest set of patches for nested hstore and jsonb.
>>>
>>> Because it's so large I've broken this into two patches and
>>> compressed them. The jsonb patch should work standalone. The nested
>>> hstore patch depends on it.
>>>
>>> All the jsonb functions now use the jsonb API - there is no more
>>> turning jsonb into text and reparsing it.
>>>
>>> At this stage I'm going to be starting cleanup on the jsonb code
>>> (indentation, error messages, comments etc.) as well get getting up
>>> some jsonb docs.
>>>
>>>
>>>
>>
>>
>> Here is an update of the jsonb part of this. Charges:
>>
>> * there is now documentation for jsonb
>> * most uses of elog() in json_funcs.c are replaced by ereport().
>> * indentation fixes and other tidying.
>>
>> No changes in functionality.
>>
>
>
> Further update of jsonb portion.
>
> Only change in functionality is the addition of casts between jsonb
> and json.
>
> The other changes are the merge with the new json functions code, and
> rearrangement of the docs changes to make them less ugly. Essentially
> I moved the indexterm tags right out of the table as is done in some
> other parts pf the docs. That makes the entry tags much clearer to read.
>
>
>

Updated to apply cleanly after recent commits.

cheers

andrew

Attachment Content-Type Size
jsonb-8.patch text/x-patch 188.0 KB

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-29 22:37:48
Message-ID: CAHyXU0wqadCJk7MMkeARuuY05VrD=AXDn6wDceMtuWo5p4CUiA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jan 29, 2014 at 3:56 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 01/29/2014 01:03 PM, Andrew Dunstan wrote:
>>
>>
>> On 01/27/2014 10:43 PM, Andrew Dunstan wrote:
>>>
>>>
>>> On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>>>>
>>>>
>>>> Here is the latest set of patches for nested hstore and jsonb.
>>>>
>>>> Because it's so large I've broken this into two patches and compressed
>>>> them. The jsonb patch should work standalone. The nested hstore patch
>>>> depends on it.
>>>>
>>>> All the jsonb functions now use the jsonb API - there is no more turning
>>>> jsonb into text and reparsing it.
>>>>
>>>> At this stage I'm going to be starting cleanup on the jsonb code
>>>> (indentation, error messages, comments etc.) as well get getting up some
>>>> jsonb docs.
>>>>
>>>>
>>>>
>>>
>>>
>>> Here is an update of the jsonb part of this. Charges:
>>>
>>> * there is now documentation for jsonb
>>> * most uses of elog() in json_funcs.c are replaced by ereport().
>>> * indentation fixes and other tidying.
>>>
>>> No changes in functionality.
>>>
>>
>>
>> Further update of jsonb portion.
>>
>> Only change in functionality is the addition of casts between jsonb and
>> json.
>>
>> The other changes are the merge with the new json functions code, and
>> rearrangement of the docs changes to make them less ugly. Essentially I
>> moved the indexterm tags right out of the table as is done in some other
>> parts pf the docs. That makes the entry tags much clearer to read.
>
> Updated to apply cleanly after recent commits.

ok, great. This is really fabulous. So far most everything feels
natural and good.

I see something odd in terms of the jsonb use case coverage. One of
the major headaches with json deserialization presently is that
there's no easy way to easily move a complex (record- or array-
containing) json structure into a row object. For example,

create table bar(a int, b int[]);
postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::jsonb, false);
ERROR: cannot populate with a nested object unless use_json_as_text is true

If find the use_json_as_text argument here to be pretty useless
(unlike in the json_build to_record variants where it least provides
some hope for an escape hatch) for handling this since it will just
continue to fail:

postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::jsonb, true);
ERROR: missing "]" in array dimensions

OTOH, the nested hstore handles this no questions asked:

postgres=# select * from populate_record(null::bar, '"a"=>1,
"b"=>{1,2}'::hstore);
a | b
---+-------
1 | {1,2}

So, if you need to convert a complex json to a row type, the only
effective way to do that is like this:
postgres=# select* from populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::json::hstore);
a | b
---+-------
1 | {1,2}

Not a big deal really. But it makes me wonder (now that we have the
internal capability of properly mapping to a record) why *both* the
json/jsonb populate record variants shouldn't point to what the nested
hstore behavior is when the 'as_text' flag is false. That would
demolish the error and remove the dependency on hstore in order to do
effective rowtype mapping. In an ideal world the json_build
'to_record' variants would behave similarly I think although there's
no existing hstore analog so I'm assuming it's a non-trival amount of
work.

Now, if we're agreed on that, I then also wonder if the 'as_text'
argument needs to exist at all for the populate functions except for
backwards compatibility on the json side (not jsonb). For non-complex
structures it does best effort casting anyways so the flag is moot.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-29 22:55:18
Message-ID: 52E986D6.6070603@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/29/2014 05:37 PM, Merlin Moncure wrote:
> On Wed, Jan 29, 2014 at 3:56 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> On 01/29/2014 01:03 PM, Andrew Dunstan wrote:
>>>
>>> On 01/27/2014 10:43 PM, Andrew Dunstan wrote:
>>>>
>>>> On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>>>>>
>>>>> Here is the latest set of patches for nested hstore and jsonb.
>>>>>
>>>>> Because it's so large I've broken this into two patches and compressed
>>>>> them. The jsonb patch should work standalone. The nested hstore patch
>>>>> depends on it.
>>>>>
>>>>> All the jsonb functions now use the jsonb API - there is no more turning
>>>>> jsonb into text and reparsing it.
>>>>>
>>>>> At this stage I'm going to be starting cleanup on the jsonb code
>>>>> (indentation, error messages, comments etc.) as well get getting up some
>>>>> jsonb docs.
>>>>>
>>>>>
>>>>>
>>>>
>>>> Here is an update of the jsonb part of this. Charges:
>>>>
>>>> * there is now documentation for jsonb
>>>> * most uses of elog() in json_funcs.c are replaced by ereport().
>>>> * indentation fixes and other tidying.
>>>>
>>>> No changes in functionality.
>>>>
>>>
>>> Further update of jsonb portion.
>>>
>>> Only change in functionality is the addition of casts between jsonb and
>>> json.
>>>
>>> The other changes are the merge with the new json functions code, and
>>> rearrangement of the docs changes to make them less ugly. Essentially I
>>> moved the indexterm tags right out of the table as is done in some other
>>> parts pf the docs. That makes the entry tags much clearer to read.
>> Updated to apply cleanly after recent commits.
> ok, great. This is really fabulous. So far most everything feels
> natural and good.
>
> I see something odd in terms of the jsonb use case coverage. One of
> the major headaches with json deserialization presently is that
> there's no easy way to easily move a complex (record- or array-
> containing) json structure into a row object. For example,
>
> create table bar(a int, b int[]);
> postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
> [1,2]}'::jsonb, false);
> ERROR: cannot populate with a nested object unless use_json_as_text is true
>
> If find the use_json_as_text argument here to be pretty useless
> (unlike in the json_build to_record variants where it least provides
> some hope for an escape hatch) for handling this since it will just
> continue to fail:
>
> postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
> [1,2]}'::jsonb, true);
> ERROR: missing "]" in array dimensions
>
> OTOH, the nested hstore handles this no questions asked:
>
> postgres=# select * from populate_record(null::bar, '"a"=>1,
> "b"=>{1,2}'::hstore);
> a | b
> ---+-------
> 1 | {1,2}
>
> So, if you need to convert a complex json to a row type, the only
> effective way to do that is like this:
> postgres=# select* from populate_record(null::bar, '{"a": 1, "b":
> [1,2]}'::json::hstore);
> a | b
> ---+-------
> 1 | {1,2}
>
> Not a big deal really. But it makes me wonder (now that we have the
> internal capability of properly mapping to a record) why *both* the
> json/jsonb populate record variants shouldn't point to what the nested
> hstore behavior is when the 'as_text' flag is false. That would
> demolish the error and remove the dependency on hstore in order to do
> effective rowtype mapping. In an ideal world the json_build
> 'to_record' variants would behave similarly I think although there's
> no existing hstore analog so I'm assuming it's a non-trival amount of
> work.
>
> Now, if we're agreed on that, I then also wonder if the 'as_text'
> argument needs to exist at all for the populate functions except for
> backwards compatibility on the json side (not jsonb). For non-complex
> structures it does best effort casting anyways so the flag is moot.
>

Well, I could certainly look at making the populate_record{set} and
to_record{set} logic handle types that are arrays or composites inside
the record. It might not be terribly hard to do - not sure.

cheers

andrew


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: jsonb and nested hstore
Date: 2014-01-29 23:20:27
Message-ID: 52E98CBB.1000404@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/29/2014 02:37 PM, Merlin Moncure wrote:
> create table bar(a int, b int[]);
> postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
> [1,2]}'::jsonb, false);
> ERROR: cannot populate with a nested object unless use_json_as_text is true

Hmmm. What about just making any impossibly complex objects type JSON?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-30 15:50:15
Message-ID: 52EA74B7.4070106@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> ok, great. This is really fabulous. So far most everything feels
>> natural and good.
>>
>> I see something odd in terms of the jsonb use case coverage. One of
>> the major headaches with json deserialization presently is that
>> there's no easy way to easily move a complex (record- or array-
>> containing) json structure into a row object. For example,
>>
>> create table bar(a int, b int[]);
>> postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
>> [1,2]}'::jsonb, false);
>> ERROR: cannot populate with a nested object unless use_json_as_text
>> is true
>>
>> If find the use_json_as_text argument here to be pretty useless
>> (unlike in the json_build to_record variants where it least provides
>> some hope for an escape hatch) for handling this since it will just
>> continue to fail:
>>
>> postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
>> [1,2]}'::jsonb, true);
>> ERROR: missing "]" in array dimensions
>>
>> OTOH, the nested hstore handles this no questions asked:
>>
>> postgres=# select * from populate_record(null::bar, '"a"=>1,
>> "b"=>{1,2}'::hstore);
>> a | b
>> ---+-------
>> 1 | {1,2}
>>
>> So, if you need to convert a complex json to a row type, the only
>> effective way to do that is like this:
>> postgres=# select* from populate_record(null::bar, '{"a": 1, "b":
>> [1,2]}'::json::hstore);
>> a | b
>> ---+-------
>> 1 | {1,2}
>>
>> Not a big deal really. But it makes me wonder (now that we have the
>> internal capability of properly mapping to a record) why *both* the
>> json/jsonb populate record variants shouldn't point to what the nested
>> hstore behavior is when the 'as_text' flag is false. That would
>> demolish the error and remove the dependency on hstore in order to do
>> effective rowtype mapping. In an ideal world the json_build
>> 'to_record' variants would behave similarly I think although there's
>> no existing hstore analog so I'm assuming it's a non-trival amount of
>> work.
>>
>> Now, if we're agreed on that, I then also wonder if the 'as_text'
>> argument needs to exist at all for the populate functions except for
>> backwards compatibility on the json side (not jsonb). For non-complex
>> structures it does best effort casting anyways so the flag is moot.
>>
>
> Well, I could certainly look at making the populate_record{set} and
> to_record{set} logic handle types that are arrays or composites inside
> the record. It might not be terribly hard to do - not sure.
>
>

A quick analysis suggests that this is fixable with fairly minimal
disturbance in the jsonb case. In the json case it would probably
involve reparsing the inner json. That's probably doable, because the
routines are all reentrant, but not likely to be terribly efficient. It
will also be a deal more work.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-30 17:34:40
Message-ID: CAHyXU0zWzf4FYagQ76KNAPVLkLZnuLXGkLeV-p2wPSV1q4Su_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 30, 2014 at 9:50 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> Now, if we're agreed on that, I then also wonder if the 'as_text'
>>> argument needs to exist at all for the populate functions except for
>>> backwards compatibility on the json side (not jsonb). For non-complex
>>> structures it does best effort casting anyways so the flag is moot.
>>>
>>
>> Well, I could certainly look at making the populate_record{set} and
>> to_record{set} logic handle types that are arrays or composites inside the
>> record. It might not be terribly hard to do - not sure.
>
> A quick analysis suggests that this is fixable with fairly minimal
> disturbance in the jsonb case. In the json case it would probably involve
> reparsing the inner json. That's probably doable, because the routines are
> all reentrant, but not likely to be terribly efficient. It will also be a
> deal more work.

Right. Also the text json functions are already in the wild anyways
-- that's not in the scope of this patch so if they need to be fixed
that could be done later.

ISTM then the right course of action is to point jsonb 'populate'
variants at hstore implementation, not the text json one and remove
the 'as text' argument. Being able to ditch that argument is the main
reason why I think this should be handled now (not forcing hstore
dependency to handle complex json is gravy).

People handling json as text would then invoke a ::jsonb cast trading
off performance for flexibility which is perfectly fine. If you
agree, perhaps we can HINT the error in certain places that return
"ERROR: cannot call json_populate_record on a nested object" that the
jsonb variant can be used as a workaround.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-30 17:45:26
Message-ID: 52EA8FB6.2000607@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/30/2014 12:34 PM, Merlin Moncure wrote:
> On Thu, Jan 30, 2014 at 9:50 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>>> Now, if we're agreed on that, I then also wonder if the 'as_text'
>>>> argument needs to exist at all for the populate functions except for
>>>> backwards compatibility on the json side (not jsonb). For non-complex
>>>> structures it does best effort casting anyways so the flag is moot.
>>>>
>>> Well, I could certainly look at making the populate_record{set} and
>>> to_record{set} logic handle types that are arrays or composites inside the
>>> record. It might not be terribly hard to do - not sure.
>> A quick analysis suggests that this is fixable with fairly minimal
>> disturbance in the jsonb case. In the json case it would probably involve
>> reparsing the inner json. That's probably doable, because the routines are
>> all reentrant, but not likely to be terribly efficient. It will also be a
>> deal more work.
> Right. Also the text json functions are already in the wild anyways
> -- that's not in the scope of this patch so if they need to be fixed
> that could be done later.
>
> ISTM then the right course of action is to point jsonb 'populate'
> variants at hstore implementation, not the text json one and remove
> the 'as text' argument. Being able to ditch that argument is the main
> reason why I think this should be handled now (not forcing hstore
> dependency to handle complex json is gravy).

We can't reference any hstore code in jsonb. There is no guarantee that
hstore will even be loaded.

We'd have to move that code from hstore to jsonb_support.c and then make
hstore refer to it.

cheers

andrew


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-30 18:03:29
Message-ID: 52EA93F1.4050109@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/30/2014 06:45 PM, Andrew Dunstan wrote:
>
> On 01/30/2014 12:34 PM, Merlin Moncure wrote:
>> On Thu, Jan 30, 2014 at 9:50 AM, Andrew Dunstan <andrew(at)dunslane(dot)net>
>> wrote:
>>>>> Now, if we're agreed on that, I then also wonder if the 'as_text'
>>>>> argument needs to exist at all for the populate functions except for
>>>>> backwards compatibility on the json side (not jsonb). For
>>>>> non-complex
>>>>> structures it does best effort casting anyways so the flag is moot.
>>>>>
>>>> Well, I could certainly look at making the populate_record{set} and
>>>> to_record{set} logic handle types that are arrays or composites
>>>> inside the
>>>> record. It might not be terribly hard to do - not sure.
>>> A quick analysis suggests that this is fixable with fairly minimal
>>> disturbance in the jsonb case.
As row_to_json() works with arbitrarily complex nested types (for
example row having a field
of type array of another (table)type containing arrays of third type) it
would be really nice if
you can get the result back into that row without too much hassle.

and it should be ok to treat json as "source type" and require it to be
translated to jsonb
for more complex operations
>>> In the json case it would probably involve
>>> reparsing the inner json. That's probably doable, because the
>>> routines are
>>> all reentrant, but not likely to be terribly efficient. It will also
>>> be a
>>> deal more work.
>> Right. Also the text json functions are already in the wild anyways
>> -- that's not in the scope of this patch so if they need to be fixed
>> that could be done later.
>>
>> ISTM then the right course of action is to point jsonb 'populate'
>> variants at hstore implementation, not the text json one and remove
>> the 'as text' argument. Being able to ditch that argument is the main
>> reason why I think this should be handled now (not forcing hstore
>> dependency to handle complex json is gravy).
>
>
> We can't reference any hstore code in jsonb. There is no guarantee
> that hstore will even be loaded.
>
> We'd have to move that code from hstore to jsonb_support.c and then
> make hstore refer to it.
Or just copy it and leave hstore alone - the code duplication is not
terribly huge
here and hstore might still want to develop independently.

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-30 18:14:21
Message-ID: 52EA967D.4000406@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/30/2014 01:03 PM, Hannu Krosing wrote:
> On 01/30/2014 06:45 PM, Andrew Dunstan wrote:
>> On 01/30/2014 12:34 PM, Merlin Moncure wrote:
>>> On Thu, Jan 30, 2014 at 9:50 AM, Andrew Dunstan <andrew(at)dunslane(dot)net>
>>> wrote:
>>>>>> Now, if we're agreed on that, I then also wonder if the 'as_text'
>>>>>> argument needs to exist at all for the populate functions except for
>>>>>> backwards compatibility on the json side (not jsonb). For
>>>>>> non-complex
>>>>>> structures it does best effort casting anyways so the flag is moot.
>>>>>>
>>>>> Well, I could certainly look at making the populate_record{set} and
>>>>> to_record{set} logic handle types that are arrays or composites
>>>>> inside the
>>>>> record. It might not be terribly hard to do - not sure.

>>>> A quick analysis suggests that this is fixable with fairly minimal
>>>> disturbance in the jsonb case.
> As row_to_json() works with arbitrarily complex nested types (for
> example row having a field
> of type array of another (table)type containing arrays of third type) it
> would be really nice if
> you can get the result back into that row without too much hassle.
>
> and it should be ok to treat json as "source type" and require it to be
> translated to jsonb
> for more complex operations

Might be possible.

>>>> In the json case it would probably involve
>>>> reparsing the inner json. That's probably doable, because the
>>>> routines are
>>>> all reentrant, but not likely to be terribly efficient. It will also
>>>> be a
>>>> deal more work.
>>> Right. Also the text json functions are already in the wild anyways
>>> -- that's not in the scope of this patch so if they need to be fixed
>>> that could be done later.
>>>
>>> ISTM then the right course of action is to point jsonb 'populate'
>>> variants at hstore implementation, not the text json one and remove
>>> the 'as text' argument. Being able to ditch that argument is the main
>>> reason why I think this should be handled now (not forcing hstore
>>> dependency to handle complex json is gravy).
>>
>> We can't reference any hstore code in jsonb. There is no guarantee
>> that hstore will even be loaded.
>>
>> We'd have to move that code from hstore to jsonb_support.c and then
>> make hstore refer to it.
> Or just copy it and leave hstore alone - the code duplication is not
> terribly huge
> here and hstore might still want to develop independently.
>

We have gone to great deal of trouble to make jsonb and nested hstore
more or less incarnations of the same thing. The new hstore relies
heavily on the new jsonb. So what you're suggesting is the opposite of
what's been developed these last months.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-30 18:50:25
Message-ID: 4062.1391107825@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 01/30/2014 01:03 PM, Hannu Krosing wrote:
>> On 01/30/2014 06:45 PM, Andrew Dunstan wrote:
>>> We'd have to move that code from hstore to jsonb_support.c and then
>>> make hstore refer to it.

>> Or just copy it and leave hstore alone - the code duplication is not
>> terribly huge here and hstore might still want to develop independently.

> We have gone to great deal of trouble to make jsonb and nested hstore
> more or less incarnations of the same thing. The new hstore relies
> heavily on the new jsonb. So what you're suggesting is the opposite of
> what's been developed these last months.

If so, why would you be resistant to pushing more code out of hstore
and into jsonb?

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-30 18:54:32
Message-ID: 52EA9FE8.7000604@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/30/2014 01:50 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 01/30/2014 01:03 PM, Hannu Krosing wrote:
>>> On 01/30/2014 06:45 PM, Andrew Dunstan wrote:
>>>> We'd have to move that code from hstore to jsonb_support.c and then
>>>> make hstore refer to it.
>>> Or just copy it and leave hstore alone - the code duplication is not
>>> terribly huge here and hstore might still want to develop independently.
>> We have gone to great deal of trouble to make jsonb and nested hstore
>> more or less incarnations of the same thing. The new hstore relies
>> heavily on the new jsonb. So what you're suggesting is the opposite of
>> what's been developed these last months.
> If so, why would you be resistant to pushing more code out of hstore
> and into jsonb?
>

I'm not. Above I suggested exactly that. I was simply opposed to Hannu's
suggestion that instead of making hstore refer to the adopted code we
maintain two copies of code that does essentially the same thing.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-30 19:07:42
Message-ID: 52EAA2FE.1020800@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/29/2014 04:56 PM, Andrew Dunstan wrote:
>
> On 01/29/2014 01:03 PM, Andrew Dunstan wrote:
>>
>> On 01/27/2014 10:43 PM, Andrew Dunstan wrote:
>>>
>>> On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>>>>
>>>> Here is the latest set of patches for nested hstore and jsonb.
>>>>
>>>> Because it's so large I've broken this into two patches and
>>>> compressed them. The jsonb patch should work standalone. The nested
>>>> hstore patch depends on it.
>>>>
>>>> All the jsonb functions now use the jsonb API - there is no more
>>>> turning jsonb into text and reparsing it.
>>>>
>>>> At this stage I'm going to be starting cleanup on the jsonb code
>>>> (indentation, error messages, comments etc.) as well get getting up
>>>> some jsonb docs.
>>>>
>>>>
>>>>
>>>
>>>
>>> Here is an update of the jsonb part of this. Charges:
>>>
>>> * there is now documentation for jsonb
>>> * most uses of elog() in json_funcs.c are replaced by ereport().
>>> * indentation fixes and other tidying.
>>>
>>> No changes in functionality.
>>>
>>
>>
>> Further update of jsonb portion.
>>
>> Only change in functionality is the addition of casts between jsonb
>> and json.
>>
>> The other changes are the merge with the new json functions code, and
>> rearrangement of the docs changes to make them less ugly. Essentially
>> I moved the indexterm tags right out of the table as is done in some
>> other parts pf the docs. That makes the entry tags much clearer to read.
>>
>>
>>
>
>
> Updated to apply cleanly after recent commits.
>
>

Updated patches for both pieces. Included is some tidying done by
Teodor, and fixes for remaining whitespace issues. This now passes "git
diff --check master" cleanly for me.

cheers

andrew

Attachment Content-Type Size
jsonb-9.patch.gz application/x-gzip 30.9 KB
nested-hstore-9.patch.gz application/x-gzip 66.0 KB

From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore - small docpatch
Date: 2014-01-30 22:15:00
Message-ID: 9e2ce27a4d95a69148a9aeb8d6636474.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, January 30, 2014 20:07, Andrew Dunstan wrote:
>
> Updated patches for both pieces. Included is some tidying done by
>
> [ nested-hstore-9.patch.gz ]

Here is a small doc-patch to Table F-6. hstore Operators

It corrects its booleans in the 'Result' column ( t and f instead of true and false ).

Thanks,

Erik Rijkers


From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore - small docpatch
Date: 2014-01-30 22:17:30
Message-ID: 098155e40326f4bd4537fb978201e524.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, January 30, 2014 23:15, Erik Rijkers wrote:
> On Thu, January 30, 2014 20:07, Andrew Dunstan wrote:
>>
>> Updated patches for both pieces. Included is some tidying done by
>>
>> [ nested-hstore-9.patch.gz ]
>
> Here is a small doc-patch to Table F-6. hstore Operators
>
> It corrects its booleans in the 'Result' column ( t and f instead of true and false ).

I mean, here it is...

Attachment Content-Type Size
hstore-20140130.sgml.diff text/x-patch 3.4 KB

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 00:21:12
Message-ID: CAHyXU0zJS8M4uwDZCWJ-8CCjKqCwq=b28B=d0DGLV9w3fsP3DA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 30, 2014 at 1:07 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 01/29/2014 04:56 PM, Andrew Dunstan wrote:
>>
>>
>> On 01/29/2014 01:03 PM, Andrew Dunstan wrote:
>>>
>>>
>>> On 01/27/2014 10:43 PM, Andrew Dunstan wrote:
>>>>
>>>>
>>>> On 01/26/2014 05:42 PM, Andrew Dunstan wrote:
>>>>>
>>>>>
>>>>> Here is the latest set of patches for nested hstore and jsonb.
>>>>>
>>>>> Because it's so large I've broken this into two patches and compressed
>>>>> them. The jsonb patch should work standalone. The nested hstore patch
>>>>> depends on it.
>>>>>
>>>>> All the jsonb functions now use the jsonb API - there is no more
>>>>> turning jsonb into text and reparsing it.
>>>>>
>>>>> At this stage I'm going to be starting cleanup on the jsonb code
>>>>> (indentation, error messages, comments etc.) as well get getting up some
>>>>> jsonb docs.
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> Here is an update of the jsonb part of this. Charges:
>>>>
>>>> * there is now documentation for jsonb
>>>> * most uses of elog() in json_funcs.c are replaced by ereport().
>>>> * indentation fixes and other tidying.
>>>>
>>>> No changes in functionality.
>>>>
>>>
>>>
>>> Further update of jsonb portion.
>>>
>>> Only change in functionality is the addition of casts between jsonb and
>>> json.
>>>
>>> The other changes are the merge with the new json functions code, and
>>> rearrangement of the docs changes to make them less ugly. Essentially I
>>> moved the indexterm tags right out of the table as is done in some other
>>> parts pf the docs. That makes the entry tags much clearer to read.
>>>
>>>
>>>
>>
>>
>> Updated to apply cleanly after recent commits.
>>
>>
>
> Updated patches for both pieces. Included is some tidying done by Teodor,
> and fixes for remaining whitespace issues. This now passes "git diff --check
> master" cleanly for me.

Something seems off:

postgres=# create type z as (a int, b int[]);
CREATE TYPE
postgres=# create type y as (a int, b z[]);
CREATE TYPE
postgres=# create type x as (a int, b y[]);
CREATE TYPE

-- test a complicated construction
postgres=# select row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x;
row
-------------------------------------------------------------------------------------
(1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")

postgres=# select hstore(row(1, array[row(1, array[row(1,
array[1,2])::z])::y])::x);
hstore
----------------------------------------------------------------------------------------------
"a"=>1, "b"=>"{\"(1,\\\"{\\\"\\\"(1,\\\\\\\\\\\"\\\"{1,2}\\\\\\\\\\\"\\\")\\\"\\\"}\\\")\"}"

here, the output escaping has leaked into the internal array
structures. istm we should have a json expressing the internal
structure. It does (weirdly) map back however:

postgres=# select populate_record(null::x, hstore(row(1, array[row(1,
array[row(1, array[1,2])::z])::y])::x));
populate_record
-------------------------------------------------------------------------------------
(1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")

OTOH, if I go via json route:

postgres=# select row_to_json(row(1, array[row(1, array[row(1,
array[1,2])::z])::y])::x);
row_to_json
-----------------------------------------------
{"a":1,"b":[{"a":1,"b":[{"a":1,"b":[1,2]}]}]}

so far, so good. let's push to hstore:
postgres=# select row_to_json(row(1, array[row(1, array[row(1,
array[1,2])::z])::y])::x)::jsonb::hstore;
row_to_json
-------------------------------------------------------
"a"=>1, "b"=>[{"a"=>1, "b"=>[{"a"=>1, "b"=>[1, 2]}]}]

this ISTM is the 'right' behavior. but what if we bring it back to
record object?

postgres=# select populate_record(null::x, row_to_json(row(1,
array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore);
ERROR: malformed array literal: "{{"a"=>1, "b"=>{{"a"=>1, "b"=>{1, 2}}}}}"

yikes. The situation as I read it is that (notwithstanding my comments
upthread) there is no clean way to slide rowtypes to/from hstore and
jsonb while preserving structure. IMO, the above query should work
and the populate function record above should return the internally
structured row object, not the text escaped version.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 00:52:31
Message-ID: 52EAF3CF.3050105@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/30/2014 07:21 PM, Merlin Moncure wrote:

> Something seems off:
>
> postgres=# create type z as (a int, b int[]);
> CREATE TYPE
> postgres=# create type y as (a int, b z[]);
> CREATE TYPE
> postgres=# create type x as (a int, b y[]);
> CREATE TYPE
>
> -- test a complicated construction
> postgres=# select row(1, array[row(1, array[row(1, array[1,2])::z])::y])::x;
> row
> -------------------------------------------------------------------------------------
> (1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")
>
> postgres=# select hstore(row(1, array[row(1, array[row(1,
> array[1,2])::z])::y])::x);
> hstore
> ----------------------------------------------------------------------------------------------
> "a"=>1, "b"=>"{\"(1,\\\"{\\\"\\\"(1,\\\\\\\\\\\"\\\"{1,2}\\\\\\\\\\\"\\\")\\\"\\\"}\\\")\"}"
>
> here, the output escaping has leaked into the internal array
> structures. istm we should have a json expressing the internal
> structure.

What has this to do with json at all? It's clearly a failure in the
hstore() function.

> It does (weirdly) map back however:
>
> postgres=# select populate_record(null::x, hstore(row(1, array[row(1,
> array[row(1, array[1,2])::z])::y])::x));
> populate_record
> -------------------------------------------------------------------------------------
> (1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")
>
>
> OTOH, if I go via json route:
>
> postgres=# select row_to_json(row(1, array[row(1, array[row(1,
> array[1,2])::z])::y])::x);
> row_to_json
> -----------------------------------------------
> {"a":1,"b":[{"a":1,"b":[{"a":1,"b":[1,2]}]}]}
>
>
> so far, so good. let's push to hstore:
> postgres=# select row_to_json(row(1, array[row(1, array[row(1,
> array[1,2])::z])::y])::x)::jsonb::hstore;
> row_to_json
> -------------------------------------------------------
> "a"=>1, "b"=>[{"a"=>1, "b"=>[{"a"=>1, "b"=>[1, 2]}]}]
>
> this ISTM is the 'right' behavior. but what if we bring it back to
> record object?
>
> postgres=# select populate_record(null::x, row_to_json(row(1,
> array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore);
> ERROR: malformed array literal: "{{"a"=>1, "b"=>{{"a"=>1, "b"=>{1, 2}}}}}"
>
> yikes. The situation as I read it is that (notwithstanding my comments
> upthread) there is no clean way to slide rowtypes to/from hstore and
> jsonb while preserving structure. IMO, the above query should work
> and the populate function record above should return the internally
> structured row object, not the text escaped version.

And this is a failure in populate_record().

I think we possibly need to say that handling of nested composites and
arrays is an area that needs further work. OTOH, the refusal of
json_populate_record() and json_populate_recordset() to handle these in
9.3 has not generated a flood of complaints, so I don't think it's a
tragedy, just a limitation, which should be documented if it's not
already. (And of course hstore hasn't handled nested anything before now.)

Meanwhile, maybe Teodor can fix the two hstore bugs shown here.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 01:13:08
Message-ID: CAHyXU0z19J8CPXw__y9zPZYdpngZ-MzuZgqhTvK4vMJYkpPuWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 30, 2014 at 4:52 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 01/30/2014 07:21 PM, Merlin Moncure wrote:
>> postgres=# select hstore(row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x);
>> hstore
>>
>> ----------------------------------------------------------------------------------------------
>> "a"=>1,
>> "b"=>"{\"(1,\\\"{\\\"\\\"(1,\\\\\\\\\\\"\\\"{1,2}\\\\\\\\\\\"\\\")\\\"\\\"}\\\")\"}"
>>
>> here, the output escaping has leaked into the internal array
>> structures. istm we should have a json expressing the internal
>> structure.
>
> What has this to do with json at all? It's clearly a failure in the hstore()
> function.

yeah -- meant to say 'hstore' there. Also I'm not sure that it's
'wrong'; it's just doing what it always did. That brings up another
point: are there any interesting cases of compatibility breakage? I'm
inclined not to care about this particular case though...

>> array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore);
>> ERROR: malformed array literal: "{{"a"=>1, "b"=>{{"a"=>1, "b"=>{1,
>> 2}}}}}"
>>
>> yikes. The situation as I read it is that (notwithstanding my comments
>> upthread) there is no clean way to slide rowtypes to/from hstore and
>> jsonb while preserving structure. IMO, the above query should work
>> and the populate function record above should return the internally
>> structured row object, not the text escaped version.
>
>
>
> And this is a failure in populate_record().
>
> I think we possibly need to say that handling of nested composites and
> arrays is an area that needs further work. OTOH, the refusal of
> json_populate_record() and json_populate_recordset() to handle these in 9.3
> has not generated a flood of complaints, so I don't think it's a tragedy,
> just a limitation, which should be documented if it's not already. (And of
> course hstore hasn't handled nested anything before now.)
>
> Meanwhile, maybe Teodor can fix the two hstore bugs shown here.

While not a "flood", there certainly have been complaints. See
http://postgresql.1045698.n5.nabble.com/Best-way-to-populate-nested-composite-type-from-JSON-td5770566.html
http://osdir.com/ml/postgresql-pgsql-general/2014-01/msg00205.html

But, if we had to drop this in the interests of time I'd rather see
the behavior cauterized off so that it errored out 'not supported' (as
json_populate does) that attempt to implement the wrong behavior.

merlin


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 10:03:46
Message-ID: CAF4Au4xxReiTd2S4fO+KmHYRRsRcCQ-8rb+uaP7oP=GOqTNX+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hmm,
neither me, nor Teodor have experience and knowledge with
populate_record() and moreover hstore here is virgin and we don't know
the right behaviour, so I think we better take it from jsonb, once
Andrew realize it. Andrew ?

On Fri, Jan 31, 2014 at 4:52 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 01/30/2014 07:21 PM, Merlin Moncure wrote:
>
>> Something seems off:
>>
>> postgres=# create type z as (a int, b int[]);
>> CREATE TYPE
>> postgres=# create type y as (a int, b z[]);
>> CREATE TYPE
>> postgres=# create type x as (a int, b y[]);
>> CREATE TYPE
>>
>> -- test a complicated construction
>> postgres=# select row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x;
>> row
>>
>> -------------------------------------------------------------------------------------
>>
>> (1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")
>>
>> postgres=# select hstore(row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x);
>> hstore
>>
>> ----------------------------------------------------------------------------------------------
>> "a"=>1,
>> "b"=>"{\"(1,\\\"{\\\"\\\"(1,\\\\\\\\\\\"\\\"{1,2}\\\\\\\\\\\"\\\")\\\"\\\"}\\\")\"}"
>>
>> here, the output escaping has leaked into the internal array
>> structures. istm we should have a json expressing the internal
>> structure.
>
>
> What has this to do with json at all? It's clearly a failure in the hstore()
> function.
>
>
>
>> It does (weirdly) map back however:
>>
>> postgres=# select populate_record(null::x, hstore(row(1, array[row(1,
>> array[row(1, array[1,2])::z])::y])::x));
>> populate_record
>>
>> -------------------------------------------------------------------------------------
>>
>> (1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")
>>
>>
>> OTOH, if I go via json route:
>>
>> postgres=# select row_to_json(row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x);
>> row_to_json
>> -----------------------------------------------
>> {"a":1,"b":[{"a":1,"b":[{"a":1,"b":[1,2]}]}]}
>>
>>
>> so far, so good. let's push to hstore:
>> postgres=# select row_to_json(row(1, array[row(1, array[row(1,
>> array[1,2])::z])::y])::x)::jsonb::hstore;
>> row_to_json
>> -------------------------------------------------------
>> "a"=>1, "b"=>[{"a"=>1, "b"=>[{"a"=>1, "b"=>[1, 2]}]}]
>>
>> this ISTM is the 'right' behavior. but what if we bring it back to
>> record object?
>>
>> postgres=# select populate_record(null::x, row_to_json(row(1,
>> array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore);
>> ERROR: malformed array literal: "{{"a"=>1, "b"=>{{"a"=>1, "b"=>{1,
>> 2}}}}}"
>>
>> yikes. The situation as I read it is that (notwithstanding my comments
>> upthread) there is no clean way to slide rowtypes to/from hstore and
>> jsonb while preserving structure. IMO, the above query should work
>> and the populate function record above should return the internally
>> structured row object, not the text escaped version.
>
>
>
> And this is a failure in populate_record().
>
> I think we possibly need to say that handling of nested composites and
> arrays is an area that needs further work. OTOH, the refusal of
> json_populate_record() and json_populate_recordset() to handle these in 9.3
> has not generated a flood of complaints, so I don't think it's a tragedy,
> just a limitation, which should be documented if it's not already. (And of
> course hstore hasn't handled nested anything before now.)
>
> Meanwhile, maybe Teodor can fix the two hstore bugs shown here.
>
> cheers
>
> andrew
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Oleg Bartunov <obartunov(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 13:57:23
Message-ID: CAHyXU0zUDp_mZKbEU2o0cKyDmrpY2kUqSQfWp0O6YCWin+NRFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:
> Hmm,
> neither me, nor Teodor have experience and knowledge with
> populate_record() and moreover hstore here is virgin and we don't know
> the right behaviour, so I think we better take it from jsonb, once
> Andrew realize it. Andrew ?

Andrew Gierth wrote the current implementation of htsore
populate_record IIRC. Unfortunately the plan for jsonb was to borrow
hstore's (I don't think hstore can use the jsonb implementation
because you'd be taking away the ability to handle internally nested
structures it currently has). Of my two complaints upthread, the
second one, not being able to populate from and internally well formed
structure, is by far the more serious one I think.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 14:45:17
Message-ID: 52EBB6FD.6000005@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/31/2014 08:57 AM, Merlin Moncure wrote:
> On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:
>> Hmm,
>> neither me, nor Teodor have experience and knowledge with
>> populate_record() and moreover hstore here is virgin and we don't know
>> the right behaviour, so I think we better take it from jsonb, once
>> Andrew realize it. Andrew ?
> Andrew Gierth wrote the current implementation of htsore
> populate_record IIRC. Unfortunately the plan for jsonb was to borrow
> hstore's (I don't think hstore can use the jsonb implementation
> because you'd be taking away the ability to handle internally nested
> structures it currently has). Of my two complaints upthread, the
> second one, not being able to populate from and internally well formed
> structure, is by far the more serious one I think.
>

Umm, I think at least one of us is seriously confused.

I am going to look at dealing with these issues in a way that can be
used by both - at least the populate_record case.

As far as populate_record goes, there is a bit of an impedance mismatch,
since json/hstore records are heterogenous and one-dimensional, whereas
sql arrays are homogeneous and multidimensional. Right now I am thinking
I will deal with arrays up to two dimensions, because I can do that
relatively simply, and after that throw in the towel. That will surely
deal with 99.9% of use cases. Of course this would be documented.

Anyway, Let me see what I can do.

If Andrew Gierth wants to have a look at fixing the hstore() side that
might help speed things up.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 14:53:06
Message-ID: CAHyXU0yWKExP8rVPa-0nTXG1d6jO8DMWu4LaVjV8+-2+Lqrs4Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 31, 2014 at 8:45 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 01/31/2014 08:57 AM, Merlin Moncure wrote:
>>
>> On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov <obartunov(at)gmail(dot)com>
>> wrote:
>>>
>>> Hmm,
>>> neither me, nor Teodor have experience and knowledge with
>>> populate_record() and moreover hstore here is virgin and we don't know
>>> the right behaviour, so I think we better take it from jsonb, once
>>> Andrew realize it. Andrew ?
>>
>> Andrew Gierth wrote the current implementation of htsore
>> populate_record IIRC. Unfortunately the plan for jsonb was to borrow
>> hstore's (I don't think hstore can use the jsonb implementation
>> because you'd be taking away the ability to handle internally nested
>> structures it currently has). Of my two complaints upthread, the
>> second one, not being able to populate from and internally well formed
>> structure, is by far the more serious one I think.
>>
>
>
> Umm, I think at least one of us is seriously confused.
>
> I am going to look at dealing with these issues in a way that can be used by
> both - at least the populate_record case.
>
> As far as populate_record goes, there is a bit of an impedance mismatch,
> since json/hstore records are heterogenous and one-dimensional, whereas sql
> arrays are homogeneous and multidimensional. Right now I am thinking I will
> deal with arrays up to two dimensions, because I can do that relatively
> simply, and after that throw in the towel. That will surely deal with 99.9%
> of use cases. Of course this would be documented.
>
> Anyway, Let me see what I can do.
>
> If Andrew Gierth wants to have a look at fixing the hstore() side that might
> help speed things up.

(ah, you beat me to it.)

Disregard my statements above. It works.

postgres=# select jsonb_populate_record(null::x, hstore(row(1,
array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb);
jsonb_populate_record
-------------------------------------------------------------------------------------
(1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 15:26:00
Message-ID: 52EBC088.2040800@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/31/2014 09:53 AM, Merlin Moncure wrote:
> On Fri, Jan 31, 2014 at 8:45 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> On 01/31/2014 08:57 AM, Merlin Moncure wrote:
>>> On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov <obartunov(at)gmail(dot)com>
>>> wrote:
>>>> Hmm,
>>>> neither me, nor Teodor have experience and knowledge with
>>>> populate_record() and moreover hstore here is virgin and we don't know
>>>> the right behaviour, so I think we better take it from jsonb, once
>>>> Andrew realize it. Andrew ?
>>> Andrew Gierth wrote the current implementation of htsore
>>> populate_record IIRC. Unfortunately the plan for jsonb was to borrow
>>> hstore's (I don't think hstore can use the jsonb implementation
>>> because you'd be taking away the ability to handle internally nested
>>> structures it currently has). Of my two complaints upthread, the
>>> second one, not being able to populate from and internally well formed
>>> structure, is by far the more serious one I think.
>>>
>>
>> Umm, I think at least one of us is seriously confused.
>>
>> I am going to look at dealing with these issues in a way that can be used by
>> both - at least the populate_record case.
>>
>> As far as populate_record goes, there is a bit of an impedance mismatch,
>> since json/hstore records are heterogenous and one-dimensional, whereas sql
>> arrays are homogeneous and multidimensional. Right now I am thinking I will
>> deal with arrays up to two dimensions, because I can do that relatively
>> simply, and after that throw in the towel. That will surely deal with 99.9%
>> of use cases. Of course this would be documented.
>>
>> Anyway, Let me see what I can do.
>>
>> If Andrew Gierth wants to have a look at fixing the hstore() side that might
>> help speed things up.
> (ah, you beat me to it.)
>
> Disregard my statements above. It works.
>
> postgres=# select jsonb_populate_record(null::x, hstore(row(1,
> array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb);
> jsonb_populate_record
> -------------------------------------------------------------------------------------
> (1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")
>

Actually, there is a workaround to the limitations of hstore(record):

andrew=# select row_to_json(row(1,
array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb::hstore;
row_to_json
-------------------------------------------------------
"a"=>1, "b"=>[{"a"=>1, "b"=>[{"a"=>1, "b"=>[1, 2]}]}]

I think we could just document that for now, or possibly just use it
inside hstore(record) if we encounter a nested composite or array.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, s(dot)juba(at)jacobs-university(dot)de
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-01-31 19:48:54
Message-ID: CAHyXU0xz1L8KzWJoDHoKX6AhZ0DX3v0DHbUMMsyeumgfCvEA1w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 31, 2014 at 9:26 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 01/31/2014 09:53 AM, Merlin Moncure wrote:
>>
>> On Fri, Jan 31, 2014 at 8:45 AM, Andrew Dunstan <andrew(at)dunslane(dot)net>
>> wrote:
>>>
>>> On 01/31/2014 08:57 AM, Merlin Moncure wrote:
>>>>
>>>> On Fri, Jan 31, 2014 at 4:03 AM, Oleg Bartunov <obartunov(at)gmail(dot)com>
>>>> wrote:
>>>>>
>>>>> Hmm,
>>>>> neither me, nor Teodor have experience and knowledge with
>>>>> populate_record() and moreover hstore here is virgin and we don't know
>>>>> the right behaviour, so I think we better take it from jsonb, once
>>>>> Andrew realize it. Andrew ?
>>>>
>>>> Andrew Gierth wrote the current implementation of htsore
>>>> populate_record IIRC. Unfortunately the plan for jsonb was to borrow
>>>> hstore's (I don't think hstore can use the jsonb implementation
>>>> because you'd be taking away the ability to handle internally nested
>>>> structures it currently has). Of my two complaints upthread, the
>>>> second one, not being able to populate from and internally well formed
>>>> structure, is by far the more serious one I think.
>>>>
>>>
>>> Umm, I think at least one of us is seriously confused.
>>>
>>> I am going to look at dealing with these issues in a way that can be used
>>> by
>>> both - at least the populate_record case.
>>>
>>> As far as populate_record goes, there is a bit of an impedance mismatch,
>>> since json/hstore records are heterogenous and one-dimensional, whereas
>>> sql
>>> arrays are homogeneous and multidimensional. Right now I am thinking I
>>> will
>>> deal with arrays up to two dimensions, because I can do that relatively
>>> simply, and after that throw in the towel. That will surely deal with
>>> 99.9%
>>> of use cases. Of course this would be documented.
>>>
>>> Anyway, Let me see what I can do.
>>>
>>> If Andrew Gierth wants to have a look at fixing the hstore() side that
>>> might
>>> help speed things up.
>>
>> (ah, you beat me to it.)
>>
>> Disregard my statements above. It works.
>>
>> postgres=# select jsonb_populate_record(null::x, hstore(row(1,
>> array[row(1, array[row(1, array[1,2])::z])::y])::x)::jsonb);
>> jsonb_populate_record
>>
>> -------------------------------------------------------------------------------------
>>
>> (1,"{""(1,\\""{\\""\\""(1,\\\\\\\\\\""\\""{1,2}\\\\\\\\\\""\\"")\\""\\""}\\"")""}")
>
>
> Actually, there is a workaround to the limitations of hstore(record):

yeah I'm ok with hstore() function as it is. That also eliminates
backwards compatibility concerns so things worked out. The only 'must
fix' 9.4 facing issue I see on the table is to make sure jsonb
populate function is forward compatible with future expectations of
behavior which to me means zeroing in on the necessity of the as_text
argument (but if you can expand coverage without jeopardizing 9.4
inclusion than great...).

For my part I'm going to continue functionally testing the rest of the
API (so far, a cursory look hasn't turned up anything else). I'm also
signing up for some documentation refinements which will be done after
you nail down these little bits but before the end of the 'fest.

IMNSHO, formal code review needs to begin ASAP (salahaldin is the
reviewer per the fest wiki)

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: s(dot)juba(at)jacobs-university(dot)de, Oleg Bartunov <obartunov(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-01 04:35:45
Message-ID: 52EC79A1.9030409@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/31/2014 02:48 PM, Merlin Moncure wrote:

>>
>> Actually, there is a workaround to the limitations of hstore(record):
> yeah I'm ok with hstore() function as it is. That also eliminates
> backwards compatibility concerns so things worked out. The only 'must
> fix' 9.4 facing issue I see on the table is to make sure jsonb
> populate function is forward compatible with future expectations of
> behavior which to me means zeroing in on the necessity of the as_text
> argument (but if you can expand coverage without jeopardizing 9.4
> inclusion than great...).

This isn't terribly clear. Currently, if jsonb_populate_record{set}
encounters a nested array or object when populating the record it errors
out, regardless of the type of the field, unless as_text is set (it
defaults to off). In the latter case it tries to use the array or
object's json text representation as the value to populate the field
(realistically, this only works for text, json and jsonb fields). This
is exactly the current behaviour of json_populate_record{set}. The
enhancement would be to alter the behaviour when as_text is NOT set. In
this case, we would try recursively to populate an array or composite
field with the corresponding jsonb. i.e we would be removing some
current error conditions and returning a result. But we would not be
returning a different result in any case where we now return a result. I
think that's future-proof enough.

Frankly, I think the behaviour of hstore(record) with nested composites
and arrays is sufficiently counter-intuitive, to put it mildly, that we
should at least document the workaround from my previous email.

>
> For my part I'm going to continue functionally testing the rest of the
> API (so far, a cursory look hasn't turned up anything else). I'm also
> signing up for some documentation refinements which will be done after
> you nail down these little bits but before the end of the 'fest.
>
> IMNSHO, formal code review needs to begin ASAP (salahaldin is the
> reviewer per the fest wiki)

Yes, or anyone else who wants to join in. I'd very much welcome a
substantial code review - I have been staring at this far too long on my
own.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: s(dot)juba(at)jacobs-university(dot)de, Oleg Bartunov <obartunov(at)gmail(dot)com>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-01 05:01:16
Message-ID: 52EC7F9C.8060503@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/31/2014 11:35 PM, Andrew Dunstan wrote:
>
> Yes, or anyone else who wants to join in. I'd very much welcome a
> substantial code review - I have been staring at this far too long on
> my own.

I should mention that in fact by far the largest piece of this is not my
work, but Oleg and Teodor's work. I'm sure they would welcome an in
depth review too, but I realised that my message above might have given
a false impression. They deserve the credit for having come up with and
implemented this scheme for tree-ish data.

cheers

andrew


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-01 22:20:45
Message-ID: 20140201222045.GC5930@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-01-30 14:07:42 -0500, Andrew Dunstan wrote:
> + <para id="functions-json-table">
> + <xref linkend="functions-json-creation-table"> shows the functions that are
> + available for creating <type>json</type> values.
> + (see <xref linkend="datatype-json">)
> </para>
>
> - <table id="functions-json-table">
> - <title>JSON Support Functions</title>
> + <indexterm>
> + <primary>array_to_json</primary>
> + </indexterm>
> + <indexterm>
> + <primary>row_to_json</primary>
> + </indexterm>
> + <indexterm>
> + <primary>to_json</primary>
> + </indexterm>
> + <indexterm>
> + <primary>json_build_array</primary>
> + </indexterm>
> + <indexterm>
> + <primary>json_build_object</primary>
> + </indexterm>
> + <indexterm>
> + <primary>json_object</primary>
> + </indexterm>

Hm, why are you collecting the indexterms at the top in the contrast to
the previous way of collecting them at the point of documentation?

> diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
> index 1ae9fa0..fd93d9b 100644
> --- a/src/backend/utils/adt/Makefile
> +++ b/src/backend/utils/adt/Makefile
> @@ -32,7 +32,8 @@ OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
> tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
> tsvector.o tsvector_op.o tsvector_parser.o \
> txid.o uuid.o windowfuncs.o xml.o rangetypes_spgist.o \
> - rangetypes_typanalyze.o rangetypes_selfuncs.o
> + rangetypes_typanalyze.o rangetypes_selfuncs.o \
> + jsonb.o jsonb_support.o

Odd, most OBJS lines are kept in alphabetical order, but that doesn't
seem to be the case here.

> +/*
> + * for jsonb we always want the de-escaped value - that's what's in token
> + */
> +

strange newline.

> +static void
> +jsonb_in_scalar(void *state, char *token, JsonTokenType tokentype)
> +{
> + JsonbInState *_state = (JsonbInState *) state;
> + JsonbValue v;
> +
> + v.size = sizeof(JEntry);
> +
> + switch (tokentype)
> + {
> +
...

> + default: /* nothing else should be here in fact */
> + break;

Shouldn't this at least Assert(false) or something?

> +static void
> +recvJsonbValue(StringInfo buf, JsonbValue *v, uint32 level, int c)
> +{
> + uint32 hentry = c & JENTRY_TYPEMASK;
> +
> + if (hentry == JENTRY_ISNULL)
> + {
> + v->type = jbvNull;
> + v->size = sizeof(JEntry);
> + }
> + else if (hentry == JENTRY_ISOBJECT || hentry == JENTRY_ISARRAY || hentry == JENTRY_ISCALAR)
> + {
> + recvJsonb(buf, v, level + 1, (uint32) c);
> + }
> + else if (hentry == JENTRY_ISFALSE || hentry == JENTRY_ISTRUE)
> + {
> + v->type = jbvBool;
> + v->size = sizeof(JEntry);
> + v->boolean = (hentry == JENTRY_ISFALSE) ? false : true;
> + }
> + else if (hentry == JENTRY_ISNUMERIC)
> + {
> + v->type = jbvNumeric;
> + v->numeric = DatumGetNumeric(DirectFunctionCall3(numeric_recv, PointerGetDatum(buf),
> + Int32GetDatum(0), Int32GetDatum(-1)));
> +
> + v->size = sizeof(JEntry) * 2 + VARSIZE_ANY(v->numeric);

What's the *2 here?

> +static void
> +recvJsonb(StringInfo buf, JsonbValue *v, uint32 level, uint32 header)
> +{
> + uint32 hentry;
> + uint32 i;

This function and recvJsonbValue call each other recursively, afaics
without any limit, shouldn't they check for the stack depth?

> + hentry = header & JENTRY_TYPEMASK;
> +
> + v->size = 3 * sizeof(JEntry);

*3?

> + if (hentry == JENTRY_ISOBJECT)
> + {
> + v->type = jbvHash;
> + v->hash.npairs = header & JB_COUNT_MASK;
> + if (v->hash.npairs > 0)
> + {
> + v->hash.pairs = palloc(sizeof(*v->hash.pairs) * v->hash.npairs);
> +

Hm, if I understand correctly, we're just allocating JB_COUNT_MASK
(which is 0x0FFFFFFF) * sizeof(*v->hash.pairs) bytes here, without any
crosschecks about the actual length of the data? So with a few bytes the
server can be coaxed to allocate a gigabyte of data?
Since this immediately calls another input routine, this can be done in
a nested fashion, quickly OOMing the server.

I think this and several other places really need a bit more input
sanity checking.

> + for (i = 0; i < v->hash.npairs; i++)
> + {
> + recvJsonbValue(buf, &v->hash.pairs[i].key, level, pq_getmsgint(buf, 4));
> + if (v->hash.pairs[i].key.type != jbvString)
> + elog(ERROR, "jsonb's key could be only a string");

Shouldn't that be an ereport(ERRCODE_DATATYPE_MISMATCH)? Similar in a
few other places.

> +char *
> +JsonbToCString(StringInfo out, char *in, int estimated_len)
> +{
> + bool first = true;
> + JsonbIterator *it;
> + int type;
> + JsonbValue v;
> + int level = 0;
> +
> + if (out == NULL)
> + out = makeStringInfo();

Such a behaviour certainly deserves a documentary comment. Generally
some more functions could use that.

> + while ((type = JsonbIteratorGet(&it, &v, false)) != 0)
> + {
> +reout:
> + switch (type)
> + {
...
> + {
> + Assert(type == WJB_BEGIN_OBJECT || type == WJB_BEGIN_ARRAY);
> + goto reout;

Hrmpf.

> +Datum
> +jsonb_typeof(PG_FUNCTION_ARGS)
> +{
...
> +}

Hm, shouldn't that be in jsonfuncs.c?

> diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
> index a19b222..f1eacc6 100644
> --- a/src/backend/utils/adt/jsonfuncs.c
> +++ b/src/backend/utils/adt/jsonfuncs.c
> @@ -27,6 +27,7 @@
> #include "utils/builtins.h"
> #include "utils/hsearch.h"
> #include "utils/json.h"
> +#include "utils/jsonb.h"
> #include "utils/jsonapi.h"
> #include "utils/lsyscache.h"
> #include "utils/memutils.h"
> @@ -51,6 +52,7 @@ static inline Datum get_path_all(PG_FUNCTION_ARGS, bool as_text);
> static inline text *get_worker(text *json, char *field, int elem_index,
> char **tpath, int *ipath, int npath,
> bool normalize_results);
> +static inline Datum get_jsonb_path_all(PG_FUNCTION_ARGS, bool as_text);

I don't see the point of using PG_FUNCTION_ARGS if you're manually
calling it like
+ return get_jsonb_path_all(fcinfo, false);

That just makes it harder if someday PG_FUNCTION_ARGS grows a second
argument or something.

> +Datum
> +jsonb_object_keys(PG_FUNCTION_ARGS)
> +{
> + FuncCallContext *funcctx;
> + OkeysState *state;
> + int i;
> +
> + if (SRF_IS_FIRSTCALL())
> + {
> + MemoryContext oldcontext;
> + Jsonb *jb = PG_GETARG_JSONB(0);
> + bool skipNested = false;
> + JsonbIterator *it;
> + JsonbValue v;
> + int r = 0;
> +
> + if (JB_ROOT_IS_SCALAR(jb))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("cannot call jsonb_object_keys on a scalar")));
> + else if (JB_ROOT_IS_ARRAY(jb))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("cannot call jsonb_object_keys on an array")));
> +
> + funcctx = SRF_FIRSTCALL_INIT();
> + oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

This will detoast 'jb' into the expression context, since
PG_GETARG_JSONB() is called before the MemoryContextSwitchTo. But that's
ok since the percall code only deals with ->result, right?

> - /* make these in a sufficiently long-lived memory context */
> old_cxt = MemoryContextSwitchTo(rsi->econtext->ecxt_per_query_memory);

wh remove that comment?

> +#define JENTRY_ISCALAR (0x10000000 | 0x40000000)

Isn't there an S missing here?

> --- a/contrib/hstore/hstore_compat.c
> +++ b/contrib/hstore/hstore_compat.c
> +/*
> + * New Old version (new not-nested version of hstore, v2 version)
> + * V2 and v3 (nested) are upward binary compatible. But
> + * framework was fully changed. Keep here old definitions (v2)
> + */

That's an, err, interesting sentence. I think referring to old new
version and stuff is less than helpful. I realize lots of that is
baggage from existing code, but yet another version doesn't make it
easier.

I lost my stomach (or maybe it was the glass of red) somewhere in the
middle, but I think this needs a lot of work. Especially the io code
doesn't seem ready to me. I'd consider ripping out the send/recv code
for 9.4, that seems the biggest can of worms. It will still be usable
without.

There's just about no comments in large and relevant parts of the
code. There's not much documentation about the binary layout of a
definitely not trivial type with convoluted interdependencies with
hstore...

Greetings,

Andres Freund


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-01 23:13:42
Message-ID: 52ED7FA6.30008@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/01/2014 05:20 PM, Andres Freund wrote:

[Long review]

Most of these comments actually refer to Teodor and Oleg's code.

I will attend to the parts that apply to my code.

Thanks for the review.

cheers

andrew


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-01 23:15:53
Message-ID: 20140201231553.GC32123@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2014-02-01 18:13:42 -0500, Andrew Dunstan wrote:
> [Long review]
>
> Most of these comments actually refer to Teodor and Oleg's code.
>
> I will attend to the parts that apply to my code.

Well, somebody will need to address them nonetheless :/

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-01 23:33:22
Message-ID: 52ED8442.1030608@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/01/2014 06:15 PM, Andres Freund wrote:
> Hi,
>
> On 2014-02-01 18:13:42 -0500, Andrew Dunstan wrote:
>> [Long review]
>>
>> Most of these comments actually refer to Teodor and Oleg's code.
>>
>> I will attend to the parts that apply to my code.
> Well, somebody will need to address them nonetheless :/
>

Yes, of course, I didn't suggest otherwise.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-03 15:22:52
Message-ID: CAHyXU0xGMcUkTv7WPpqeiZXSmBvY29C5JovgVM8EpnhZ+LhVqQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 1, 2014 at 4:20 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-01-30 14:07:42 -0500, Andrew Dunstan wrote:
>> + <para id="functions-json-table">
>> + <xref linkend="functions-json-creation-table"> shows the functions that are
>> + available for creating <type>json</type> values.
>> + (see <xref linkend="datatype-json">)
>> </para>
>>
>> - <table id="functions-json-table">
>> - <title>JSON Support Functions</title>
>> + <indexterm>
>> + <primary>array_to_json</primary>
>> + </indexterm>
>> + <indexterm>
>> + <primary>row_to_json</primary>
>> + </indexterm>
>> + <indexterm>
>> + <primary>to_json</primary>
>> + </indexterm>
>> + <indexterm>
>> + <primary>json_build_array</primary>
>> + </indexterm>
>> + <indexterm>
>> + <primary>json_build_object</primary>
>> + </indexterm>
>> + <indexterm>
>> + <primary>json_object</primary>
>> + </indexterm>
>
> Hm, why are you collecting the indexterms at the top in the contrast to
> the previous way of collecting them at the point of documentation?
>
>> diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
>> index 1ae9fa0..fd93d9b 100644
>> --- a/src/backend/utils/adt/Makefile
>> +++ b/src/backend/utils/adt/Makefile
>> @@ -32,7 +32,8 @@ OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
>> tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
>> tsvector.o tsvector_op.o tsvector_parser.o \
>> txid.o uuid.o windowfuncs.o xml.o rangetypes_spgist.o \
>> - rangetypes_typanalyze.o rangetypes_selfuncs.o
>> + rangetypes_typanalyze.o rangetypes_selfuncs.o \
>> + jsonb.o jsonb_support.o
>
> Odd, most OBJS lines are kept in alphabetical order, but that doesn't
> seem to be the case here.
>
>> +/*
>> + * for jsonb we always want the de-escaped value - that's what's in token
>> + */
>> +
>
> strange newline.
>
>> +static void
>> +jsonb_in_scalar(void *state, char *token, JsonTokenType tokentype)
>> +{
>> + JsonbInState *_state = (JsonbInState *) state;
>> + JsonbValue v;
>> +
>> + v.size = sizeof(JEntry);
>> +
>> + switch (tokentype)
>> + {
>> +
> ...
>
>> + default: /* nothing else should be here in fact */
>> + break;
>
> Shouldn't this at least Assert(false) or something?
>
>> +static void
>> +recvJsonbValue(StringInfo buf, JsonbValue *v, uint32 level, int c)
>> +{
>> + uint32 hentry = c & JENTRY_TYPEMASK;
>> +
>> + if (hentry == JENTRY_ISNULL)
>> + {
>> + v->type = jbvNull;
>> + v->size = sizeof(JEntry);
>> + }
>> + else if (hentry == JENTRY_ISOBJECT || hentry == JENTRY_ISARRAY || hentry == JENTRY_ISCALAR)
>> + {
>> + recvJsonb(buf, v, level + 1, (uint32) c);
>> + }
>> + else if (hentry == JENTRY_ISFALSE || hentry == JENTRY_ISTRUE)
>> + {
>> + v->type = jbvBool;
>> + v->size = sizeof(JEntry);
>> + v->boolean = (hentry == JENTRY_ISFALSE) ? false : true;
>> + }
>> + else if (hentry == JENTRY_ISNUMERIC)
>> + {
>> + v->type = jbvNumeric;
>> + v->numeric = DatumGetNumeric(DirectFunctionCall3(numeric_recv, PointerGetDatum(buf),
>> + Int32GetDatum(0), Int32GetDatum(-1)));
>> +
>> + v->size = sizeof(JEntry) * 2 + VARSIZE_ANY(v->numeric);
>
> What's the *2 here?
>
>> +static void
>> +recvJsonb(StringInfo buf, JsonbValue *v, uint32 level, uint32 header)
>> +{
>> + uint32 hentry;
>> + uint32 i;
>
> This function and recvJsonbValue call each other recursively, afaics
> without any limit, shouldn't they check for the stack depth?
>
>> + hentry = header & JENTRY_TYPEMASK;
>> +
>> + v->size = 3 * sizeof(JEntry);
>
> *3?
>
>> + if (hentry == JENTRY_ISOBJECT)
>> + {
>> + v->type = jbvHash;
>> + v->hash.npairs = header & JB_COUNT_MASK;
>> + if (v->hash.npairs > 0)
>> + {
>> + v->hash.pairs = palloc(sizeof(*v->hash.pairs) * v->hash.npairs);
>> +
>
> Hm, if I understand correctly, we're just allocating JB_COUNT_MASK
> (which is 0x0FFFFFFF) * sizeof(*v->hash.pairs) bytes here, without any
> crosschecks about the actual length of the data? So with a few bytes the
> server can be coaxed to allocate a gigabyte of data?
> Since this immediately calls another input routine, this can be done in
> a nested fashion, quickly OOMing the server.
>
> I think this and several other places really need a bit more input
> sanity checking.
>
>> + for (i = 0; i < v->hash.npairs; i++)
>> + {
>> + recvJsonbValue(buf, &v->hash.pairs[i].key, level, pq_getmsgint(buf, 4));
>> + if (v->hash.pairs[i].key.type != jbvString)
>> + elog(ERROR, "jsonb's key could be only a string");
>
> Shouldn't that be an ereport(ERRCODE_DATATYPE_MISMATCH)? Similar in a
> few other places.
>
>> +char *
>> +JsonbToCString(StringInfo out, char *in, int estimated_len)
>> +{
>> + bool first = true;
>> + JsonbIterator *it;
>> + int type;
>> + JsonbValue v;
>> + int level = 0;
>> +
>> + if (out == NULL)
>> + out = makeStringInfo();
>
> Such a behaviour certainly deserves a documentary comment. Generally
> some more functions could use that.
>
>> + while ((type = JsonbIteratorGet(&it, &v, false)) != 0)
>> + {
>> +reout:
>> + switch (type)
>> + {
> ...
>> + {
>> + Assert(type == WJB_BEGIN_OBJECT || type == WJB_BEGIN_ARRAY);
>> + goto reout;
>
> Hrmpf.
>
>> +Datum
>> +jsonb_typeof(PG_FUNCTION_ARGS)
>> +{
> ...
>> +}
>
> Hm, shouldn't that be in jsonfuncs.c?
>
>> diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
>> index a19b222..f1eacc6 100644
>> --- a/src/backend/utils/adt/jsonfuncs.c
>> +++ b/src/backend/utils/adt/jsonfuncs.c
>> @@ -27,6 +27,7 @@
>> #include "utils/builtins.h"
>> #include "utils/hsearch.h"
>> #include "utils/json.h"
>> +#include "utils/jsonb.h"
>> #include "utils/jsonapi.h"
>> #include "utils/lsyscache.h"
>> #include "utils/memutils.h"
>> @@ -51,6 +52,7 @@ static inline Datum get_path_all(PG_FUNCTION_ARGS, bool as_text);
>> static inline text *get_worker(text *json, char *field, int elem_index,
>> char **tpath, int *ipath, int npath,
>> bool normalize_results);
>> +static inline Datum get_jsonb_path_all(PG_FUNCTION_ARGS, bool as_text);
>
> I don't see the point of using PG_FUNCTION_ARGS if you're manually
> calling it like
> + return get_jsonb_path_all(fcinfo, false);
>
> That just makes it harder if someday PG_FUNCTION_ARGS grows a second
> argument or something.
>
>
>> +Datum
>> +jsonb_object_keys(PG_FUNCTION_ARGS)
>> +{
>> + FuncCallContext *funcctx;
>> + OkeysState *state;
>> + int i;
>> +
>> + if (SRF_IS_FIRSTCALL())
>> + {
>> + MemoryContext oldcontext;
>> + Jsonb *jb = PG_GETARG_JSONB(0);
>> + bool skipNested = false;
>> + JsonbIterator *it;
>> + JsonbValue v;
>> + int r = 0;
>> +
>> + if (JB_ROOT_IS_SCALAR(jb))
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> + errmsg("cannot call jsonb_object_keys on a scalar")));
>> + else if (JB_ROOT_IS_ARRAY(jb))
>> + ereport(ERROR,
>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> + errmsg("cannot call jsonb_object_keys on an array")));
>> +
>> + funcctx = SRF_FIRSTCALL_INIT();
>> + oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
>
> This will detoast 'jb' into the expression context, since
> PG_GETARG_JSONB() is called before the MemoryContextSwitchTo. But that's
> ok since the percall code only deals with ->result, right?
>
>> - /* make these in a sufficiently long-lived memory context */
>> old_cxt = MemoryContextSwitchTo(rsi->econtext->ecxt_per_query_memory);
>
> wh remove that comment?
>
>> +#define JENTRY_ISCALAR (0x10000000 | 0x40000000)
>
> Isn't there an S missing here?
>
>> --- a/contrib/hstore/hstore_compat.c
>> +++ b/contrib/hstore/hstore_compat.c
>> +/*
>> + * New Old version (new not-nested version of hstore, v2 version)
>> + * V2 and v3 (nested) are upward binary compatible. But
>> + * framework was fully changed. Keep here old definitions (v2)
>> + */
>
> That's an, err, interesting sentence. I think referring to old new
> version and stuff is less than helpful. I realize lots of that is
> baggage from existing code, but yet another version doesn't make it
> easier.
>
> I lost my stomach (or maybe it was the glass of red) somewhere in the
> middle, but I think this needs a lot of work. Especially the io code
> doesn't seem ready to me. I'd consider ripping out the send/recv code
> for 9.4, that seems the biggest can of worms. It will still be usable
> without.

Not having type send/recv functions is somewhat dangerous; it can
cause problems for libraries that run everything through the binary
wire format. I'd give jsonb a pass on that, being a new type, but
would be concerned if hstore had that ability revoked.

offhand note: hstore_send seems pretty simply written and clean; it's
a simple nonrecursive iterator...

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-03 15:27:23
Message-ID: 20140203152723.GF1225@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-03 09:22:52 -0600, Merlin Moncure wrote:
> > I lost my stomach (or maybe it was the glass of red) somewhere in the
> > middle, but I think this needs a lot of work. Especially the io code
> > doesn't seem ready to me. I'd consider ripping out the send/recv code
> > for 9.4, that seems the biggest can of worms. It will still be usable
> > without.
>
> Not having type send/recv functions is somewhat dangerous; it can
> cause problems for libraries that run everything through the binary
> wire format. I'd give jsonb a pass on that, being a new type, but
> would be concerned if hstore had that ability revoked.

Yea, removing it for hstore would be a compat problem...

> offhand note: hstore_send seems pretty simply written and clean; it's
> a simple nonrecursive iterator...

But a send function is pretty pointless without the corresponding recv
function... And imo recv simply is to dangerous as it's currently
written.
I am not saying that it cannot be made work, just that it's still nearly
as ugly as when I pointed out several of the dangers some weeks back.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: jsonb and nested hstore
Date: 2014-02-04 22:03:12
Message-ID: 52F163A0.3050603@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/03/2014 07:27 AM, Andres Freund wrote:
> On 2014-02-03 09:22:52 -0600, Merlin Moncure wrote:
>>> I lost my stomach (or maybe it was the glass of red) somewhere in the
>>> middle, but I think this needs a lot of work. Especially the io code
>>> doesn't seem ready to me. I'd consider ripping out the send/recv code
>>> for 9.4, that seems the biggest can of worms. It will still be usable
>>> without.
>>
>> Not having type send/recv functions is somewhat dangerous; it can
>> cause problems for libraries that run everything through the binary
>> wire format. I'd give jsonb a pass on that, being a new type, but
>> would be concerned if hstore had that ability revoked.
>
> Yea, removing it for hstore would be a compat problem...
>
>> offhand note: hstore_send seems pretty simply written and clean; it's
>> a simple nonrecursive iterator...
>
> But a send function is pretty pointless without the corresponding recv
> function... And imo recv simply is to dangerous as it's currently
> written.
> I am not saying that it cannot be made work, just that it's still nearly
> as ugly as when I pointed out several of the dangers some weeks back.

Oleg, Teodor, any comments on the above?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 06:44:38
Message-ID: 52F1DDD6.2060101@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/03/2014 05:22 PM, Merlin Moncure wrote:
>> >I lost my stomach (or maybe it was the glass of red) somewhere in the
>> >middle, but I think this needs a lot of work. Especially the io code
>> >doesn't seem ready to me. I'd consider ripping out the send/recv code
>> >for 9.4, that seems the biggest can of worms. It will still be usable
>> >without.
> Not having type send/recv functions is somewhat dangerous; it can
> cause problems for libraries that run everything through the binary
> wire format. I'd give jsonb a pass on that, being a new type, but
> would be concerned if hstore had that ability revoked.

send/recv functions are also needed for binary-format COPY. IMHO jsonb
must have send/recv functions. All other built-in types have them,
except for types like 'smgr', 'aclitem' and 'any*' that no-one should be
using as column types.

- Heikki


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 07:21:25
Message-ID: CAF4Au4y3Qa-27cxyHYzmeBJSKLO+6OGwN_6Z0mKJaH+PjE+PAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew provided us more information and we'll work on recv. What
people think about testing this stuff ? btw, we don't have any
regression test on this.

Oleg

On Wed, Feb 5, 2014 at 2:03 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/03/2014 07:27 AM, Andres Freund wrote:
>> On 2014-02-03 09:22:52 -0600, Merlin Moncure wrote:
>>>> I lost my stomach (or maybe it was the glass of red) somewhere in the
>>>> middle, but I think this needs a lot of work. Especially the io code
>>>> doesn't seem ready to me. I'd consider ripping out the send/recv code
>>>> for 9.4, that seems the biggest can of worms. It will still be usable
>>>> without.
>>>
>>> Not having type send/recv functions is somewhat dangerous; it can
>>> cause problems for libraries that run everything through the binary
>>> wire format. I'd give jsonb a pass on that, being a new type, but
>>> would be concerned if hstore had that ability revoked.
>>
>> Yea, removing it for hstore would be a compat problem...
>>
>>> offhand note: hstore_send seems pretty simply written and clean; it's
>>> a simple nonrecursive iterator...
>>
>> But a send function is pretty pointless without the corresponding recv
>> function... And imo recv simply is to dangerous as it's currently
>> written.
>> I am not saying that it cannot be made work, just that it's still nearly
>> as ugly as when I pointed out several of the dangers some weeks back.
>
> Oleg, Teodor, any comments on the above?
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 15:36:05
Message-ID: 52F25A65.7000305@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> +static void
>> +recvJsonbValue(StringInfo buf, JsonbValue *v, uint32 level, int c)
>> + v->size = sizeof(JEntry) * 2 + VARSIZE_ANY(v->numeric);
>
> What's the *2 here?
Reservation for aligment. It's allowed to be v->size greater than it's actually
needed. Fixed.

> This function and recvJsonbValue call each other recursively, afaics
> without any limit, shouldn't they check for the stack depth?
added a check_stack_depth()

>
> *3?

Jentry + header + reservation for aligment

>> + v->hash.pairs = palloc(sizeof(*v->hash.pairs) * v->hash.npairs);
>> +
if (v->hash.npairs > (buf->len - buf->cursor) / (2 * sizeof(uint32)))
ereport(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE)
2 * sizeof(uint32) - minimal size of object element (key plus its value)

> Shouldn't that be an ereport(ERRCODE_DATATYPE_MISMATCH)? Similar in a
> few other places.
fixed

>> +char *
>> +JsonbToCString(StringInfo out, char *in, int estimated_len)
> Such a behaviour certainly deserves a documentary comment. Generally
> some more functions could use that.
add comment

>
>> + while ((type = JsonbIteratorGet(&it, &v, false)) != 0)
>> +reout:
>> + goto reout;
>
> Hrmpf.

:) commented

>
>> +Datum
>> +jsonb_typeof(PG_FUNCTION_ARGS)
>> +{
> ...
>> +}
>
> Hm, shouldn't that be in jsonfuncs.c?
No idea, i don't have an objection

send/recv for hstore is fixed too. Should I make new version of patch? Right now
it's placed on github. May be Andrew wants to change something?

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 15:48:20
Message-ID: CAHyXU0wpq+aN9EQZdVc3qC2cp_qeZBmzqwYtCxWyDFC-mQdY6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 5, 2014 at 12:44 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> send/recv functions are also needed for binary-format COPY. IMHO jsonb must
> have send/recv functions. All other built-in types have them, except for
> types like 'smgr', 'aclitem' and 'any*' that no-one should be using as
> column types.

Yes -- completely agree. I also consider the hstore functionality (in
particular, searching and access operators) to be essential
functionality.

I'm actually surprised we have an alternate binary wire format for
jsonb at all; json is explicitly text and I'm not sure what the use
case of sending the internal structure is. Meaning, maybe jsonb
send/recv should be a thin wrapper to sending the json string. The
hstore send/recv I think properly covers the case where client side
binary wire format actors would want to manage performance critical
cases that want to avoid parsing.

On Wed, Feb 5, 2014 at 1:21 AM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:
> Andrew provided us more information and we'll work on recv. What
> people think about testing this stuff ? btw, we don't have any
> regression test on this.

I'm intensely interested in this work; I consider it to be transformative.

I've *lightly* tested the jsonb/hstore functionality and so far
everything is working.

I still have concerns about the API. Aside from the stuff I mentioned
upthread I find the API split between jsonb and hstore to be a little
odd; a lot of useful bits (for example, the @> operator) come via the
hstore type only. So these types are joined at the hip for real work
which makes the diverging incomplete behaviors in functions like
populate_record() disconcerting. Another point I'm struggling with is
what jsonb brings to the table that isn't covered either hstore or
json; working through a couple of cases I find myself not using the
jsonb functionality except as a 'hstore json formatter' which the json
type covers. I'm probably being obtuse, but we have to be cautious
before plonking a couple of dozen extra functions in the public
schema.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 16:22:55
Message-ID: 52F2655F.1030408@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 10:48 AM, Merlin Moncure wrote:
> On Wed, Feb 5, 2014 at 12:44 AM, Heikki Linnakangas
> <hlinnakangas(at)vmware(dot)com> wrote:
>> send/recv functions are also needed for binary-format COPY. IMHO jsonb must
>> have send/recv functions. All other built-in types have them, except for
>> types like 'smgr', 'aclitem' and 'any*' that no-one should be using as
>> column types.
> Yes -- completely agree. I also consider the hstore functionality (in
> particular, searching and access operators) to be essential
> functionality.
>
> I'm actually surprised we have an alternate binary wire format for
> jsonb at all; json is explicitly text and I'm not sure what the use
> case of sending the internal structure is. Meaning, maybe jsonb
> send/recv should be a thin wrapper to sending the json string. The
> hstore send/recv I think properly covers the case where client side
> binary wire format actors would want to manage performance critical
> cases that want to avoid parsing.
>
>

The whole reason we have jsonb is to avoid reparsing where possible. In
fact, I'd rather have the send and recv functions in the jsonb code and
have hstore's functions call them, so we don't duplicate code.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 16:29:44
Message-ID: 52F266F8.8060908@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
>
>>
>>> +Datum
>>> +jsonb_typeof(PG_FUNCTION_ARGS)
>>> +{
>> ...
>>> +}
>>
>> Hm, shouldn't that be in jsonfuncs.c?
> No idea, i don't have an objection

No it shouldn't. The json equivalent function is in json.c, and needs to
be because it uses the parser internals that aren't exposed outside that
code.

>
> send/recv for hstore is fixed too. Should I make new version of patch?
> Right now it's placed on github. May be Andrew wants to change something?
>
>

I'll take a look, but I think we need to unify this so we use one set of
send/recv code for the two types if possible, as I just said to Merlin.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 16:33:25
Message-ID: CAHyXU0xdn+ko6GfwbL4OxiAjgaz6PAkvo71otGYBH3FdU88TZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 5, 2014 at 10:22 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> I'm actually surprised we have an alternate binary wire format for
>> jsonb at all; json is explicitly text and I'm not sure what the use
>> case of sending the internal structure is. Meaning, maybe jsonb
>> send/recv should be a thin wrapper to sending the json string. The
>> hstore send/recv I think properly covers the case where client side
>> binary wire format actors would want to manage performance critical
>> cases that want to avoid parsing.
>
> The whole reason we have jsonb is to avoid reparsing where possible

Sure; but on the server side. The wire format is for handling client
concerns. For example, the case you're arguing for would be for libpq
client to extract as jsonb as binary, manipulate it on a binary level,
then send it back as binary. I find this case to be something of a
stretch.

That being said, for binary dump/restore perhaps there's a performance
case to be made.

> In fact, I'd rather have the send and recv functions in the jsonb code and have
> hstore's functions call them, so we don't duplicate code.

yeah. Agree that there needs to be two sets of routines, not three.
I think a case could be made for the jsonb type could take either
json's or hstore's depending on if the above. FWIW, either way is
fine.

merlin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 16:40:49
Message-ID: 19267.1391618449@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Wed, Feb 5, 2014 at 10:22 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> The whole reason we have jsonb is to avoid reparsing where possible

> Sure; but on the server side. The wire format is for handling client
> concerns. For example, the case you're arguing for would be for libpq
> client to extract as jsonb as binary, manipulate it on a binary level,
> then send it back as binary. I find this case to be something of a
> stretch.

I'm with Merlin in thinking that the case for exposing a binary format
to clients is pretty weak, or at least a convincing use-case has not
been shown. Given the concerns upthread about security hazards in the
patch's existing recv code, and the fact that it's already February,
switching to "binary is the same as text" may well be the most prudent
path here.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 16:55:56
Message-ID: 52F26D1C.1000102@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 11:40 AM, Tom Lane wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Wed, Feb 5, 2014 at 10:22 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> The whole reason we have jsonb is to avoid reparsing where possible
>> Sure; but on the server side. The wire format is for handling client
>> concerns. For example, the case you're arguing for would be for libpq
>> client to extract as jsonb as binary, manipulate it on a binary level,
>> then send it back as binary. I find this case to be something of a
>> stretch.
> I'm with Merlin in thinking that the case for exposing a binary format
> to clients is pretty weak, or at least a convincing use-case has not
> been shown. Given the concerns upthread about security hazards in the
> patch's existing recv code, and the fact that it's already February,
> switching to "binary is the same as text" may well be the most prudent
> path here.
>
>

If we do that we're going to have to live with that forever, aren't we?
I don't see why there should be a convincing case for binary format for
nested hstore but not for jsonb.

If it were only for arbitrary libpq clietns I wouldn't bother so much.
To me the main case for binary format is that some people use COPY
BINARY for efficiency reasons, and I heard tell recently of someone
working on that as an option for pg_dump, which seems to me worth
considering.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 17:48:07
Message-ID: 20435.1391622487@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 02/05/2014 11:40 AM, Tom Lane wrote:
>> switching to "binary is the same as text" may well be the most prudent
>> path here.

> If we do that we're going to have to live with that forever, aren't we?

Yeah, but the other side of that coin is that we'll have to live forever
with whatever binary format we pick, too. If it turns out to be badly
designed, that could be much worse than eating some parsing costs during
dump/restore.

If we had infinite time/manpower, this wouldn't really be an issue.
We don't, though, and so I suggest that this may be one of the better
things to toss overboard.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 18:10:50
Message-ID: 52F27EAA.3000509@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 12:48 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 02/05/2014 11:40 AM, Tom Lane wrote:
>>> switching to "binary is the same as text" may well be the most prudent
>>> path here.
>> If we do that we're going to have to live with that forever, aren't we?
> Yeah, but the other side of that coin is that we'll have to live forever
> with whatever binary format we pick, too. If it turns out to be badly
> designed, that could be much worse than eating some parsing costs during
> dump/restore.
>
> If we had infinite time/manpower, this wouldn't really be an issue.
> We don't, though, and so I suggest that this may be one of the better
> things to toss overboard.
>
>

The main reason I'm prepared to consider this is the JSON parser seems
to be fairly efficient (See Oleg's recent stats) and in fact we'd more
or less be parsing the binary format on input anyway, so there's no
proof that a binary format is going to be hugely faster (or possibly
even that it will be faster at all).

If anyone else has opinions on this sing out pretty darn soon (like the
next 24 hours or so, before I begin.)

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 18:22:40
Message-ID: CAHyXU0zK0OUhLby11YgFB8w5OSfVp8HVYUHbakNGD=UQxN--3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 5, 2014 at 11:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> If we had infinite time/manpower, this wouldn't really be an issue.
> We don't, though, and so I suggest that this may be one of the better
> things to toss overboard.

The hstore send/recv functions have basically the same
(copy/pasted/name adjusted) implementation. Since hstore will
presumably remain (as the current hstore is) 'deep binary' and all of
Andres's gripes apply to the hstore as well, this change buys us
precisely zap from a time perspective; it comes down to which is
intrinsically the better choice.

merlin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 19:03:26
Message-ID: 52F28AFE.3050607@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/05/2014 07:48 AM, Merlin Moncure wrote:
> Another point I'm struggling with is
> what jsonb brings to the table that isn't covered either hstore or
> json; working through a couple of cases I find myself not using the
> jsonb functionality except as a 'hstore json formatter' which the json
> type covers. I'm probably being obtuse, but we have to be cautious
> before plonking a couple of dozen extra functions in the public
> schema.

There's three reasons why it's worthwhile:

1) user-friendliness: telling users they need to do "::JSON" and
"::HSTORE2" all the time is sufficiently annoying -- and prone to
causing errors -- to be a blocker to adoption by a certain, very
numerous, class of user.

2) performance: to the extent that we can operate entirely in JSONB and
not transform back and forth to JSON and HSTORE, function calls (and
index lookups) will be much faster. And given the competition, speed is
important.

3) growth: 9.4's JSONB functions are a prerequisite to developing richer
JSON querying capabilities in 9.5 and later, which will go beyond "JSON
formatting for HSTORE".

Frankly, if it were entirely up to me HSTORE2 would be part of core and
its only interface would be JSONB. But it's not. So this is a compromise.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 19:35:44
Message-ID: 52F29290.7000601@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 02:03 PM, Josh Berkus wrote:

> Frankly, if it were entirely up to me HSTORE2 would be part of core and
> its only interface would be JSONB. But it's not. So this is a compromise.
>

You could only do that by inventing a new type. But hstore2 isn't a new
type, it's meant to be the existing hstore type with new capabilities.

Incidentally, some work is being done by one of my colleagues on an
extension of gin/gist operators for indexing jsonb similarly to hstore2.
Now that will possibly be something we can bring into 9.4, although
we'll have to check how we go about pg_upgrade for that case.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 20:15:14
Message-ID: CAHyXU0zwUmZe3NyY-s-WkV3ba=jFfOsEjY7vj6jfQgR9T6UgUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 5, 2014 at 1:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/05/2014 07:48 AM, Merlin Moncure wrote:
>> Another point I'm struggling with is
>> what jsonb brings to the table that isn't covered either hstore or
>> json; working through a couple of cases I find myself not using the
>> jsonb functionality except as a 'hstore json formatter' which the json
>> type covers. I'm probably being obtuse, but we have to be cautious
>> before plonking a couple of dozen extra functions in the public
>> schema.
>
> There's three reasons why it's worthwhile:
>
> 1) user-friendliness: telling users they need to do "::JSON" and
> "::HSTORE2" all the time is sufficiently annoying -- and prone to
> causing errors -- to be a blocker to adoption by a certain, very
> numerous, class of user.

That's a legitimate point of concern. But in and of itself I'm sure
sure it warrants exposing a separate API.

> 2) performance: to the extent that we can operate entirely in JSONB and
> not transform back and forth to JSON and HSTORE, function calls (and
> index lookups) will be much faster. And given the competition, speed is
> important.

Not following this. I do not see how the presence of jsonb helps at
all. Client to server communication will be text->binary (and vice
versa) and handling within the server itself will be in binary. This
is the crux of my point.

> 3) growth: 9.4's JSONB functions are a prerequisite to developing richer
> JSON querying capabilities in 9.5 and later, which will go beyond "JSON
> formatting for HSTORE".

I kind of get this point. But in lieu of a practical use case today,
what's the rush to implement? I fully anticipate I'm out on left
field on this one (I have a cot and mini fridge there). The question
on the table is: what use cases (performance included) does jsonb
solve that is not solve can't be solved without it? With the possible
limited exception of andrew's yet to be delivered enhanced
deserialization routines, I can't think of any. If presented with
reasonable evidence I'll shut my yap, pronto.

> Frankly, if it were entirely up to me HSTORE2 would be part of core and
> its only interface would be JSONB. But it's not. So this is a compromise.

I don't. To be pedantic: hstore is in core, but packaged as an
extension. That's a very important distinction.

In fact, I'll go further and say it seem wise for all SQL standard
type work to happen in extensions. As long as it's an in core contrib
extension, I see no stigma to that whatsoever. It's not clear at all
to me why the json type was put to the public schema and now we're
about to double down with jsonb. Having things extension packaged
greatly eases concerns about future API changes because if problems
emerge it's not impossible to imagine compatibility extensions to
appear to bridge the gap if certain critical functions change. That's
exactly the sort of thing that we may want to happen here, I think.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 20:37:24
Message-ID: 52F2A104.4090001@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 03:15 PM, Merlin Moncure wrote:
> On Wed, Feb 5, 2014 at 1:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> On 02/05/2014 07:48 AM, Merlin Moncure wrote:
>>> Another point I'm struggling with is
>>> what jsonb brings to the table that isn't covered either hstore or
>>> json; working through a couple of cases I find myself not using the
>>> jsonb functionality except as a 'hstore json formatter' which the json
>>> type covers. I'm probably being obtuse, but we have to be cautious
>>> before plonking a couple of dozen extra functions in the public
>>> schema.
>> There's three reasons why it's worthwhile:
>>
>> 1) user-friendliness: telling users they need to do "::JSON" and
>> "::HSTORE2" all the time is sufficiently annoying -- and prone to
>> causing errors -- to be a blocker to adoption by a certain, very
>> numerous, class of user.
> That's a legitimate point of concern. But in and of itself I'm sure
> sure it warrants exposing a separate API.
>
>> 2) performance: to the extent that we can operate entirely in JSONB and
>> not transform back and forth to JSON and HSTORE, function calls (and
>> index lookups) will be much faster. And given the competition, speed is
>> important.
> Not following this. I do not see how the presence of jsonb helps at
> all. Client to server communication will be text->binary (and vice
> versa) and handling within the server itself will be in binary. This
> is the crux of my point.
>
>> 3) growth: 9.4's JSONB functions are a prerequisite to developing richer
>> JSON querying capabilities in 9.5 and later, which will go beyond "JSON
>> formatting for HSTORE".
> I kind of get this point. But in lieu of a practical use case today,
> what's the rush to implement? I fully anticipate I'm out on left
> field on this one (I have a cot and mini fridge there). The question
> on the table is: what use cases (performance included) does jsonb
> solve that is not solve can't be solved without it? With the possible
> limited exception of andrew's yet to be delivered enhanced
> deserialization routines, I can't think of any. If presented with
> reasonable evidence I'll shut my yap, pronto.
>
>> Frankly, if it were entirely up to me HSTORE2 would be part of core and
>> its only interface would be JSONB. But it's not. So this is a compromise.
> I don't. To be pedantic: hstore is in core, but packaged as an
> extension. That's a very important distinction.
>
> In fact, I'll go further and say it seem wise for all SQL standard
> type work to happen in extensions. As long as it's an in core contrib
> extension, I see no stigma to that whatsoever. It's not clear at all
> to me why the json type was put to the public schema and now we're
> about to double down with jsonb. Having things extension packaged
> greatly eases concerns about future API changes because if problems
> emerge it's not impossible to imagine compatibility extensions to
> appear to bridge the gap if certain critical functions change. That's
> exactly the sort of thing that we may want to happen here, I think.
>

The time for this discussion was months ago. I would not have spent many
many hours of my time if I thought it was going to be thrown away. I
find this attitude puzzling, to say the least. You were a major part of
the discussion when we said "OK, we'll leave json as it is (text based)
and add jsonb." That's exactly what we're doing.

And no, hstore is NOT in core. In core for a type means to me it's
builtin, with a fixed Oid.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 20:45:56
Message-ID: CAHyXU0xCLNgRR_cA7jK90SKUFTXS-t0wZbJibAsPiK0Vxy9b5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 5, 2014 at 2:37 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> The time for this discussion was months ago. I would not have spent many
> many hours of my time if I thought it was going to be thrown away. I find
> this attitude puzzling, to say the least. You were a major part of the
> discussion when we said "OK, we'll leave json as it is (text based) and add
> jsonb." That's exactly what we're doing.

certainly. I'll shut my yap; I understand your puzzlement. At the
time though, I had assumed the API was going to incorporate more of
the hstore feature set than it did.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 20:58:01
Message-ID: 52F2A5D9.6070405@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 03:45 PM, Merlin Moncure wrote:
> On Wed, Feb 5, 2014 at 2:37 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> The time for this discussion was months ago. I would not have spent many
>> many hours of my time if I thought it was going to be thrown away. I find
>> this attitude puzzling, to say the least. You were a major part of the
>> discussion when we said "OK, we'll leave json as it is (text based) and add
>> jsonb." That's exactly what we're doing.
> certainly. I'll shut my yap; I understand your puzzlement. At the
> time though, I had assumed the API was going to incorporate more of
> the hstore feature set than it did.
>

And we will. Specifically the indexing ops I mentioned upthread. We've
got done as much as could be done this cycle. That's how Postgres
development works.

One of the major complaints about json in 9.3 is that almost all the
functions and operators involve reparsing the json. The equivalent
operations for jsonb do not, and should accordingly be significantly
faster. That's what I have been spending my time on. I don't think
that's an inconsiderable advance.

cheers

andrew


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 21:03:06
Message-ID: 52F2A70A.6040705@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin,

> Not following this. I do not see how the presence of jsonb helps at
> all. Client to server communication will be text->binary (and vice
> versa) and handling within the server itself will be in binary. This
> is the crux of my point.

Except that handling it on the server, in binary, would require using
the HSTORE syntax. Otherwise you're converting from text JSON and back
whenever you want to nest functions.

> I kind of get this point. But in lieu of a practical use case today,
> what's the rush to implement? I fully anticipate I'm out on left
> field on this one (I have a cot and mini fridge there). The question
> on the table is: what use cases (performance included) does jsonb
> solve that is not solve can't be solved without it?

Indexed element extraction. JSON path queries. JSON manipulation.

If JSONB is in 9.4, then these are things we can build as extensions and
have available long before September 2015 -- in fact, we've already
started on a couple. If JSONB isn't in core as a data type, then we
have to wait for the 9.5 dev cycle to do anything.

> In fact, I'll go further and say it seem wise for all SQL standard
> type work to happen in extensions. As long as it's an in core contrib
> extension, I see no stigma to that whatsoever. It's not clear at all
> to me why the json type was put to the public schema and now we're
> about to double down with jsonb.

I'll agree that having hstore in contrib and json in core has been a
significant source of issues.

On 02/05/2014 12:45 PM, Merlin Moncure wrote:> certainly. I'll shut my
yap; I understand your puzzlement. At the
> time though, I had assumed the API was going to incorporate more of
> the hstore feature set than it did.

That was the original goal. However, Oleg and Teodor's late delivery of
Hstore2 limited what Andrew could do for JSONB before CF4 started.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 21:06:11
Message-ID: CAHyXU0w7Wd72Uo5_ck+Vmx1aDgey+J378CLP6vL=Fvvpe7+L9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 5, 2014 at 3:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> That was the original goal. However, Oleg and Teodor's late delivery of
> Hstore2 limited what Andrew could do for JSONB before CF4 started.

yeah. anyways, I'm good on this point.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 21:17:51
Message-ID: 52F2AA7F.1060103@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 04:06 PM, Merlin Moncure wrote:
> On Wed, Feb 5, 2014 at 3:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> That was the original goal. However, Oleg and Teodor's late delivery of
>> Hstore2 limited what Andrew could do for JSONB before CF4 started.

I also had issues. But this is the sort of thing that happens. We get
done as much as we can.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-05 23:59:09
Message-ID: 52F2D04D.3080602@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 01:10 PM, Andrew Dunstan wrote:
>
> On 02/05/2014 12:48 PM, Tom Lane wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> On 02/05/2014 11:40 AM, Tom Lane wrote:
>>>> switching to "binary is the same as text" may well be the most prudent
>>>> path here.
>>> If we do that we're going to have to live with that forever, aren't we?
>> Yeah, but the other side of that coin is that we'll have to live forever
>> with whatever binary format we pick, too. If it turns out to be badly
>> designed, that could be much worse than eating some parsing costs during
>> dump/restore.
>>
>> If we had infinite time/manpower, this wouldn't really be an issue.
>> We don't, though, and so I suggest that this may be one of the better
>> things to toss overboard.
>>
>>
>
>
> The main reason I'm prepared to consider this is the JSON parser seems
> to be fairly efficient (See Oleg's recent stats) and in fact we'd more
> or less be parsing the binary format on input anyway, so there's no
> proof that a binary format is going to be hugely faster (or possibly
> even that it will be faster at all).
>
> If anyone else has opinions on this sing out pretty darn soon (like
> the next 24 hours or so, before I begin.)

I got a slightly earlier start ;-) For people wanting to play along,
here's what this change looks like:
<https://github.com/feodor/postgres/commit/3fe899b3d7e8f806b14878da4a4e2331b0eb58e8>

I have a bit more cleanup to do and then I'll try to make new patches.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: adt Makefile, was Re: jsonb and nested hstore
Date: 2014-02-06 16:18:47
Message-ID: 52F3B5E7.6040508@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/01/2014 05:20 PM, Andres Freund wrote:
>> diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
>> >index 1ae9fa0..fd93d9b 100644
>> >--- a/src/backend/utils/adt/Makefile
>> >+++ b/src/backend/utils/adt/Makefile
>> >@@ -32,7 +32,8 @@ OBJS = acl.o arrayfuncs.o array_selfuncs.o array_typanalyze.o \
>> > tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
>> > tsvector.o tsvector_op.o tsvector_parser.o \
>> > txid.o uuid.o windowfuncs.o xml.o rangetypes_spgist.o \
>> >- rangetypes_typanalyze.o rangetypes_selfuncs.o
>> >+ rangetypes_typanalyze.o rangetypes_selfuncs.o \
>> >+ jsonb.o jsonb_support.o
> Odd, most OBJS lines are kept in alphabetical order, but that doesn't
> seem to be the case here.

This whole list is a mess, and we don't even have all the range_types
files following each other.

Worth cleaning up?

I'm actually wondering if it might be worth having some subgroups of
object files and then combining them into $OBJS.

Or it could just be left more or less as is - it's hardly a breakthrough
advance.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: adt Makefile, was Re: jsonb and nested hstore
Date: 2014-02-06 16:33:15
Message-ID: 17618.1391704395@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 02/01/2014 05:20 PM, Andres Freund wrote:
>> Odd, most OBJS lines are kept in alphabetical order, but that doesn't
>> seem to be the case here.

> This whole list is a mess, and we don't even have all the range_types
> files following each other.

> Worth cleaning up?

+1. It's just neatnik-ism, but isn't compulsive neatnik-ism pretty
much a job requirement for programmers? It's hard enough dealing
with necessary complexities without having to wonder if some seemingly
arbitrary choice has hidden meanings.

> I'm actually wondering if it might be worth having some subgroups of
> object files and then combining them into $OBJS.

Nah, let's just alphabetize them and be done. The Makefile has no
reason to care about subgroups of those files.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: adt Makefile, was Re: jsonb and nested hstore
Date: 2014-02-06 16:38:28
Message-ID: 20140206163828.GW10723@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan wrote:

> This whole list is a mess, and we don't even have all the
> range_types files following each other.
>
> Worth cleaning up?
>
> I'm actually wondering if it might be worth having some subgroups of
> object files and then combining them into $OBJS.

Doesn't the MSVC build stuff parse OBJS definitions?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-06 17:00:58
Message-ID: E1104C86-13C8-4B94-A98D-573B041537E8@justatheory.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Feb 5, 2014, at 3:59 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> I got a slightly earlier start ;-) For people wanting to play along, here's what this change looks like: <https://github.com/feodor/postgres/commit/3fe899b3d7e8f806b14878da4a4e2331b0eb58e8>

Man I love seeing all that read. :-)

D


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: adt Makefile, was Re: jsonb and nested hstore
Date: 2014-02-06 17:03:49
Message-ID: 52F3C075.1050106@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/06/2014 11:38 AM, Alvaro Herrera wrote:
> Andrew Dunstan wrote:
>
>> This whole list is a mess, and we don't even have all the
>> range_types files following each other.
>>
>> Worth cleaning up?
>>
>> I'm actually wondering if it might be worth having some subgroups of
>> object files and then combining them into $OBJS.
> Doesn't the MSVC build stuff parse OBJS definitions?
>

Good point. At least in some cases it does.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-06 23:47:31
Message-ID: 52F41F13.4070903@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
> Should I make new version of patch? Right now it's placed on github.
> May be Andrew wants to change something?
>

Attached are updated patches.

Apart from the things Teodor has fixed, this includes

* switching to using text representation in jsonb send/recv
* implementation of jsonb_array_elements_text that we need now we have
json_array_elements_text
* some code fixes requested in code reviews, plus some other tidying
and refactoring.

cheers

andrew

Attachment Content-Type Size
jsonb-10.patch.gz application/x-gzip 31.6 KB
nested-hstore-10.patch.gz application/x-gzip 66.2 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: adt Makefile, was Re: jsonb and nested hstore
Date: 2014-02-07 01:02:27
Message-ID: CAB7nPqR27vvmBzcOYOv7AF5jVuFeb0VHd5ZGmEHXzV_HdzuSJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 7, 2014 at 1:18 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
> On 02/01/2014 05:20 PM, Andres Freund wrote:
>>>
>>> diff --git a/src/backend/utils/adt/Makefile
>>> b/src/backend/utils/adt/Makefile
>>> >index 1ae9fa0..fd93d9b 100644
>>> >--- a/src/backend/utils/adt/Makefile
>>> >+++ b/src/backend/utils/adt/Makefile
>>> >@@ -32,7 +32,8 @@ OBJS = acl.o arrayfuncs.o array_selfuncs.o
>>> > array_typanalyze.o \
>>> > tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
>>> > tsvector.o tsvector_op.o tsvector_parser.o \
>>> > txid.o uuid.o windowfuncs.o xml.o rangetypes_spgist.o \
>>> >- rangetypes_typanalyze.o rangetypes_selfuncs.o
>>> >+ rangetypes_typanalyze.o rangetypes_selfuncs.o \
>>> >+ jsonb.o jsonb_support.o
>>
>> Odd, most OBJS lines are kept in alphabetical order, but that doesn't
>> seem to be the case here.
>
>
>
> This whole list is a mess, and we don't even have all the range_types files
> following each other.
>
> Worth cleaning up?
+1. Yes please.
--
Michael


From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Teodor Sigaev" <teodor(at)sigaev(dot)ru>, "Andres Freund" <andres(at)2ndquadrant(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-09 14:30:10
Message-ID: 1def7d0e9b169a889b7a1a6e55c0452b.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, February 7, 2014 00:47, Andrew Dunstan wrote:
>
> Attached are updated patches.
>
> jsonb-10.patch.gz
> nested-hstore-10.patch.gz

Small changes to json documentation, mostly of typo caliber.

Thanks,

Erik Rijkers

Attachment Content-Type Size
datatype.sgml.diff text/x-patch 1.1 KB
func.sgml.diff text/x-patch 5.2 KB

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 07:24:07
Message-ID: 52F87E97.6090804@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/06/2014 01:48 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 02/05/2014 11:40 AM, Tom Lane wrote:
>>> switching to "binary is the same as text" may well be the most prudent
>>> path here.
>
>> If we do that we're going to have to live with that forever, aren't we?
>
> Yeah, but the other side of that coin is that we'll have to live forever
> with whatever binary format we pick, too. If it turns out to be badly
> designed, that could be much worse than eating some parsing costs during
> dump/restore.
>
> If we had infinite time/manpower, this wouldn't really be an issue.
> We don't, though, and so I suggest that this may be one of the better
> things to toss overboard.

Can't we just reject attempts to transfer these via binary copy,
allowing only a text format? So rather than sending text when the binary
is requested, we just require clients to use text for this type.

That way it's possible to add the desired binary format later, without
rushed decisions.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Hannu Krosing <hannu(at)krosing(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 09:41:47
Message-ID: 52F89EDB.8080405@krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/05/2014 06:48 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 02/05/2014 11:40 AM, Tom Lane wrote:
>>> switching to "binary is the same as text" may well be the most prudent
>>> path here.
>> If we do that we're going to have to live with that forever, aren't we?
> Yeah, but the other side of that coin is that we'll have to live forever
> with whatever binary format we pick, too. If it turns out to be badly
> designed, that could be much worse than eating some parsing costs during
> dump/restore.
The fastest and lowest parsing cost format for "JSON" is tnetstrings
http://tnetstrings.org/ why not use it as the binary wire format ?

It would be as binary as it gets and still be generally parse-able by
lots of different platforms, at leas by all of these we care about.

Cheers
Hannu


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 10:05:22
Message-ID: 20140210100522.GE26601@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:
> * switching to using text representation in jsonb send/recv

> +/*
> + * jsonb type recv function
> + *
> + * the type is sent as text in binary mode, so this is almost the same
> + * as the input function.
> + */
> +Datum
> +jsonb_recv(PG_FUNCTION_ARGS)
> +{
> + StringInfo buf = (StringInfo) PG_GETARG_POINTER(0);
> + text *result = cstring_to_text_with_len(buf->data, buf->len);
> +
> + return deserialize_json_text(result);
> +}

> +/*
> + * jsonb type send function
> + *
> + * Just send jsonb as a string of text
> + */
> +Datum
> +jsonb_send(PG_FUNCTION_ARGS)
> +{
> + Jsonb *jb = PG_GETARG_JSONB(0);
> + StringInfoData buf;
> + char *out;
> +
> + out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), VARSIZE(jb));
> +
> + pq_begintypsend(&buf);
> + pq_sendtext(&buf, out, strlen(out));
> + PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
> +}

I'd suggest making the format discernible from possible different future
formats, to allow introducing a proper binary at some later time. Maybe
just send a int8 first, containing the format.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 10:10:59
Message-ID: 52F8A5B3.5080809@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/10/2014 11:05 AM, Andres Freund wrote:
> Hi,
>
> On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:
>> * switching to using text representation in jsonb send/recv
>> +/*
>> + * jsonb type recv function
>> + *
>> + * the type is sent as text in binary mode, so this is almost the same
>> + * as the input function.
>> + */
>> +Datum
>> +jsonb_recv(PG_FUNCTION_ARGS)
>> +{
>> + StringInfo buf = (StringInfo) PG_GETARG_POINTER(0);
>> + text *result = cstring_to_text_with_len(buf->data, buf->len);
>> +
>> + return deserialize_json_text(result);
>> +}
>> +/*
>> + * jsonb type send function
>> + *
>> + * Just send jsonb as a string of text
>> + */
>> +Datum
>> +jsonb_send(PG_FUNCTION_ARGS)
>> +{
>> + Jsonb *jb = PG_GETARG_JSONB(0);
>> + StringInfoData buf;
>> + char *out;
>> +
>> + out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), VARSIZE(jb));
>> +
>> + pq_begintypsend(&buf);
>> + pq_sendtext(&buf, out, strlen(out));
>> + PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
>> +}
> I'd suggest making the format discernible from possible different future
> formats, to allow introducing a proper binary at some later time. Maybe
> just send a int8 first, containing the format.
+10

Especially as this is one type where we may want add type-specific
compression options at some point

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 12:27:59
Message-ID: 52F8C5CF.4060605@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/10/2014 05:05 AM, Andres Freund wrote:
> Hi,
>
> On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:
>> * switching to using text representation in jsonb send/recv
>> +/*
>> + * jsonb type recv function
>> + *
>> + * the type is sent as text in binary mode, so this is almost the same
>> + * as the input function.
>> + */
>> +Datum
>> +jsonb_recv(PG_FUNCTION_ARGS)
>> +{
>> + StringInfo buf = (StringInfo) PG_GETARG_POINTER(0);
>> + text *result = cstring_to_text_with_len(buf->data, buf->len);
>> +
>> + return deserialize_json_text(result);
>> +}
>> +/*
>> + * jsonb type send function
>> + *
>> + * Just send jsonb as a string of text
>> + */
>> +Datum
>> +jsonb_send(PG_FUNCTION_ARGS)
>> +{
>> + Jsonb *jb = PG_GETARG_JSONB(0);
>> + StringInfoData buf;
>> + char *out;
>> +
>> + out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), VARSIZE(jb));
>> +
>> + pq_begintypsend(&buf);
>> + pq_sendtext(&buf, out, strlen(out));
>> + PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
>> +}
> I'd suggest making the format discernible from possible different future
> formats, to allow introducing a proper binary at some later time. Maybe
> just send a int8 first, containing the format.
>

Teodor privately suggested something similar. I was thinking of just
sending a version byte, which for now would be '\x01'. An int8 seems
like more future-proofing provision than we really need.

cheers

andrew


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 12:39:04
Message-ID: 20140210123904.GA10885@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
> On 02/10/2014 05:05 AM, Andres Freund wrote:
> >I'd suggest making the format discernible from possible different future
> >formats, to allow introducing a proper binary at some later time. Maybe
> >just send a int8 first, containing the format.
> >
>
> Teodor privately suggested something similar. I was thinking of just
> sending a version byte, which for now would be '\x01'. An int8 seems like
> more future-proofing provision than we really need.

Hm. Isn't that just about the same? I was thinking of the c type int8,
not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
to do it manually inside the string.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 13:18:06
Message-ID: 52F8D18E.4070205@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/10/2014 07:39 AM, Andres Freund wrote:
> On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
>> On 02/10/2014 05:05 AM, Andres Freund wrote:
>>> I'd suggest making the format discernible from possible different future
>>> formats, to allow introducing a proper binary at some later time. Maybe
>>> just send a int8 first, containing the format.
>>>
>> Teodor privately suggested something similar. I was thinking of just
>> sending a version byte, which for now would be '\x01'. An int8 seems like
>> more future-proofing provision than we really need.
> Hm. Isn't that just about the same? I was thinking of the c type int8,
> not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
> to do it manually inside the string.

OK, works for me. I'm tied up for a couple of days, will do this when
I'm back on deck.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 15:13:57
Message-ID: 9645.1392045237@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> On 02/06/2014 01:48 AM, Tom Lane wrote:
>>> switching to "binary is the same as text" may well be the most prudent
>>> path here.

> Can't we just reject attempts to transfer these via binary copy,
> allowing only a text format? So rather than sending text when the binary
> is requested, we just require clients to use text for this type.

That used to be the case, back when we didn't have send/recv functions for
all built-in types; and client-code authors complained bitterly about it.
It's pretty much unworkable if the text/binary choice is being made by
a code level that doesn't have complete understanding of the queries it's
transmitting. Consider "SELECT * FROM ..."; how are you going to know
which columns to request in binary and which in text? Even if you're
willing to do trial and error (ie, it's okay to cause transaction
rollbacks), the backend isn't very helpful about telling you exactly
which column(s) would need to be requested as text.

I think the downthread solution of prepending a type-specific format ID
byte is a better answer for giving us flexibility down the road.

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 17:59:53
Message-ID: CAHyXU0wUPqESRC-wrbHvDHp0sZL+kKmz=9Wb6DH29Jj+o1O3AQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
>> On 02/10/2014 05:05 AM, Andres Freund wrote:
>> >I'd suggest making the format discernible from possible different future
>> >formats, to allow introducing a proper binary at some later time. Maybe
>> >just send a int8 first, containing the format.
>> >
>>
>> Teodor privately suggested something similar. I was thinking of just
>> sending a version byte, which for now would be '\x01'. An int8 seems like
>> more future-proofing provision than we really need.
>
> Hm. Isn't that just about the same? I was thinking of the c type int8,
> not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
> to do it manually inside the string.

-1. Currently no other wire format types send version and it's not
clear why this one is special. We've changed the wire format versions
before and it's upon the client to deal with those changes. The
server version *is* the version basically. If a broader solution
exists I think it should be addressed broadly. Versioning one type
only IMNSHO is a complete hack.

merlin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 18:15:53
Message-ID: 15780.1392056153@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
>>> Teodor privately suggested something similar. I was thinking of just
>>> sending a version byte, which for now would be '\x01'. An int8 seems like
>>> more future-proofing provision than we really need.

> -1. Currently no other wire format types send version and it's not
> clear why this one is special. We've changed the wire format versions
> before and it's upon the client to deal with those changes.

Really? How would you expect to do that, exactly? In particular,
how would you propose that a binary pg_dump file be reloadable if
we redefine the binary format down the road without having made
provision like this?

> Versioning one type only IMNSHO is a complete hack.

I don't feel a need for versioning int, or float8, or most other types;
and that includes the ones for which we've previously defined binary
format as equivalent to text (enums). In this case we know that we're not
totally satisfied with the binary format we're defining today, so I think
a type-specific escape hatch is a reasonable solution.

Moreover, I don't especially buy tying it to server version, even if we
had an information pathway that would provide that reliably in all
contexts. Granting the presumption that more than one data type would
want such versioning, it's still possible that different data types would
have different ideas about what they needed to do and where the cutover
points were.

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 19:07:35
Message-ID: CAHyXU0xN3YZDDyj4x_PcbE2CmrTzMtg-5w3S2VpoD54gb2=JGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 10, 2014 at 12:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>> On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
>>>> Teodor privately suggested something similar. I was thinking of just
>>>> sending a version byte, which for now would be '\x01'. An int8 seems like
>>>> more future-proofing provision than we really need.
>
>> -1. Currently no other wire format types send version and it's not
>> clear why this one is special. We've changed the wire format versions
>> before and it's upon the client to deal with those changes.
>
> Really? How would you expect to do that, exactly? In particular,
> how would you propose that a binary pg_dump file be reloadable if
> we redefine the binary format down the road without having made
> provision like this?
>
>> Versioning one type only IMNSHO is a complete hack.
>
> I don't feel a need for versioning int, or float8, or most other types;
> and that includes the ones for which we've previously defined binary
> format as equivalent to text (enums). In this case we know that we're not
> totally satisfied with the binary format we're defining today, so I think
> a type-specific escape hatch is a reasonable solution.
>
> Moreover, I don't especially buy tying it to server version, even if we
> had an information pathway that would provide that reliably in all
> contexts.

Why not? Furthermore what are we doing now? If we need a binary
format contract that needs to be separated from this discussion.

I've written (along with Andrew C) the only serious attempt to deal
with client side binary format handling (http://libpqtypes.esilo.com/)
and in all interesting cases it depends on the server version to
define binary parsing behaviors. I agree WRT float8, etc but other
types have changed in a couple of cases and it's always been with the
version. I find it highly unlikely that any compatibility behaviors
are going to be defined *for each and every returned datum* now, or
ever...so even if it's a few bytes lost, why do it? Intra-version
compatibility issues should they ever have to be handled would be more
likely handled at connection- or query- time.

Point being, if an escape hatch is needed, I'm near 100% certain this
is not the right place to do it. Binary wire format compatibility is
a complex topic and proposed solution ISTM is not at all fleshed out.

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 23:02:14
Message-ID: 20140210230214.GE15246@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-10 11:59:53 -0600, Merlin Moncure wrote:
> On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
> >> On 02/10/2014 05:05 AM, Andres Freund wrote:
> >> >I'd suggest making the format discernible from possible different future
> >> >formats, to allow introducing a proper binary at some later time. Maybe
> >> >just send a int8 first, containing the format.
> >> >
> >>
> >> Teodor privately suggested something similar. I was thinking of just
> >> sending a version byte, which for now would be '\x01'. An int8 seems like
> >> more future-proofing provision than we really need.
> >
> > Hm. Isn't that just about the same? I was thinking of the c type int8,
> > not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
> > to do it manually inside the string.
>
> -1. Currently no other wire format types send version and it's not
> clear why this one is special. We've changed the wire format versions
> before and it's upon the client to deal with those changes. The
> server version *is* the version basically. If a broader solution
> exists I think it should be addressed broadly. Versioning one type
> only IMNSHO is a complete hack.

I don't find that very convincing. The entire reason jsonb exists is
because the parsing overhead of text json is significant, so it stands
to reason that soon somebody will try to work on a better wire protocol,
even if the current code cannot be made ready for 9.4. And I don't think
past instability of binary type's formats is a good reason for
*needlessly* breaking stuff like binary COPYs.
And it's not like one prefixed byte has any real-world relevant cost.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 23:35:12
Message-ID: CAHyXU0wRnG6GxGdz5o_n+bxmAPF0x3q2kqYfYWo-Tb4K+W5-xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 10, 2014 at 5:02 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-02-10 11:59:53 -0600, Merlin Moncure wrote:
>> On Mon, Feb 10, 2014 at 6:39 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > On 2014-02-10 07:27:59 -0500, Andrew Dunstan wrote:
>> >> On 02/10/2014 05:05 AM, Andres Freund wrote:
>> >> >I'd suggest making the format discernible from possible different future
>> >> >formats, to allow introducing a proper binary at some later time. Maybe
>> >> >just send a int8 first, containing the format.
>> >> >
>> >>
>> >> Teodor privately suggested something similar. I was thinking of just
>> >> sending a version byte, which for now would be '\x01'. An int8 seems like
>> >> more future-proofing provision than we really need.
>> >
>> > Hm. Isn't that just about the same? I was thinking of the c type int8,
>> > not the 64bit type. It seems cleaner to do a pg_sendint(..., 1, 1) than
>> > to do it manually inside the string.
>>
>> -1. Currently no other wire format types send version and it's not
>> clear why this one is special. We've changed the wire format versions
>> before and it's upon the client to deal with those changes. The
>> server version *is* the version basically. If a broader solution
>> exists I think it should be addressed broadly. Versioning one type
>> only IMNSHO is a complete hack.
>
> I don't find that very convincing. The entire reason jsonb exists is
> because the parsing overhead of text json is significant, so it stands
> to reason that soon somebody will try to work on a better wire protocol,
> even if the current code cannot be made ready for 9.4. And I don't think
> past instability of binary type's formats is a good reason for
> *needlessly* breaking stuff like binary COPYs.
> And it's not like one prefixed byte has any real-world relevant cost.

The point is, why does this one type get a version id? Imagine a
hypothetical program that sent/received the binary format for jsonb.
All you have to to is manage the version flag appropriately, right?

Wrong. You still need to have code that checks the server version and
see if it's supported (particularly for sending) and as there is *no
protocol negotiation of the formats at present it's all going to boil
down to if version = X do Y*. How does the server know which
'versions' are ok to send? It doesn't. Follow along with me here:
Suppose we don't introduce a version flag today and change the format
to some more exotic structure for 9.5. How has the version flag made
things easier for the client? It hasn't. The client goes "if version
= X do Y".

I guess you could argue that having a version flag could, say, allow
libpq clients to gracefully error out if, say, a old non-exotic-format
speaking libpq happens to connect to a newer sever -- assuming the
client actually bothered to check the flag. That's zero help to the
client though -- regardless the compatibility isn't established and
that's zero help to other binary formats that we have=, and probably
will continue to-, change. What about them? Are we now, at the
upteenth hour of the final commit fest, suddenly deciding that binary
wire formats going to be compatible across versions?

The kinda low effort way to deal with binary format compatibility is
to simply document the existing formats and document format changes in
some convenient place. The 'real' long term path to doing it IMO is
to abstract out a shared/client server type library with some protocol
negotiation features. Then, at connection time, the client/server
agree on what's the optimal way to send things -- perhaps the client
can signal things like 'want compression for long datums'.

The only case for a version flag at the data point level is if the
server is sending version X at this tuple and version Y at that tuple.
I don't think that's a makable case. Some might say, "what about a
compression bit based on compressibility/length?" and to that I'd
answer: why is that handling specific to the json type...are
text/bytea/arrays not worth that feature too?

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 23:38:29
Message-ID: 20140210233829.GA31598@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-10 17:35:12 -0600, Merlin Moncure wrote:
> Wrong. You still need to have code that checks the server version and
> see if it's supported (particularly for sending) and as there is *no
> protocol negotiation of the formats at present it's all going to boil
> down to if version = X do Y*. How does the server know which
> 'versions' are ok to send? It doesn't. Follow along with me here:
> Suppose we don't introduce a version flag today and change the format
> to some more exotic structure for 9.5. How has the version flag made
> things easier for the client? It hasn't. The client goes "if version
> = X do Y".

think of binary COPY outputting data in 9.4 and then trying to import
that data into 9.5. That's the interesting case here.

> What about them? Are we now, at the
> upteenth hour of the final commit fest, suddenly deciding that binary
> wire formats going to be compatible across versions?

It has been a concern before.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 23:48:32
Message-ID: CAHyXU0zCK7umtv=8-m0JFR1iXsX1w85pfOtFVp4yYsU=VmEuZg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 10, 2014 at 5:38 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-02-10 17:35:12 -0600, Merlin Moncure wrote:
>> Wrong. You still need to have code that checks the server version and
>> see if it's supported (particularly for sending) and as there is *no
>> protocol negotiation of the formats at present it's all going to boil
>> down to if version = X do Y*. How does the server know which
>> 'versions' are ok to send? It doesn't. Follow along with me here:
>> Suppose we don't introduce a version flag today and change the format
>> to some more exotic structure for 9.5. How has the version flag made
>> things easier for the client? It hasn't. The client goes "if version
>> = X do Y".
>
> think of binary COPY outputting data in 9.4 and then trying to import
> that data into 9.5. That's the interesting case here.

right, json could be made work, but any other format change introduced
to any other already existing type will break. That's not a real
solution unless we decree henceforth that no formats will change from
here on in, in which case I withdraw my objection.

I think COPY binary has exactly the same set of considerations as the
client side. If you want to operate cleanly between versions (which
has never been promised in the past), you have to encode in a header
the kinds of things the server would need to parse it properly.
Starting with, but not necessarily limited to, the encoding server's
version.

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-10 23:52:51
Message-ID: 20140210235251.GB31598@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-10 17:48:32 -0600, Merlin Moncure wrote:
> On Mon, Feb 10, 2014 at 5:38 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-02-10 17:35:12 -0600, Merlin Moncure wrote:
> >> Wrong. You still need to have code that checks the server version and
> >> see if it's supported (particularly for sending) and as there is *no
> >> protocol negotiation of the formats at present it's all going to boil
> >> down to if version = X do Y*. How does the server know which
> >> 'versions' are ok to send? It doesn't. Follow along with me here:
> >> Suppose we don't introduce a version flag today and change the format
> >> to some more exotic structure for 9.5. How has the version flag made
> >> things easier for the client? It hasn't. The client goes "if version
> >> = X do Y".
> >
> > think of binary COPY outputting data in 9.4 and then trying to import
> > that data into 9.5. That's the interesting case here.
>
> right, json could be made work, but any other format change introduced
> to any other already existing type will break. That's not a real
> solution unless we decree henceforth that no formats will change from
> here on in, in which case I withdraw my objection.

Sure, it's not a full solution. But it's better than nothing, and it's
likely that we'll see breakage soonish. I don't think there's been much
recent mucking around with incompatible binary formats?

> I think COPY binary has exactly the same set of considerations as the
> client side. If you want to operate cleanly between versions (which
> has never been promised in the past), you have to encode in a header
> the kinds of things the server would need to parse it properly.
> Starting with, but not necessarily limited to, the encoding server's
> version.

It works in enough cases atm that it's worthwile trying to keep it
working. Sure, it could be better, but it's what we have right now. Atm
it's e.g. the only realistic way to copy larger amounts of bytea between
servers without copying the entire cluster.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 00:16:15
Message-ID: CAHyXU0x4FxigqeOWwx7JdvCrRODoF8v1nK5tpUX64CkSUo=Nyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 10, 2014 at 5:52 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> It works in enough cases atm that it's worthwile trying to keep it
> working. Sure, it could be better, but it's what we have right now. Atm
> it's e.g. the only realistic way to copy larger amounts of bytea between
> servers without copying the entire cluster.

That's the thing -- it might work today, but what about tomorrow?
We'd be sending the wrong signals. People start building processes
around all of this and now we've painted ourselves into a box. Better
in my mind to simply educate users that this practice is dangerous and
unsupported, as we used to do. I guess until now. It seems completely
odd to me that we're attaching a case to the jsonb type, in the wrong
way -- something that we've never attached to any other type before.
For example, why didn't we attach a version code to the json type send
function? Wasn't the whole point of this is that jsonb send/recv be
more spiritually closer to json? If we want to introduce alternative
type formats in the 9.5 cycle, why can't we attach version based
encoding handling to *that* problem?

The more angles I look at this from the more it looks messy and rushed.

Notwithstanding all the above, I figure here enough smart people
disagree (once again, heh) to call it consensus.

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 00:24:39
Message-ID: 20140211002439.GC31598@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-10 18:16:15 -0600, Merlin Moncure wrote:
> On Mon, Feb 10, 2014 at 5:52 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > It works in enough cases atm that it's worthwile trying to keep it
> > working. Sure, it could be better, but it's what we have right now. Atm
> > it's e.g. the only realistic way to copy larger amounts of bytea between
> > servers without copying the entire cluster.
>
> That's the thing -- it might work today, but what about tomorrow?
> We'd be sending the wrong signals. People start building processes
> around all of this and now we've painted ourselves into a box.

That ship has sailed.

> Better in my mind to simply educate users that this practice is dangerous and
> unsupported, as we used to do.

But we don't have any alternatives for such scenarios, so that just
amounts to "screw you". If there are good reason for just breaking
binary protocol compatibility, I can live with that, but that's really
not the case here. The additional amount of code is *miniscule*, even
after adding a real binary protocol format since all the code has to be
there for the plain send/recv functions anyway.

The amount of interesting and acceptable binary protocol changes has
gotten lower in step with the acceptance of on-disk compatibility
changes, which isn't particularly surprising.

> I guess until now. It seems completely
> odd to me that we're attaching a case to the jsonb type, in the wrong
> way -- something that we've never attached to any other type before.
> For example, why didn't we attach a version code to the json type send
> function? Wasn't the whole point of this is that jsonb send/recv be
> more spiritually closer to json? If we want to introduce alternative
> type formats in the 9.5 cycle, why can't we attach version based
> encoding handling to *that* problem?

That doesn't make any sense to me. jsonb is a separate type because it
behaves differently than json. So I don't see how that plays a role
here.

And if we add a new format version in 9.5 we need to make it discernible
from the 9.4 format. Without space for a format indicator we'd have to
resort to ugly tricks like defining the high bit in the first byte set
indicates the new version. I don't see the improvement here.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 00:24:44
Message-ID: 27485.1392078284@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> right, json could be made work, but any other format change introduced
> to any other already existing type will break. That's not a real
> solution unless we decree henceforth that no formats will change from
> here on in, in which case I withdraw my objection.

Well, I don't recall that we've made a practice of changing binary formats
a lot. Doing so would break existing dumps, which is something we
strenuously avoid.

Even granting that sometime in the future we invent infrastructure to do
the kind of protocol negotiation you're talking about, one byte per JSON
value seems like a cheap and worthwhile cross-check that both ends came
to the same conclusion about what to send.

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 00:33:45
Message-ID: CAHyXU0z_YqxSD=B0PccHk158gvdw+KRQKE9ecyx35pOZDh8Ecw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> And if we add a new format version in 9.5 we need to make it discernible
> from the 9.4 format. Without space for a format indicator we'd have to
> resort to ugly tricks like defining the high bit in the first byte set
> indicates the new version. I don't see the improvement here.

Point being: a 9.5 binary format reading server could look for a magic
token in the beginning of the file which would indicate the presence
of a header. The server could then make intelligent decisions about
reading data inside the file which would be follow exactly the same
kinds of decisions binary format consuming client code would make.
Perhaps it would be a simple check on version, or something more
complex that would involve a negotiation. The 'format' indicator,
should version not be precise enough, needs to be in the header, not
passed with every instance of the data type, and certainly not for one
type in the absence of others.

merlin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 00:39:39
Message-ID: 27999.1392079179@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> And if we add a new format version in 9.5 we need to make it discernible
>> from the 9.4 format. Without space for a format indicator we'd have to
>> resort to ugly tricks like defining the high bit in the first byte set
>> indicates the new version. I don't see the improvement here.

> Point being: a 9.5 binary format reading server could look for a magic
> token in the beginning of the file which would indicate the presence
> of a header. The server could then make intelligent decisions about
> reading data inside the file which would be follow exactly the same
> kinds of decisions binary format consuming client code would make.
> Perhaps it would be a simple check on version, or something more
> complex that would involve a negotiation. The 'format' indicator,
> should version not be precise enough, needs to be in the header, not
> passed with every instance of the data type, and certainly not for one
> type in the absence of others.

Basically, you want to move the goalposts to somewhere that's not only
out of reach today, but probably a few counties away from the stadium.
I don't see this happening at all frankly, because nobody has been
interested enough to work on something like it up to now. And I
definitely don't see it as appropriate to block improvement of jsonb
until this happens.

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 01:01:48
Message-ID: CAHyXU0zTWjyLMnF08_CtWyjfHe0_6ZwNxxRWLcsR2x49x4bamQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 10, 2014 at 6:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>> And if we add a new format version in 9.5 we need to make it discernible
>>> from the 9.4 format. Without space for a format indicator we'd have to
>>> resort to ugly tricks like defining the high bit in the first byte set
>>> indicates the new version. I don't see the improvement here.
>
>> Point being: a 9.5 binary format reading server could look for a magic
>> token in the beginning of the file which would indicate the presence
>> of a header. The server could then make intelligent decisions about
>> reading data inside the file which would be follow exactly the same
>> kinds of decisions binary format consuming client code would make.
>> Perhaps it would be a simple check on version, or something more
>> complex that would involve a negotiation. The 'format' indicator,
>> should version not be precise enough, needs to be in the header, not
>> passed with every instance of the data type, and certainly not for one
>> type in the absence of others.
>
> Basically, you want to move the goalposts to somewhere that's not only
> out of reach today, but probably a few counties away from the stadium.
> I don't see this happening at all frankly, because nobody has been
> interested enough to work on something like it up to now. And I
> definitely don't see it as appropriate to block improvement of jsonb
> until this happens.

That's completely unfair. I'm arguing *not* to attach version
dependency expectations to the jsonb type, at all, not the other way
around. If you want to do that, fine, but do it *later* as in, 9.5,
or beyond. I just gave an example of how binary format changes could
be worked in later.

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 01:07:46
Message-ID: 20140211010746.GD31598@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-10 19:01:48 -0600, Merlin Moncure wrote:
> On Mon, Feb 10, 2014 at 6:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> >> On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >>> And if we add a new format version in 9.5 we need to make it discernible
> >>> from the 9.4 format. Without space for a format indicator we'd have to
> >>> resort to ugly tricks like defining the high bit in the first byte set
> >>> indicates the new version. I don't see the improvement here.
> >
> >> Point being: a 9.5 binary format reading server could look for a magic
> >> token in the beginning of the file which would indicate the presence
> >> of a header. The server could then make intelligent decisions about
> >> reading data inside the file which would be follow exactly the same
> >> kinds of decisions binary format consuming client code would make.
> >> Perhaps it would be a simple check on version, or something more
> >> complex that would involve a negotiation. The 'format' indicator,
> >> should version not be precise enough, needs to be in the header, not
> >> passed with every instance of the data type, and certainly not for one
> >> type in the absence of others.
> >
> > Basically, you want to move the goalposts to somewhere that's not only
> > out of reach today, but probably a few counties away from the stadium.
> > I don't see this happening at all frankly, because nobody has been
> > interested enough to work on something like it up to now. And I
> > definitely don't see it as appropriate to block improvement of jsonb
> > until this happens.
>
> That's completely unfair. I'm arguing *not* to attach version
> dependency expectations to the jsonb type, at all, not the other way
> around. If you want to do that, fine, but do it *later* as in, 9.5,
> or beyond. I just gave an example of how binary format changes could
> be worked in later.

Comeon. Your way requires building HEAPS of new and generic
infrastructure in 9.5 and would only work for binary copy. The proposed
way requires about two lines of code. Without the generic infrastructure
we'd end up relying on some intracacies like the meaning of high bit in
the first byte or such.

Anyway, that's it on this subthread from me,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Dunstan <pgsql(at)tomd(dot)cc>
To: Hannu Krosing <hannu(at)krosing(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 01:50:54
Message-ID: CAPPfrux8FGVCCMAWoaCCFXv0hQ=JyirJg5CtB04VwU4e0Ja3Qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10 February 2014 20:11, Hannu Krosing <hannu(at)krosing(dot)net> wrote:
> The fastest and lowest parsing cost format for "JSON" is tnetstrings
> http://tnetstrings.org/ why not use it as the binary wire format ?
>
> It would be as binary as it gets and still be generally parse-able by
> lots of different platforms, at leas by all of these we care about.

If we do go down the binary encoding path in a future release, can I
please suggest *not* using something like tnetstrings, which suffers
the same problem that a few binary transport formats suffer,
particularly when they're developed by people whose native language
doesn't distinguish between byte arrays and strings - all strings are
considered byte arrays and it's up to an application to decide on
character encoding and which things are data vs strings in the
application.

This makes writing a parser in a language which does treat byte arrays
and strings differently very difficult, see e.g. the java tnetstrings
API [1] which is forced into treating strings as byte arrays until the
programmer then asks it to parse the thing again, but please treat
everything as a string this time. The msgpack people after much
wrangling have ended up issuing a new version of the protocol which
avoids this issue and which they are strongly encouraging users to
switch to, see [2] for the gory details.

While we may not ever store types in our jsonb format other than the
standard json data types (I can foresee people wanting to do it,
though), I would strongly recommend picking a format which at least is
clear that a value is a string (text, whatever), and preferably makes
it clear what the character encoding is. Or maybe it should just
follow whatever the client encoding is at the time - as long as that
is completely unambiguous to a client.

Cheers

Tom

[1] https://github.com/asinger/tnetstringsj
[2] https://github.com/msgpack/msgpack/issues/128


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 02:03:17
Message-ID: CAHyXU0zD5Wnii_bgM2NtDvc+vb7eRpzs4O3qmn1Bs6EPP5DbFw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Monday, February 10, 2014, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On 2014-02-10 19:01:48 -0600, Merlin Moncure wrote:
> > On Mon, Feb 10, 2014 at 6:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us<javascript:;>>
> wrote:
> > > Merlin Moncure <mmoncure(at)gmail(dot)com <javascript:;>> writes:
> > >> On Mon, Feb 10, 2014 at 6:24 PM, Andres Freund <
> andres(at)2ndquadrant(dot)com <javascript:;>> wrote:
> > >>> And if we add a new format version in 9.5 we need to make it
> discernible
> > >>> from the 9.4 format. Without space for a format indicator we'd have
> to
> > >>> resort to ugly tricks like defining the high bit in the first byte
> set
> > >>> indicates the new version. I don't see the improvement here.
> > >
> > >> Point being: a 9.5 binary format reading server could look for a magic
> > >> token in the beginning of the file which would indicate the presence
> > >> of a header. The server could then make intelligent decisions about
> > >> reading data inside the file which would be follow exactly the same
> > >> kinds of decisions binary format consuming client code would make.
> > >> Perhaps it would be a simple check on version, or something more
> > >> complex that would involve a negotiation. The 'format' indicator,
> > >> should version not be precise enough, needs to be in the header, not
> > >> passed with every instance of the data type, and certainly not for one
> > >> type in the absence of others.
> > >
> > > Basically, you want to move the goalposts to somewhere that's not only
> > > out of reach today, but probably a few counties away from the stadium.
> > > I don't see this happening at all frankly, because nobody has been
> > > interested enough to work on something like it up to now. And I
> > > definitely don't see it as appropriate to block improvement of jsonb
> > > until this happens.
> >
> > That's completely unfair. I'm arguing *not* to attach version
> > dependency expectations to the jsonb type, at all, not the other way
> > around. If you want to do that, fine, but do it *later* as in, 9.5,
> > or beyond. I just gave an example of how binary format changes could
> > be worked in later.
>
> Comeon. Your way requires building HEAPS of new and generic
> infrastructure in 9.5 and would only work for binary copy. The proposed
> way requires about two lines of code. Without the generic infrastructure
> we'd end up relying on some intracacies like the meaning of high bit in
> the first byte or such.
>
> Anyway, that's it on this subthread from me
>

Fair enough. I'll concede the point.

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 02:11:13
Message-ID: 20140211021113.GF15246@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Is it just me or is jsonapi.h not very well documented?

On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:
> +/*
> + * for jsonb we always want the de-escaped value - that's what's in token
> + */
> +static void
> +jsonb_in_scalar(void *state, char *token, JsonTokenType tokentype)
> +{
> + JsonbInState *_state = (JsonbInState *) state;
> + JsonbValue v;
> +
> + v.size = sizeof(JEntry);
> +
> + switch (tokentype)
> + {
> +
> + case JSON_TOKEN_STRING:
> + v.type = jbvString;
> + v.string.len = token ? checkStringLen(strlen(token)) : 0;
> + v.string.val = token ? pnstrdup(token, v.string.len) : NULL;
> + v.size += v.string.len;
> + break;
> + case JSON_TOKEN_NUMBER:
> + v.type = jbvNumeric;
> + v.numeric = DatumGetNumeric(DirectFunctionCall3(numeric_in, CStringGetDatum(token), 0, -1));
> +
> + v.size += VARSIZE_ANY(v.numeric) +sizeof(JEntry) /* alignment */ ;
missing space.

Why does + sizeof(JEntry) change anything about alignment? If it was
aligned before, adding a statically sized value doesn't give any new
guarantees about alignment?

> +/*
> + * jsonb type recv function
> + *
> + * the type is sent as text in binary mode, so this is almost the same
> + * as the input function.
> + */
> +Datum
> +jsonb_recv(PG_FUNCTION_ARGS)
> +{
> + StringInfo buf = (StringInfo) PG_GETARG_POINTER(0);
> + text *result = cstring_to_text_with_len(buf->data, buf->len);
> +
> + return deserialize_json_text(result);
> +}

This is a bit absurd, we're receiving a string in a StringInfo buffer,
just to copy it into text, and then in makeJsonLexContext() access the
raw chars again.

> +static void
> +putEscapedValue(StringInfo out, JsonbValue *v)
> +{
> + switch (v->type)
> + {
> + case jbvNull:
> + appendBinaryStringInfo(out, "null", 4);
> + break;
> + case jbvString:
> + escape_json(out, pnstrdup(v->string.val, v->string.len));
> + break;
> + case jbvBool:
> + if (v->boolean)
> + appendBinaryStringInfo(out, "true", 4);
> + else
> + appendBinaryStringInfo(out, "false", 5);
> + break;
> + case jbvNumeric:
> + appendStringInfoString(out, DatumGetCString(DirectFunctionCall1(numeric_out, PointerGetDatum(v->numeric))));
> + break;
> + default:
> + elog(ERROR, "unknown jsonb scalar type");
> + }
> +}

Hm, will the jbvNumeric always result in correct correct quoting?
datum_to_json() does extra hangups for that case, any reason we don't
need that here?

> +char *
> +JsonbToCString(StringInfo out, char *in, int estimated_len)
> +{
...
> + while (redo_switch || ((type = JsonbIteratorGet(&it, &v, false)) != 0))
> + {
> + redo_switch = false;

Not sure if I see the advantage over the goto here. A comment explaining
what the reason for the goto is wouldhave sufficed.

> + case WJB_KEY:
> + if (first == false)
> + appendBinaryStringInfo(out, ", ", 2);
> + first = true;
> +
> + putEscapedValue(out, &v);
> + appendBinaryStringInfo(out, ": ", 2);

putEscapedValue doesn't gurantee only strings are output, but
datum_to_json does extra hangups for that case.

> + type = JsonbIteratorGet(&it, &v, false);
> + if (type == WJB_VALUE)
> + {
> + first = false;
> + putEscapedValue(out, &v);
> + }
> + else
> + {
> + Assert(type == WJB_BEGIN_OBJECT || type == WJB_BEGIN_ARRAY);
> + /*
> + * We need to rerun current switch() due to put
> + * in current place object which we just got
> + * from iterator.
> + */

"due to put"?

> +/*
> + * jsonb type send function
> + *
> + * Just send jsonb as a string of text
> + */
> +Datum
> +jsonb_send(PG_FUNCTION_ARGS)
> +{
> + Jsonb *jb = PG_GETARG_JSONB(0);
> + StringInfoData buf;
> + char *out;
> +
> + out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), VARSIZE(jb));
> +
> + pq_begintypsend(&buf);
> + pq_sendtext(&buf, out, strlen(out));
> + PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
> +}

Why aren't you using using the stringbuf passing JsonbToCString
convention here to avoid the strlen()?

> +/*
> + * Compare two jbvString JsonbValue values, third argument
> + * 'arg', if it's not null, should be a pointer to bool
> + * value which will be set to true if strings are equal and
> + * untouched otherwise.
> + */
> +int
> +compareJsonbStringValue(const void *a, const void *b, void *arg)
> +{
> + const JsonbValue *va = a;
> + const JsonbValue *vb = b;
> + int res;
> +
> + Assert(va->type == jbvString);
> + Assert(vb->type == jbvString);
> +
> + if (va->string.len == vb->string.len)
> + {
> + res = memcmp(va->string.val, vb->string.val, va->string.len);
> + if (res == 0 && arg)
> + *(bool *) arg = true;

Should be NULL, not 0.

> +/*
> + * qsort helper to compare JsonbPair values, third argument
> + * arg will be trasferred as is to subsequent

*transferred.

> +/*
> + * some constant order of JsonbValue
> + */
> +int
> +compareJsonbValue(JsonbValue *a, JsonbValue *b)
> +{

Called recursively, needs to check for stack depth.

> +JsonbValue *
> +findUncompressedJsonbValueByValue(char *buffer, uint32 flags,
> + uint32 *lowbound, JsonbValue *key)
> +{

Functions like this *REALLY* need documentation for their
parameters. And of their actual purpose.

What's actually the uncompressed bit here? Isn't it actually the
contrary? This is navigating the compressed, non-tree form, no?

> + if (flags & JB_FLAG_ARRAY & header)
> + {
> + JEntry *array = (JEntry *) (buffer + sizeof(header));
> + char *data = (char *) (array + (header & JB_COUNT_MASK));
> + int i;

> + for (i = (lowbound) ? *lowbound : 0; i < (header & JB_COUNT_MASK); i++)
> + {
> + JEntry *e = array + i;

> + else if (JBE_ISSTRING(*e) && key->type == jbvString)
> + {
> + if (key->string.len == JBE_LEN(*e) &&
> + memcmp(key->string.val, data + JBE_OFF(*e),
> + key->string.len) == 0)
> + {

So, here we have our own undocumented! indexing system. Grand.

> + else if (flags & JB_FLAG_OBJECT & header)
> + {
> + JEntry *array = (JEntry *) (buffer + sizeof(header));
> + char *data = (char *) (array + (header & JB_COUNT_MASK) * 2);
> + uint32 stopLow = lowbound ? *lowbound : 0,
> + stopHigh = (header & JB_COUNT_MASK),
> + stopMiddle;

I don't understand what the point of the lowbound logic could be here?
If a key hasn't been found, it hasn't been found? Maybe the idea is to
use it when testing containedness or somesuch? Wouldn't iterating over
the keyspace be a better idea for that case?

> + if (key->type != jbvString)
> + return NULL;

That's not allowed, right?

> +/*
> + * Get i-th value of array or hash. if i < 0 then it counts from
> + * the end of array/hash. Note: returns pointer to statically
> + * allocated JsonbValue.
> + */
> +JsonbValue *
> +getJsonbValue(char *buffer, uint32 flags, int32 i)
> +{
> + uint32 header = *(uint32 *) buffer;
> + static JsonbValue r;

Really? And why on earth would static allocation be a good idea? Specify
it on the caller's stack if need be. Or even return by value, today's
calling convention will just allocate that on the caller's stack without
copying.
Accessing static data isn't even faster.

> + if (JBE_ISSTRING(*e))
> + {
> + r.type = jbvString;
> + r.string.val = data + JBE_OFF(*e);
> + r.string.len = JBE_LEN(*e);
> + r.size = sizeof(JEntry) + r.string.len;
> + }
> + else if (JBE_ISBOOL(*e))
> + {
> + r.type = jbvBool;
> + r.boolean = (JBE_ISBOOL_TRUE(*e)) ? true : false;
> + r.size = sizeof(JEntry);
> + }
> + else if (JBE_ISNUMERIC(*e))
> + {
> + r.type = jbvNumeric;
> + r.numeric = (Numeric) (data + INTALIGN(JBE_OFF(*e)));
> +
> + r.size = 2 * sizeof(JEntry) + VARSIZE_ANY(r.numeric);
> + }
> + else if (JBE_ISNULL(*e))
> + {
> + r.type = jbvNull;
> + r.size = sizeof(JEntry);
> + }
> + else
> + {
> + r.type = jbvBinary;
> + r.binary.data = data + INTALIGN(JBE_OFF(*e));
> + r.binary.len = JBE_LEN(*e) - (INTALIGN(JBE_OFF(*e)) - JBE_OFF(*e));
> + r.size = r.binary.len + 2 * sizeof(JEntry);
> + }

This bit of code exists pretty similarly in several places, maybe consolitate?

> +/****************************************************************************
> + * Walk on tree representation of jsonb *
> + ****************************************************************************/
> +static void
> +walkUncompressedJsonbDo(JsonbValue *v, walk_jsonb_cb cb, void *cb_arg, uint32 level)
> +{
> + int i;

check stack limit.

> +void
> +walkUncompressedJsonb(JsonbValue *v, walk_jsonb_cb cb, void *cb_arg)
> +{
> + if (v)
> + walkUncompressedJsonbDo(v, cb, cb_arg, 0);
> +}
> +
> +/****************************************************************************
> + * Iteration over binary jsonb *
> + ****************************************************************************/

This needs docs.

> +static void
> +parseBuffer(JsonbIterator *it, char *buffer)
> +{

Why invent completely independent naming conventions to the previous
functions here?

> +static bool
> +formAnswer(JsonbIterator **it, JsonbValue *v, JEntry * e, bool skipNested)
> +{

Imaginatively undescriptive name. But if it were slightly more more
abstracted away from JsonbIterator it could be the answer to my prayers
above about removing redundant code.

> +static JsonbIterator *
> +up(JsonbIterator *it)
> +{

Not a good name.

> +int
> +JsonbIteratorGet(JsonbIterator **it, JsonbValue *v, bool skipNested)
> +{
> + int res;

recursive, stack depth check.

> + switch ((*it)->type | (*it)->state)
> + {
> + case JB_FLAG_ARRAY | jbi_start:

I don't know, but I don't see the point in avoid if (), else if()
... constructs if it requires such dirty tricks.

> +/****************************************************************************
> + * Transformation from tree to binary representation of jsonb *
> + ****************************************************************************/
> +typedef struct CompressState
> +{
> + char *begin;
> + char *ptr;
> +
> + struct
> + {
> + uint32 i;
> + uint32 *header;
> + JEntry *array;
> + char *begin;
> + } *levelstate, *lptr, *pptr;
> +
> + uint32 maxlevel;
> +
> +} CompressState;
> +
> +#define curLevelState state->lptr
> +#define prevLevelState state->pptr

brrr.

I stopped looking at code at this point.

> diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
> index e1d8aae..50ddf50 100644
> --- a/src/backend/utils/adt/jsonfuncs.c
> +++ b/src/backend/utils/adt/jsonfuncs.c

there's lots of whitespace/tab damage in this file. Check git log/diff
--check or such.

This is still a mess, sorry:
* Large and important part continue to be undocumented. Especially in
jsonb_support.c
* Lots of naming inconsistencies.
* There's no documentation about what compressed/uncompressed jsonbs
are. The former is the ondisk representation, the latter the in-memory
tree representation.
* There's no non-code documentation about the on-disk format.

Unfortunately I can't see how this patch could get ready in time for
this CF. There's *lots* of work to be done. The code as is isn't going
to be maintainable. Much of it obvious by simply scanning through the
code, without even looking for higher level issues. And much of it has
previously been pointed out, without getting real attention.

That's not to speak of the nested hstore patch, which I didn't even
start to look at. That's twice this patches size.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Dunstan <pgsql(at)tomd(dot)cc>
Cc: Hannu Krosing <hannu(at)krosing(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 03:08:07
Message-ID: 52F99417.5080306@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/10/2014 08:50 PM, Tom Dunstan wrote:
> On 10 February 2014 20:11, Hannu Krosing <hannu(at)krosing(dot)net> wrote:
>> The fastest and lowest parsing cost format for "JSON" is tnetstrings
>> http://tnetstrings.org/ why not use it as the binary wire format ?
>>
>> It would be as binary as it gets and still be generally parse-able by
>> lots of different platforms, at leas by all of these we care about.
> If we do go down the binary encoding path in a future release, can I
> please suggest *not* using something like tnetstrings, which suffers
> the same problem that a few binary transport formats suffer,
> particularly when they're developed by people whose native language
> doesn't distinguish between byte arrays and strings - all strings are
> considered byte arrays and it's up to an application to decide on
> character encoding and which things are data vs strings in the
> application.
>
> This makes writing a parser in a language which does treat byte arrays
> and strings differently very difficult, see e.g. the java tnetstrings
> API [1] which is forced into treating strings as byte arrays until the
> programmer then asks it to parse the thing again, but please treat
> everything as a string this time. The msgpack people after much
> wrangling have ended up issuing a new version of the protocol which
> avoids this issue and which they are strongly encouraging users to
> switch to, see [2] for the gory details.
>
> While we may not ever store types in our jsonb format other than the
> standard json data types (I can foresee people wanting to do it,
> though), I would strongly recommend picking a format which at least is
> clear that a value is a string (text, whatever), and preferably makes
> it clear what the character encoding is. Or maybe it should just
> follow whatever the client encoding is at the time - as long as that
> is completely unambiguous to a client.
>

Its treatment of numbers is also broken from my POV (numbers are not
just integers or floats), so no, we're not going to use tnetstrings.
Plus, the whole idea of us moving to text for send/recv was to save
code, not to have to write new code, so to suggest using it now is to
ignore the discussion that went on before.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 03:15:21
Message-ID: 52F995C9.2040303@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/10/2014 09:11 PM, Andres Freund wrote:
>> diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
>> index e1d8aae..50ddf50 100644
>> --- a/src/backend/utils/adt/jsonfuncs.c
>> +++ b/src/backend/utils/adt/jsonfuncs.c
> there's lots of whitespace/tab damage in this file. Check git log/diff
> --check or such.

I don't know exactly what you're looking at. Here's what I get:

[andrew(at)emma pg_jsonb]$ git diff --check master
contrib/hstore/hstore--1.3.sql:465: trailing whitespace.
+ WITHOUT FUNCTION AS IMPLICIT;
contrib/hstore/hstore--1.3.sql:468: trailing whitespace.
+ WITHOUT FUNCTION AS IMPLICIT;
[andrew(at)emma pg_jsonb]$

I'll have a look at some of your other complaints when I get back home
in a two or three of days, weather permitting.

cheers

andrew


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 03:21:57
Message-ID: 20140211032157.GG15246@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-10 22:15:21 -0500, Andrew Dunstan wrote:
>
> On 02/10/2014 09:11 PM, Andres Freund wrote:
> >>diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
> >>index e1d8aae..50ddf50 100644
> >>--- a/src/backend/utils/adt/jsonfuncs.c
> >>+++ b/src/backend/utils/adt/jsonfuncs.c
> >there's lots of whitespace/tab damage in this file. Check git log/diff
> >--check or such.
>
>
> I don't know exactly what you're looking at. Here's what I get:

Sorry, forget that bit.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 09:35:51
Message-ID: 52F9EEF7.3000307@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/11/2014 01:16 AM, Merlin Moncure wrote:
> On Mon, Feb 10, 2014 at 5:52 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> It works in enough cases atm that it's worthwile trying to keep it
>> working. Sure, it could be better, but it's what we have right now. Atm
>> it's e.g. the only realistic way to copy larger amounts of bytea between
>> servers without copying the entire cluster.
> That's the thing -- it might work today, but what about tomorrow?
> We'd be sending the wrong signals. People start building processes
> around all of this and now we've painted ourselves into a box. Better
> in my mind to simply educate users that this practice is dangerous and
> unsupported, as we used to do. I guess until now. It seems completely
> odd to me that we're attaching a case to the jsonb type, in the wrong
> way -- something that we've never attached to any other type before.
> For example, why didn't we attach a version code to the json type send
> function?
JSON is supposed to be a *standard* way of encoding data in
strings. If the ever changes, it will not be JSON type anymore.

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-11 17:58:47
Message-ID: CAHyXU0xKV-8QwYju-=rJrXRmO8U3fM=4Fr3LMK+ynLCifD+FLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 11, 2014 at 3:35 AM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
> On 02/11/2014 01:16 AM, Merlin Moncure wrote:
>> On Mon, Feb 10, 2014 at 5:52 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>> It works in enough cases atm that it's worthwile trying to keep it
>>> working. Sure, it could be better, but it's what we have right now. Atm
>>> it's e.g. the only realistic way to copy larger amounts of bytea between
>>> servers without copying the entire cluster.
>> That's the thing -- it might work today, but what about tomorrow?
>> We'd be sending the wrong signals. People start building processes
>> around all of this and now we've painted ourselves into a box. Better
>> in my mind to simply educate users that this practice is dangerous and
>> unsupported, as we used to do. I guess until now. It seems completely
>> odd to me that we're attaching a case to the jsonb type, in the wrong
>> way -- something that we've never attached to any other type before.
>> For example, why didn't we attach a version code to the json type send
>> function?
> JSON is supposed to be a *standard* way of encoding data in
> strings. If the ever changes, it will not be JSON type anymore.

My point was that as we reserved the right to change jsonb binary
format we'd probably want to reserve the right to change json's as
well. This was in support of the theme of 'why is jsonb a special
case?'. However, I think it's pretty much settled that the any
potential concerns I raised in terms of providing a version flag are
outweighed by it's potential usefulness.

merlin


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-21 00:51:36
Message-ID: CAM3SWZR2Ov1Tsq3_oM9bE_dy4X2WiCFxgNbjQt_J4_G2EKeJ7Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 30, 2014 at 11:07 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Updated patches for both pieces. Included is some tidying done by Teodor,
> and fixes for remaining whitespace issues. This now passes "git diff --check
> master" cleanly for me.

So one thing that isn't clear from these patches is how jsonb will
have the benefit of the new hstore functions and operators. The cast
is not implicit. I believe that Teodor made the cast implicit on
Github on February 8th (which has not been formally submitted), but
that has problems of its own.

Does anyone have any ideas about how best to enable jsonb to take
advantage of the new functions and operators?

--
Peter Geoghegan


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 03:23:37
Message-ID: 530ABB39.20900@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor, Oleg:

Some bitrot on the nested-hstore patch on current HEAD, possibly due to
the recent update release?

josh(at)radegast:~/git/pg94$ patch -p1 -i nested-hstore-10.patch
patching file contrib/hstore/.gitignore
patching file contrib/hstore/Makefile
patching file contrib/hstore/crc32.c
patching file contrib/hstore/crc32.h
patching file contrib/hstore/expected/hstore.out
patching file contrib/hstore/expected/nested.out
patching file contrib/hstore/expected/types.out
patching file contrib/hstore/hstore--1.2--1.3.sql
patching file contrib/hstore/hstore--1.2.sql
patching file contrib/hstore/hstore--1.3.sql
patching file contrib/hstore/hstore.control
patching file contrib/hstore/hstore.h
Hunk #2 FAILED at 13.
Hunk #3 succeeded at 201 (offset 9 lines).
1 out of 3 hunks FAILED -- saving rejects to file
contrib/hstore/hstore.h.rej
patching file contrib/hstore/hstore_compat.c
patching file contrib/hstore/hstore_gin.c
patching file contrib/hstore/hstore_gist.c
patching file contrib/hstore/hstore_gram.y
patching file contrib/hstore/hstore_io.c
Hunk #1 FAILED at 2.
Hunk #2 succeeded at 23 (offset 1 line).
Hunk #3 succeeded at 53 (offset 1 line).
Hunk #4 FAILED at 63.
Hunk #5 succeeded at 297 (offset 13 lines).
Hunk #6 succeeded at 309 (offset 13 lines).
Hunk #7 succeeded at 348 (offset 13 lines).
Hunk #8 succeeded at 359 (offset 13 lines).
Hunk #9 succeeded at 394 with fuzz 2 (offset 20 lines).
Hunk #10 succeeded at 406 (offset 20 lines).
Hunk #11 succeeded at 462 (offset 20 lines).
Hunk #12 FAILED at 508.
Hunk #13 succeeded at 551 (offset 21 lines).
Hunk #14 succeeded at 561 (offset 21 lines).
Hunk #15 succeeded at 651 (offset 21 lines).
Hunk #16 succeeded at 696 (offset 21 lines).
Hunk #17 succeeded at 703 (offset 21 lines).
Hunk #18 succeeded at 767 (offset 21 lines).
Hunk #19 succeeded at 776 (offset 21 lines).
Hunk #20 succeeded at 791 (offset 21 lines).
Hunk #21 succeeded at 807 (offset 21 lines).
Hunk #22 succeeded at 820 (offset 21 lines).
Hunk #23 succeeded at 856 (offset 21 lines).
Hunk #24 FAILED at 1307.
Hunk #25 FAILED at 1433.
5 out of 25 hunks FAILED -- saving rejects to file
contrib/hstore/hstore_io.c.rej
patching file contrib/hstore/hstore_op.c
Hunk #1 FAILED at 25.
Hunk #2 succeeded at 202 (offset 14 lines).
Hunk #3 succeeded at 247 (offset 14 lines).
Hunk #4 FAILED at 253.
Hunk #5 succeeded at 756 (offset 15 lines).
Hunk #6 succeeded at 799 (offset 15 lines).
Hunk #7 succeeded at 885 (offset 15 lines).
Hunk #8 succeeded at 1416 (offset 15 lines).
Hunk #9 succeeded at 1605 (offset 15 lines).
Hunk #10 succeeded at 1720 (offset 15 lines).
2 out of 10 hunks FAILED -- saving rejects to file
contrib/hstore/hstore_op.c.rej

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 06:20:14
Message-ID: 530AE49E.6060500@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

All,

Here's a draft cleanup on the JSON section of the Datatype docs. Since
there's been a bunch of incremental patches on this, I just did a diff
against HEAD.

I looked over json-functions a bit, but am not clear on what needs to
change there; the docs are pretty similar to other sections of
Functions, and if they're complex it's because of the sheer number of
JSON-related functions.

Anyway, this version of datatypes introduces a comparison table, which I
think should make things a bit clearer for users.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Attachment Content-Type Size
datatype.sgml.jsonb-jmb1.diff text/x-patch 5.1 KB

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 13:34:00
Message-ID: 530B4A48.1000809@fuzzy.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 7.2.2014 00:47, Andrew Dunstan wrote:
>
> On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
>> Should I make new version of patch? Right now it's placed on github.
>> May be Andrew wants to change something?
>>
>
>
> Attached are updated patches.
>
> Apart from the things Teodor has fixed, this includes
>
> * switching to using text representation in jsonb send/recv
> * implementation of jsonb_array_elements_text that we need now we have
> json_array_elements_text
> * some code fixes requested in code reviews, plus some other tidying
> and refactoring.
>
> cheers

Hi,

I'm slightly uncertain if this is the current version of the patches, or
whether I should look at
https://github.com/feodor/postgres/tree/jsonb_and_hstore which contains
slightly modified code.

Anyway, the only thing I noticed in the v10 version so far is slight
difference in naming - while we have json_to_hstore/hstore_to_json, we
have jsonb2hstore/hstore2jsonb. I propose to change this to
jsonb_to_hstore/hstore_to_jsonb.

May not be needed if the implicit casts go through.

regards
Tomas


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 14:34:06
Message-ID: CAF4Au4wOojzqACWv9Pgiyo=m_uATBMTP8Yeu7Oga7+doqOwHzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Yes, the repository you mentioned is the last version of our
development. It contains various fixes of issues by Andres, but we are
waiting Andrew, who is working on jsonb stuff.

On Mon, Feb 24, 2014 at 5:34 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> On 7.2.2014 00:47, Andrew Dunstan wrote:
>>
>> On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
>>> Should I make new version of patch? Right now it's placed on github.
>>> May be Andrew wants to change something?
>>>
>>
>>
>> Attached are updated patches.
>>
>> Apart from the things Teodor has fixed, this includes
>>
>> * switching to using text representation in jsonb send/recv
>> * implementation of jsonb_array_elements_text that we need now we have
>> json_array_elements_text
>> * some code fixes requested in code reviews, plus some other tidying
>> and refactoring.
>>
>> cheers
>
> Hi,
>
> I'm slightly uncertain if this is the current version of the patches, or
> whether I should look at
> https://github.com/feodor/postgres/tree/jsonb_and_hstore which contains
> slightly modified code.
>
> Anyway, the only thing I noticed in the v10 version so far is slight
> difference in naming - while we have json_to_hstore/hstore_to_json, we
> have jsonb2hstore/hstore2jsonb. I propose to change this to
> jsonb_to_hstore/hstore_to_jsonb.
>
> May not be needed if the implicit casts go through.
>
> regards
> Tomas
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 14:46:06
Message-ID: CAHyXU0xYKbUJxevvz02ZXzzeJ5kGA8BZC-e3i3kgEH0gkUqL6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 24, 2014 at 12:20 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> All,
>
> Here's a draft cleanup on the JSON section of the Datatype docs. Since
> there's been a bunch of incremental patches on this, I just did a diff
> against HEAD.
>
> I looked over json-functions a bit, but am not clear on what needs to
> change there; the docs are pretty similar to other sections of
> Functions, and if they're complex it's because of the sheer number of
> JSON-related functions.
>
> Anyway, this version of datatypes introduces a comparison table, which I
> think should make things a bit clearer for users.

I still find the phrasing "as jsonb is more efficient for most
purposes" to be a bit off Basically, the text json type is faster for
serialization/deserialization pattern (not just document preservation)
and jsonb is preferred when storing json and doing repeated
subdocument accesses.

merlin


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 15:08:43
Message-ID: CAHyXU0wNf2Ke+yjC=HMkobYk2i-mUsE+a6f=ffksf4ZVMEi7gg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> I still find the phrasing "as jsonb is more efficient for most
> purposes" to be a bit off Basically, the text json type is faster for
> serialization/deserialization pattern (not just document preservation)
> and jsonb is preferred when storing json and doing repeated
>subdocument accesses.

Hm, I'm going to withdraw that. I had done some testing of simple
deserialization (cast to text and the like) and noted that jsonb was
as much as 5x slower. However, I just did some checking on
json[b]_populate_recordset though and it's pretty much a wash.

merlin


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 16:06:37
Message-ID: CAHyXU0zM2NxoVM2LMfWRWN_7aJBE64LHixDqGJeULSDSqPRDtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 24, 2014 at 9:08 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> I still find the phrasing "as jsonb is more efficient for most
>> purposes" to be a bit off Basically, the text json type is faster for
>> serialization/deserialization pattern (not just document preservation)
>> and jsonb is preferred when storing json and doing repeated
>>subdocument accesses.
>
> Hm, I'm going to withdraw that. I had done some testing of simple
> deserialization (cast to text and the like) and noted that jsonb was
> as much as 5x slower. However, I just did some checking on
> json[b]_populate_recordset though and it's pretty much a wash.

[sorry for noise on this].

Here's the use case coverage as I see it today:

CASE: json jsonb hstore
Static document: yes poor poor
Precise document: yes no no
Serialization: yes no no****
Deserialization: poor*** yes* no****
Repeated Access: poor yes yes
Manipulation: no no** yes
GIST/GIN searching: no no** yes

notes:
* jsonb gets 'yes' for deserialization assuming andrew's 'two level'
deserialization fix goes in (otherwise 'poor').
** jsonb can't do this today, but presumably will be able to soon
*** 'poor' unless json type also gets the deserialization fix, then 'yes'.
**** hstore can deserialize hstore format, but will rely on json/jsonb
for deserializing json

'Static document' represents edge cases where the json is opaque to
the database but performance -- for example large map polygons.
'Precise document' represents cases where whitespace or key order is important.

Peter asked upthread how to access the various features. Well, today,
it basically means a bit of nimble casting to different structures
depending on which particular features are important to you, which
IMNSHO is not bad at all as long as we understand that most people who
rely on jsonb will also need hstore for its searching and operators.
Down the line when hstore and jsonb are more flushed out it's going to
come down to an API style choice.

merlin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 17:31:16
Message-ID: 530B81E4.6080206@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/24/2014 07:08 AM, Merlin Moncure wrote:
> On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> I still find the phrasing "as jsonb is more efficient for most
>> purposes" to be a bit off Basically, the text json type is faster for
>> serialization/deserialization pattern (not just document preservation)
>> and jsonb is preferred when storing json and doing repeated
>> subdocument accesses.
>
> Hm, I'm going to withdraw that. I had done some testing of simple
> deserialization (cast to text and the like) and noted that jsonb was
> as much as 5x slower. However, I just did some checking on
> json[b]_populate_recordset though and it's pretty much a wash.

Aside from that, I want our docs to make a strong endorsement of using
jsonb over json for most users. jsonb will continue to be developed and
improved in the future; it is very unlikely that json will. Maybe
that's what I should say rather than anything about efficiency.

In other words: having an ambiguous, complex evaluation of json vs.
jsonb does NOT benefit most users. The result will be some users
choosing json and then pitching fit when they want jsonb in 9.5 and have
to rewrite all their tables.

Mind you, we'll need to fix the slow deserialization, though.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 19:15:51
Message-ID: 530B9A67.3050800@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/24/2014 11:06 AM, Merlin Moncure wrote:
> On Mon, Feb 24, 2014 at 9:08 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>>> I still find the phrasing "as jsonb is more efficient for most
>>> purposes" to be a bit off Basically, the text json type is faster for
>>> serialization/deserialization pattern (not just document preservation)
>>> and jsonb is preferred when storing json and doing repeated
>>> subdocument accesses.
>> Hm, I'm going to withdraw that. I had done some testing of simple
>> deserialization (cast to text and the like) and noted that jsonb was
>> as much as 5x slower. However, I just did some checking on
>> json[b]_populate_recordset though and it's pretty much a wash.
> [sorry for noise on this].
>
> Here's the use case coverage as I see it today:
>
> CASE: json jsonb hstore
> Static document: yes poor poor
> Precise document: yes no no
> Serialization: yes no no****
> Deserialization: poor*** yes* no****
> Repeated Access: poor yes yes
> Manipulation: no no** yes
> GIST/GIN searching: no no** yes
>
> notes:
> * jsonb gets 'yes' for deserialization assuming andrew's 'two level'
> deserialization fix goes in (otherwise 'poor').
> ** jsonb can't do this today, but presumably will be able to soon
> *** 'poor' unless json type also gets the deserialization fix, then 'yes'.
> **** hstore can deserialize hstore format, but will rely on json/jsonb
> for deserializing json
>
> 'Static document' represents edge cases where the json is opaque to
> the database but performance -- for example large map polygons.
> 'Precise document' represents cases where whitespace or key order is important.
>
> Peter asked upthread how to access the various features. Well, today,
> it basically means a bit of nimble casting to different structures
> depending on which particular features are important to you, which
> IMNSHO is not bad at all as long as we understand that most people who
> rely on jsonb will also need hstore for its searching and operators.
> Down the line when hstore and jsonb are more flushed out it's going to
> come down to an API style choice.
>

Frankly, a lot of the above doesn't make much sense to me. WTF is
"Manipulation'?

Unless I see much more actual info on the tests being conducted it's
just about impossible to comment. The performance assessment at this
stage is simply anecdotal as far as I'm concerned.

populate_record() is likely to be a *very* poor point of comparison
anyway, I would expect the performance numbers to be dominated by the
input function calls for the object's component types, and that's going
to be the same in both cases. If you want to prove something here you'll
need to supply profiling numbers showing where it spends its time in
each case.

Having had my schedule very seriously disrupted by the storm in the US
South East a week or so ago, I am finally getting back to being able to
devote some time to jsonb. I hope to have new patches available today or
tomorrow at the latest.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 21:02:54
Message-ID: 530BB37E.3090803@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/24/2014 02:15 PM, Andrew Dunstan wrote:
>
>
> Having had my schedule very seriously disrupted by the storm in the US
> South East a week or so ago, I am finally getting back to being able
> to devote some time to jsonb. I hope to have new patches available
> today or tomorrow at the latest.
>
>

Update to this: A recent commit caused an unfortunate merge conflict in
the hstore code that I have asked Teodor to resolve. I can't post new
clean patches until that's been done.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 14:46:34
Message-ID: CAHyXU0w11O5u=Y54_HjRrhY_GQbNX_k=Y9xLwoyf_q6-qOEsvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 24, 2014 at 1:15 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On 02/24/2014 11:06 AM, Merlin Moncure wrote:
>>
>> On Mon, Feb 24, 2014 at 9:08 AM, Merlin Moncure <mmoncure(at)gmail(dot)com>
>> wrote:
>>>
>>> On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure <mmoncure(at)gmail(dot)com>
>>> wrote:
>>>>
>>>> I still find the phrasing "as jsonb is more efficient for most
>>>> purposes" to be a bit off Basically, the text json type is faster for
>>>> serialization/deserialization pattern (not just document preservation)
>>>> and jsonb is preferred when storing json and doing repeated
>>>> subdocument accesses.
>>>
>>> Hm, I'm going to withdraw that. I had done some testing of simple
>>> deserialization (cast to text and the like) and noted that jsonb was
>>> as much as 5x slower. However, I just did some checking on
>>> json[b]_populate_recordset though and it's pretty much a wash.
>>
>> [sorry for noise on this].
>>
>> Here's the use case coverage as I see it today:
>>
>> CASE: json jsonb hstore
>> Static document: yes poor poor
>> Precise document: yes no no
>> Serialization: yes no no****
>> Deserialization: poor*** yes* no****
>> Repeated Access: poor yes yes
>> Manipulation: no no** yes
>> GIST/GIN searching: no no** yes
>>
>> notes:
>> * jsonb gets 'yes' for deserialization assuming andrew's 'two level'
>> deserialization fix goes in (otherwise 'poor').
>> ** jsonb can't do this today, but presumably will be able to soon
>> *** 'poor' unless json type also gets the deserialization fix, then 'yes'.
>> **** hstore can deserialize hstore format, but will rely on json/jsonb
>> for deserializing json
>>
>> 'Static document' represents edge cases where the json is opaque to
>> the database but performance -- for example large map polygons.
>> 'Precise document' represents cases where whitespace or key order is
>> important.
>>
>> Peter asked upthread how to access the various features. Well, today,
>> it basically means a bit of nimble casting to different structures
>> depending on which particular features are important to you, which
>> IMNSHO is not bad at all as long as we understand that most people who
>> rely on jsonb will also need hstore for its searching and operators.
>> Down the line when hstore and jsonb are more flushed out it's going to
>> come down to an API style choice.
>
> Frankly, a lot of the above doesn't make much sense to me. WTF is
> "Manipulation'?
>
> Unless I see much more actual info on the tests being conducted it's just
> about impossible to comment. The performance assessment at this stage is
> simply anecdotal as far as I'm concerned.

Er, I wasn't making performance assessments (except in cases where it
was obvious like poor support for arbitrary access with json) , but
API coverage of use cases. "Manipulation" I thought obvious: the
ability to manipulate the document (say, change some value to
something else): the nosql pattern. through the API. Neither json or
jsonb can do that at present...only hstore can. jsonb cant't; it only
covers some of what json type currently covers (but some of the thing
it does cover is much faster).

On Mon, Feb 24, 2014 at 11:31 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> Hm, I'm going to withdraw that. I had done some testing of simple
>> deserialization (cast to text and the like) and noted that jsonb was
>> as much as 5x slower. However, I just did some checking on
>> json[b]_populate_recordset though and it's pretty much a wash.
>
> Aside from that, I want our docs to make a strong endorsement of using
> jsonb over json for most users. jsonb will continue to be developed and
> improved in the future; it is very unlikely that json will. Maybe
> that's what I should say rather than anything about efficiency.

I would hope that endorsement doesn't extend to misinforming users.
Moreover, json type is handling all serialization at present and will
continue to do so for some years. In fact, in this release we got a
bunch of new very necessary enhancements (json_build) to
serialization! You're trying to deprecate and enhance the type at the
same time!

The disconnect here is that your statements would be correct if the
only usage for the json type would be for storing data in json.
However, people (including myself) are doing lots of wonderful things
storing data in the traditional way and moving into and out of json in
queries and that, besides working better in the json type, is only
possible in json. That might change in the future by figuring out a
way to cover json serialization cases through jsonb but that's not how
things work today, end of story.

Look, I definitely feel the frustration and weariness here in terms of
my critiquing the proposed API along with the other arguments I've
made. Please understand that nobody wants this to go out the door
more than me if the objective is to lock in the API 'as is' then let's
be polite to our users and try to document various use cases and
what's good at what.

merlin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 16:13:16
Message-ID: CA+Tgmob1SrWWo1miyJeOyKgPxLJ4cWKiSt6GWT3RweDEu=Yw3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 24, 2014 at 12:31 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Aside from that, I want our docs to make a strong endorsement of using
> jsonb over json for most users. jsonb will continue to be developed and
> improved in the future; it is very unlikely that json will. Maybe
> that's what I should say rather than anything about efficiency.
>
> In other words: having an ambiguous, complex evaluation of json vs.
> jsonb does NOT benefit most users. The result will be some users
> choosing json and then pitching fit when they want jsonb in 9.5 and have
> to rewrite all their tables.
>
> Mind you, we'll need to fix the slow deserialization, though.

I think you've got your head stuck deeply in the sand. The json data
type works exactly like the xml data type has always worked. There
have been occasional noises about making an xmlb data type, but
nobody's minded enough to do anything about it, or at least not in
this forum. So if the json data type has no future and is crap, then
the same presumably holds of the xml data type. But I don't think
anyone here believes that, unless they just hate xml on general
principle, which I can certainly understand.

You really *can't* fix the fact that jsonb takes longer to
(deserialize than json. I mean, it's possible the code can be
optimized. But since json is stored in the exact format in which it
is to be emitted, the output function is basically just memcpy().
You're never going to get that kind of speed out of code that actually
has to do something, and I suspect you're going to find that it's hard
to come close.

In short, I think you're viewing everything about jsonb with
rose-colored glasses on, and that your enthusiasm is mostly wishful
thinking. Will there be good things about jsonb? Of course. Will
lots of people want to use it for those reasons? Very likely. Will
it be better than json in all ways and for all purposes? No, and
implying the contrary is just plain wrong.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 17:38:50
Message-ID: 530CD52A.4000209@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/25/2014 08:13 AM, Robert Haas wrote:
> I think you've got your head stuck deeply in the sand. The json data
> type works exactly like the xml data type has always worked. There
> have been occasional noises about making an xmlb data type, but
> nobody's minded enough to do anything about it, or at least not in
> this forum. So if the json data type has no future and is crap, then
> the same presumably holds of the xml data type. But I don't think
> anyone here believes that, unless they just hate xml on general
> principle, which I can certainly understand.

Well, if we had an XMLB, I would in fact be making the same argument.
I'll point out the only reason we're keeping the original json instead
of forcing an upgrade to jsonb, per earlier discussions, is
backwards-compatibility. If we had never had a json-text, and Merlin
was proposing adding one now alongside jsonb, I'd be arguing against
doing so.

> In short, I think you're viewing everything about jsonb with
> rose-colored glasses on, and that your enthusiasm is mostly wishful
> thinking. Will there be good things about jsonb? Of course. Will
> lots of people want to use it for those reasons? Very likely. Will
> it be better than json in all ways and for all purposes? No, and
> implying the contrary is just plain wrong.

It hurts our adoption substantially to confuse developers. We need to
recommend one type over the other, hence "Use jsonb unless you need X".
Merlin is pushing the type of multivariable comparison where *I*
wouldn't be able to make sense of which one I should pick, let alone
some web developer who's just trying to get a site built. That sort of
thing *really* doesn't help our users.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 17:45:10
Message-ID: 20140225174510.GC1507@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 09:38:50AM -0800, Josh Berkus wrote:
> > In short, I think you're viewing everything about jsonb with
> > rose-colored glasses on, and that your enthusiasm is mostly wishful
> > thinking. Will there be good things about jsonb? Of course. Will
> > lots of people want to use it for those reasons? Very likely. Will
> > it be better than json in all ways and for all purposes? No, and
> > implying the contrary is just plain wrong.
>
> It hurts our adoption substantially to confuse developers. We need to
> recommend one type over the other, hence "Use jsonb unless you need X".
> Merlin is pushing the type of multivariable comparison where *I*
> wouldn't be able to make sense of which one I should pick, let alone
> some web developer who's just trying to get a site built. That sort of
> thing *really* doesn't help our users.

I agree it would be nice to have something simple, like "Use JSON if you
wish to just store/retrieve entire JSON structures, and JSONB if you
wish to do any kind of lookup or manipulation of JSON values on the
server".

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 18:31:38
Message-ID: CA+TgmobUp2f64n9nove+PgFoN3pYqj-XVKzg-G7Gn81ULtLvUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 12:38 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/25/2014 08:13 AM, Robert Haas wrote:
>> I think you've got your head stuck deeply in the sand. The json data
>> type works exactly like the xml data type has always worked. There
>> have been occasional noises about making an xmlb data type, but
>> nobody's minded enough to do anything about it, or at least not in
>> this forum. So if the json data type has no future and is crap, then
>> the same presumably holds of the xml data type. But I don't think
>> anyone here believes that, unless they just hate xml on general
>> principle, which I can certainly understand.
>
> Well, if we had an XMLB, I would in fact be making the same argument.
> I'll point out the only reason we're keeping the original json instead
> of forcing an upgrade to jsonb, per earlier discussions, is
> backwards-compatibility. If we had never had a json-text, and Merlin
> was proposing adding one now alongside jsonb, I'd be arguing against
> doing so.

You can argue that all you like. But the same argument was made and
rejected at the time we (I) added the original json type. So I don't
believe that you can claim that your argument is backed by any sort of
consensus, because AFAICS it isn't.

>> In short, I think you're viewing everything about jsonb with
>> rose-colored glasses on, and that your enthusiasm is mostly wishful
>> thinking. Will there be good things about jsonb? Of course. Will
>> lots of people want to use it for those reasons? Very likely. Will
>> it be better than json in all ways and for all purposes? No, and
>> implying the contrary is just plain wrong.
>
> It hurts our adoption substantially to confuse developers. We need to
> recommend one type over the other, hence "Use jsonb unless you need X".
> Merlin is pushing the type of multivariable comparison where *I*
> wouldn't be able to make sense of which one I should pick, let alone
> some web developer who's just trying to get a site built. That sort of
> thing *really* doesn't help our users.

I don't have any objection to editing what Merlin wrote to be clear
and concise; I don't think he meant for it to be considered for
inclusion in the documentation in exactly that form anyway. I do have
an objection to including your unjustified partisanship in our
documentation as fact.

The reality is that if you have a bunch of JSON documents indexed by
some ID number and expect to usually retrieve the whole document, you
probably don't want jsonb. You probably want one integer column and
one json column, because it's gonna be faster that way. And if you
expect to usually retrieve only part of the document, then you are
probably better off using separate columns for the separate parts of
the document, because I bet that extracting a portion of a large
document is still going to require de-TOASTing the whole thing, or at
least all the data preceding the last byte offset of interest, which
is full of lose. The situation where jsonb is going to win is where
either (1) you or your client are so stuck in the document database
model that you can't fathom the idea that using a real database schema
might improve performance or (2) you have so many different things
(pseudocolumns, as it were) that you might want to extract from any
given JSON blob that it's impractical to use real columns for all of
those. I agree those are both real use cases. I do not agree that
they are the only or most common use cases. And I definitely don't
agree that our documentation should push people towards stuffing
everything in a JSON blob instead of using real column definitions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 18:45:20
Message-ID: 530CE4C0.7070200@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/25/2014 10:31 AM, Robert Haas wrote:
> And I definitely don't
> agree that our documentation should push people towards stuffing
> everything in a JSON blob instead of using real column definitions.

????

Where did you get this out of my doc patch?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 18:50:25
Message-ID: CA+TgmoaiP8Xj-cdtTLhRj20twbcU+fF2091SB0Jo8nzNvdAe2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 1:45 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/25/2014 10:31 AM, Robert Haas wrote:
>> And I definitely don't
>> agree that our documentation should push people towards stuffing
>> everything in a JSON blob instead of using real column definitions.
>
> ????
>
> Where did you get this out of my doc patch?

Way to quote what I said out of context.

But to make a long story short, I get that from the fact that you want
to railroad everyone into using jsonb.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 18:54:28
Message-ID: 530CE6E4.7030102@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/25/2014 10:50 AM, Robert Haas wrote:
> On Tue, Feb 25, 2014 at 1:45 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> On 02/25/2014 10:31 AM, Robert Haas wrote:
>>> And I definitely don't
>>> agree that our documentation should push people towards stuffing
>>> everything in a JSON blob instead of using real column definitions.
>>
>> ????
>>
>> Where did you get this out of my doc patch?
>
> Way to quote what I said out of context.

Way to put words in my mouth.

> But to make a long story short, I get that from the fact that you want
> to railroad everyone into using jsonb.

That's called a "straw man argument", Robert.

Me: We should recommend that people use jsonb unless they have a
specific reason for using json.

Merlin: We should present them side-by-side with a complex comparison.

Robert: Josh wants to junk all relational data and use only jsonb!

I mean, really, WTF?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 19:07:57
Message-ID: 530CEA0D.6090603@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/25/2014 09:45 AM, Bruce Momjian wrote:
>> It hurts our adoption substantially to confuse developers. We need to
>> recommend one type over the other, hence "Use jsonb unless you need X".
>> Merlin is pushing the type of multivariable comparison where *I*
>> wouldn't be able to make sense of which one I should pick, let alone
>> some web developer who's just trying to get a site built. That sort of
>> thing *really* doesn't help our users.
>
> I agree it would be nice to have something simple, like "Use JSON if you
> wish to just store/retrieve entire JSON structures, and JSONB if you
> wish to do any kind of lookup or manipulation of JSON values on the
> server".

(to clarify below: "json" refers to the current varlena datatype; JSON
refers to JSON serialized data).

I don't think that's decisive enough, which is why I wrote the doc the
way I did. The problem is that most users would prefer that we tell
them which one to use, which is why I want to structure the doc as "Use
jsonb unless you need one of these things", or more specifically:

In general, most applications will find it advantageous to store
JSON data
as <type>jsonb</type>, as jsonb is more efficient when using JSON
manipulation functions, and will
support future advanced json index, operator and search features. The
<type>json</type> will primarily be useful for applications which
need to
preserve exact formatting of the input JSON, or users with existing
<type>json</type> columns which they do not want to convert to
<type>jsonb</type>.

Part of my reason for wanting to recommend jsonb over json is in the
context of the third storage option for JSON, namely TEXT. The only
things which distinguish json from TEXT for JSON storage are validation
and a set of json manipulation functions. jsonb works with the
manipulation functions better/faster, causing the old json type to start
looking like more of a DOMAIN over TEXT than a real type comparatively.
In other words, if you ask the question "Why would I want to use json
instead of either jsonb or TEXT", the answer becomes quite narrow.

Possibly I should expand the little chart and add a column for TEXT?
It's a viable option for storing JSON data, especially if you store a
lot of broken JSON or fragments.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 19:12:00
Message-ID: 530CEB00.8070104@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/25/2014 10:54 AM, Josh Berkus wrote:
> On 02/25/2014 10:50 AM, Robert Haas wrote:
>> On Tue, Feb 25, 2014 at 1:45 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> On 02/25/2014 10:31 AM, Robert Haas wrote:
>>>> And I definitely don't
>>>> agree that our documentation should push people towards stuffing
>>>> everything in a JSON blob instead of using real column definitions.
>>>
>>> ????
>>>
>>> Where did you get this out of my doc patch?
>>
>> Way to quote what I said out of context.
>
> Way to put words in my mouth.
>
>> But to make a long story short, I get that from the fact that you want
>> to railroad everyone into using jsonb.
>
> That's called a "straw man argument", Robert.
>
> Me: We should recommend that people use jsonb unless they have a
> specific reason for using json.
>
> Merlin: We should present them side-by-side with a complex comparison.

From the cheap seats.

To me the whole hstore/json/jsonb family is a WIP and any enlightenment
in the form of comparisons would be greatly appreciated by me and other
end users I would suspect.

>
> Robert: Josh wants to junk all relational data and use only jsonb!
>
> I mean, really, WTF?

Seems to be a hot topic all the way around. I am neck deep in learning
Web development and am coming to grips with the role of JSON in that world.

>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 20:12:55
Message-ID: CA+Tgmoa+EqcR1NwgkfMBP7bsMC3TYM6HOOnJYZYFxhBAyUM_8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 1:54 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/25/2014 10:50 AM, Robert Haas wrote:
>> On Tue, Feb 25, 2014 at 1:45 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> On 02/25/2014 10:31 AM, Robert Haas wrote:
>>>> And I definitely don't
>>>> agree that our documentation should push people towards stuffing
>>>> everything in a JSON blob instead of using real column definitions.
>>>
>>> ????
>>>
>>> Where did you get this out of my doc patch?
>>
>> Way to quote what I said out of context.
>
> Way to put words in my mouth.
>
>> But to make a long story short, I get that from the fact that you want
>> to railroad everyone into using jsonb.
>
> That's called a "straw man argument", Robert.
>
> Me: We should recommend that people use jsonb unless they have a
> specific reason for using json.
>
> Merlin: We should present them side-by-side with a complex comparison.
>
> Robert: Josh wants to junk all relational data and use only jsonb!
>
> I mean, really, WTF?

OK, since what I said seems to have become distorted somewhere along
the line, allow me to rephrase:

I don't agree that jsonb should be preferred in all but a handful of
situations. Nor do I agree that partisanship belongs in our
documentation. Therefore, -1 for your proposal to recommend that, and
+1 for Merlin's proposal to present a comparison which fairly
illustrates the situations in which each will outperform the other.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 21:08:22
Message-ID: 530D0646.8020407@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/24/2014 04:02 PM, Andrew Dunstan wrote:
>
> On 02/24/2014 02:15 PM, Andrew Dunstan wrote:
>>
>>
>> Having had my schedule very seriously disrupted by the storm in the
>> US South East a week or so ago, I am finally getting back to being
>> able to devote some time to jsonb. I hope to have new patches
>> available today or tomorrow at the latest.
>>
>>
>
>
> Update to this: A recent commit caused an unfortunate merge conflict
> in the hstore code that I have asked Teodor to resolve. I can't post
> new clean patches until that's been done.
>
>

OK, here we go, with bitrot fixed (thanks Teodor), Teodor's latest
changes, and versioning for jsonb binary input/output.

This reflects what is currently on the jsonb_and_hstore branch of
<https://github.com/feodor/postgres.git>

cheers

andrew

Attachment Content-Type Size
nested-hstore-11.patch.gz application/x-gzip 66.8 KB
jsonb-11.patch.gz application/x-gzip 31.6 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 21:27:40
Message-ID: 20140225212740.GA4759@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus escribió:

> (to clarify below: "json" refers to the current varlena datatype; JSON
> refers to JSON serialized data).

FWIW the term "varlena json" is misleading. jsonb is also varlena, only
different. I think you need a different term to say that json uses the
text representation.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 21:57:48
Message-ID: 530D11DC.9070303@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/25/2014 08:54 PM, Josh Berkus wrote:
> That's called a "straw man argument", Robert.
> Me: We should recommend that people use jsonb unless they have a
> specific reason for using json.
We could also make the opposite argument - people use json unless they
have a specific reason for using jsonb.

btw, there is one more thing about JSON which I recently learned - a lot of
JavaScript people actually expect the JSON binary form to retain field order

It is not in any specs, but nevertheless all major imlementations do it and
some code depends on it.
IIRC, this behaviour is currently also met only by json and not by jsonb.

> Merlin: We should present them side-by-side with a complex comparison.
> Robert: Josh wants to junk all relational data and use only jsonb! I
> mean, really, WTF?

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 22:03:32
Message-ID: 530D1334.2080303@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/25/2014 12:12 PM, Robert Haas wrote:
> I don't agree that jsonb should be preferred in all but a handful of
> situations. Nor do I agree that partisanship belongs in our
> documentation. Therefore, -1 for your proposal to recommend that, and
> +1 for Merlin's proposal to present a comparison which fairly
> illustrates the situations in which each will outperform the other.

Awaiting doc patch from Merlin, then. It will need to be clear enough
that an ordinary user can distinguish which type they want.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-25 22:21:57
Message-ID: CAHyXU0z25MJU+6uagW+LPnjLN-6Fr7YfjMa-dGcA5YfHstGLBg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 4:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/25/2014 12:12 PM, Robert Haas wrote:
>> I don't agree that jsonb should be preferred in all but a handful of
>> situations. Nor do I agree that partisanship belongs in our
>> documentation. Therefore, -1 for your proposal to recommend that, and
>> +1 for Merlin's proposal to present a comparison which fairly
>> illustrates the situations in which each will outperform the other.
>
> Awaiting doc patch from Merlin, then. It will need to be clear enough
> that an ordinary user can distinguish which type they want.

Sure.

merlin


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 04:07:45
Message-ID: 530D6891.9010206@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/26/2014 06:21 AM, Merlin Moncure wrote:
> On Tue, Feb 25, 2014 at 4:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> On 02/25/2014 12:12 PM, Robert Haas wrote:
>>> I don't agree that jsonb should be preferred in all but a handful of
>>> situations. Nor do I agree that partisanship belongs in our
>>> documentation. Therefore, -1 for your proposal to recommend that, and
>>> +1 for Merlin's proposal to present a comparison which fairly
>>> illustrates the situations in which each will outperform the other.
>>
>> Awaiting doc patch from Merlin, then. It will need to be clear enough
>> that an ordinary user can distinguish which type they want.
>
> Sure.

Please also highlight that any change will require a full table rewrite
with an exclusive lock, so data type choices on larger tables may be
hard to change later.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 04:33:38
Message-ID: CAM3SWZRfj9Kut+VT=FC37VNACOG3x5DtsUh0jH5cGAM7p=bn-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 8:07 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> Please also highlight that any change will require a full table rewrite
> with an exclusive lock, so data type choices on larger tables may be
> hard to change later.

It sure looks like they're binary-coercible to me:

+ CREATE CAST (hstore AS jsonb)
+ WITHOUT FUNCTION AS IMPLICIT;
+
+ CREATE CAST (jsonb AS hstore)
+ WITHOUT FUNCTION AS IMPLICIT;

Is this okay?
--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 04:50:42
Message-ID: 20140226045041.GL2921@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> On Tue, Feb 25, 2014 at 8:07 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> > Please also highlight that any change will require a full table rewrite
> > with an exclusive lock, so data type choices on larger tables may be
> > hard to change later.
>
> It sure looks like they're binary-coercible to me:
>
> + CREATE CAST (hstore AS jsonb)
> + WITHOUT FUNCTION AS IMPLICIT;
> +
> + CREATE CAST (jsonb AS hstore)
> + WITHOUT FUNCTION AS IMPLICIT;
>
> Is this okay?

Err, I'm not following this thread all *that* closely, but I was pretty
sure the issue was json vs. jsonb, and I'd be mighty confused as to wtf
was going on if those were binary-coercible...

Thanks,

Stephen


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 07:17:16
Message-ID: 529BF3C7-8568-45F6-BD0E-158D4B430B78@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 25, 2014, at 1:57 PM, Hannu Krosing <hannu(at)2ndQuadrant(dot)com> wrote:

> It is not in any specs, but nevertheless all major imlementations do it and
> some code depends on it.

I have no doubt that some code depends on it, but "all major implementations" is too strong a statement. BSON, in particular, does not have stable field order.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 14:54:46
Message-ID: CAHyXU0zyOpUh_nbB36gHzN=uzg+RwzeSJ3hwkKCfJFCHmkNSGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 10:07 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 02/26/2014 06:21 AM, Merlin Moncure wrote:
>> On Tue, Feb 25, 2014 at 4:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> On 02/25/2014 12:12 PM, Robert Haas wrote:
>>>> I don't agree that jsonb should be preferred in all but a handful of
>>>> situations. Nor do I agree that partisanship belongs in our
>>>> documentation. Therefore, -1 for your proposal to recommend that, and
>>>> +1 for Merlin's proposal to present a comparison which fairly
>>>> illustrates the situations in which each will outperform the other.
>>>
>>> Awaiting doc patch from Merlin, then. It will need to be clear enough
>>> that an ordinary user can distinguish which type they want.
>>
>> Sure.
>
> Please also highlight that any change will require a full table rewrite
> with an exclusive lock, so data type choices on larger tables may be
> hard to change later.

Yeah. Good idea. Also gonna make a table of what happens when you
cast from A to B (via text, json, jsonb, hstore).

merlin


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 15:02:46
Message-ID: CAHyXU0xsrN_Zaf=4taqr2DQDsOGSp54Ab0Dd0NJvX6fqjQCZbg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Feb 25, 2014 at 3:57 PM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
> On 02/25/2014 08:54 PM, Josh Berkus wrote:
>> That's called a "straw man argument", Robert.
>> Me: We should recommend that people use jsonb unless they have a
>> specific reason for using json.
> We could also make the opposite argument - people use json unless they
> have a specific reason for using jsonb.
>
> btw, there is one more thing about JSON which I recently learned - a lot of
> JavaScript people actually expect the JSON binary form to retain field order
>
> It is not in any specs, but nevertheless all major imlementations do it and
> some code depends on it.
> IIRC, this behaviour is currently also met only by json and not by jsonb.

Yes: This was the agreement that was struck and is the main reason why
there are two json types, not one. JSON does not guarantee field
ordering as I read the spec and for the binary form ordering is not
maintained as a concession to using the hstore implementation.

You can always use the standard text json type for storage and cast
into the index for searching; what you give up there is some
performance and the ability to manipulate the json over the hstore
API. I think that will have to do for now and field ordering for
hstore/jsonb can be reserved as a research item.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 15:39:12
Message-ID: 530E0AA0.7010803@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/26/2014 02:17 AM, Christophe Pettus wrote:
> On Feb 25, 2014, at 1:57 PM, Hannu Krosing <hannu(at)2ndQuadrant(dot)com> wrote:
>
>> It is not in any specs, but nevertheless all major imlementations do it and
>> some code depends on it.
> I have no doubt that some code depends on it, but "all major implementations" is too strong a statement. BSON, in particular, does not have stable field order.
>

Not only is it "not in any specs", it's counter to the spec I have been
following <https://www.ietf.org/rfc/rfc4627.txt>, which quite
categorically states that an object is an UNORDERED collection. Any
application which relies on the ordering of object fields being
preserved is broken IMNSHO, and I would not feel the least guilt about
exposing their breakage.

cheers

andrew


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 17:41:13
Message-ID: 530E2739.1040806@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/26/2014 07:02 AM, Merlin Moncure wrote:
> On Tue, Feb 25, 2014 at 3:57 PM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
>> It is not in any specs, but nevertheless all major imlementations do it and
>> some code depends on it.
>> IIRC, this behaviour is currently also met only by json and not by jsonb.
>
> Yes: This was the agreement that was struck and is the main reason why
> there are two json types, not one. JSON does not guarantee field
> ordering as I read the spec and for the binary form ordering is not
> maintained as a concession to using the hstore implementation.

Actually, that's not true; neither Mongo/BSON nor CouchDB preserve field
ordering. So users who are familiar with JSONish data *storage* should
be aware that field ordering is not preserved.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 17:42:20
Message-ID: 530E277C.6090302@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/25/2014 08:07 PM, Craig Ringer wrote:
> On 02/26/2014 06:21 AM, Merlin Moncure wrote:
>> On Tue, Feb 25, 2014 at 4:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> On 02/25/2014 12:12 PM, Robert Haas wrote:
>>>> I don't agree that jsonb should be preferred in all but a handful of
>>>> situations. Nor do I agree that partisanship belongs in our
>>>> documentation. Therefore, -1 for your proposal to recommend that, and
>>>> +1 for Merlin's proposal to present a comparison which fairly
>>>> illustrates the situations in which each will outperform the other.
>>>
>>> Awaiting doc patch from Merlin, then. It will need to be clear enough
>>> that an ordinary user can distinguish which type they want.
>>
>> Sure.
>
> Please also highlight that any change will require a full table rewrite
> with an exclusive lock, so data type choices on larger tables may be
> hard to change later.

Oh, point. I'll add that text if Merlin doesn't.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 17:57:38
Message-ID: CAHyXU0yEKOGdJ2kb0=GYA=Puw9RwDJ4_6B9CbuvyaBwxRfyC_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 11:41 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/26/2014 07:02 AM, Merlin Moncure wrote:
>> On Tue, Feb 25, 2014 at 3:57 PM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
>>> It is not in any specs, but nevertheless all major imlementations do it and
>>> some code depends on it.
>>> IIRC, this behaviour is currently also met only by json and not by jsonb.
>>
>> Yes: This was the agreement that was struck and is the main reason why
>> there are two json types, not one. JSON does not guarantee field
>> ordering as I read the spec and for the binary form ordering is not
>> maintained as a concession to using the hstore implementation.
>
> Actually, that's not true; neither Mongo/BSON nor CouchDB preserve field
> ordering. So users who are familiar with JSONish data *storage* should
> be aware that field ordering is not preserved.

right (although I'm not sure what wasn't true there). I think the
status quo is fine; If you have to have the document precisely
preserved for whatever reason you can do that -- you just have to be
prepared to give up some things. As noted in the other thread
serialization is more interesting but that also works fine. The
breakdown in terms of usage between json/jsonb to me is very clear
(json will handle serialization/deserializaton heavy patterns and a
few edge cases for storage). The split between json and jsonb in
hindsight made a lot of sense.

What is not going to be so clear for users (particularly without good
supporting documentation) is how things break down in terms of usage
between hstore and jsonb.

merlin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 18:05:14
Message-ID: 530E2CDA.5020509@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/26/2014 09:57 AM, Merlin Moncure wrote:
> On Wed, Feb 26, 2014 at 11:41 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> On 02/26/2014 07:02 AM, Merlin Moncure wrote:
>>> On Tue, Feb 25, 2014 at 3:57 PM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
>>>> It is not in any specs, but nevertheless all major imlementations do it and
>>>> some code depends on it.
>>>> IIRC, this behaviour is currently also met only by json and not by jsonb.
>>>
>>> Yes: This was the agreement that was struck and is the main reason why
>>> there are two json types, not one. JSON does not guarantee field
>>> ordering as I read the spec and for the binary form ordering is not
>>> maintained as a concession to using the hstore implementation.
>>
>> Actually, that's not true; neither Mongo/BSON nor CouchDB preserve field
>> ordering. So users who are familiar with JSONish data *storage* should
>> be aware that field ordering is not preserved.
>
> right (although I'm not sure what wasn't true there). I think the

Sorry, I was referring to Hannu's statement that "all major
implementations preserve order", which simply isn't true.

> status quo is fine; If you have to have the document precisely
> preserved for whatever reason you can do that -- you just have to be
> prepared to give up some things. As noted in the other thread
> serialization is more interesting but that also works fine. The
> breakdown in terms of usage between json/jsonb to me is very clear
> (json will handle serialization/deserializaton heavy patterns and a
> few edge cases for storage). The split between json and jsonb in
> hindsight made a lot of sense.
>
> What is not going to be so clear for users (particularly without good
> supporting documentation) is how things break down in terms of usage
> between hstore and jsonb.

Realistically? Once we get done with mapping the indexes and operators,
users who are used to Hstore1 use Hstore2, and everyone else uses jsonb.
jsonb is nothing other than a standardized syntax interface to hstore2,
and most users will choose the syntax similar to what they already know
over learning new stuff.

A real, full comparison chart would include text, json, jsonb and
hstore, I guess. Although I'm wondering if that's way too complex for
the main docs. Seems like more of a wiki item.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 18:08:32
Message-ID: 530E2DA0.3020108@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/26/2014 07:41 PM, Josh Berkus wrote:
> On 02/26/2014 07:02 AM, Merlin Moncure wrote:
>> On Tue, Feb 25, 2014 at 3:57 PM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
>>> It is not in any specs, but nevertheless all major imlementations do it and
>>> some code depends on it.
>>> IIRC, this behaviour is currently also met only by json and not by jsonb.
>> Yes: This was the agreement that was struck and is the main reason why
>> there are two json types, not one. JSON does not guarantee field
>> ordering as I read the spec and for the binary form ordering is not
>> maintained as a concession to using the hstore implementation.
> Actually, that's not true; neither Mongo/BSON nor CouchDB preserve field
> ordering.
That is strange at least for BSON, as it does not have any nearly as
sophisticated
internal format as hstore - no hash tables or anything, just a binary
serialisation.
It would take an extra effort to *not* keep the order there :)

http://bsonspec.org/#/specification

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 19:39:41
Message-ID: CAHyXU0wgUg6UsAy9tGRDSU+xRJx7xzxgoAad4u7QF-x4853B9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 12:05 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/26/2014 09:57 AM, Merlin Moncure wrote:
>> What is not going to be so clear for users (particularly without good
>> supporting documentation) is how things break down in terms of usage
>> between hstore and jsonb.
>
> Realistically? Once we get done with mapping the indexes and operators,
> users who are used to Hstore1 use Hstore2, and everyone else uses jsonb.
> jsonb is nothing other than a standardized syntax interface to hstore2,
> and most users will choose the syntax similar to what they already know
> over learning new stuff.

The problem is that as of today, they are not done and AFAICT will not
be for 9.4. Developers wanting to utilize the nosql pattern are going
to have to lean heavily on hstore API and that's a simple
fact...people reading about all the great new feature of postgres are
going to want to learn how to do things and it's reasonable to want to
anticipate the things they want to do and explain how to use them. I
would like to extend that case coverage to include the json type as
well as its documentation is pretty lousy for that (I should know: I
wrote most of it).

merlin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 20:45:06
Message-ID: 530E5252.9010003@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/26/2014 11:39 AM, Merlin Moncure wrote:
> On Wed, Feb 26, 2014 at 12:05 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> On 02/26/2014 09:57 AM, Merlin Moncure wrote:
>>> What is not going to be so clear for users (particularly without good
>>> supporting documentation) is how things break down in terms of usage
>>> between hstore and jsonb.
>>
>> Realistically? Once we get done with mapping the indexes and operators,
>> users who are used to Hstore1 use Hstore2, and everyone else uses jsonb.
>> jsonb is nothing other than a standardized syntax interface to hstore2,
>> and most users will choose the syntax similar to what they already know
>> over learning new stuff.
>
> The problem is that as of today, they are not done and AFAICT will not
> be for 9.4.

Well, we plan to push to have the indexes and operators available as an
extension by the time that 9.4 comes out.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 21:23:12
Message-ID: 530E5B40.6040606@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Apologies for top-post.

I have made some fixes based in these comments. A new patch for the
jsonb portion is attached.

Responses interspersed below.

Teodor, many of these comments are basically for you. Please respond.

cheers

andrew

On 02/10/2014 09:11 PM, Andres Freund wrote:
> Hi,
>
> Is it just me or is jsonapi.h not very well documented?

What about it do you think is missing? In any case, it's hardly relevant
to this patch, so I'll take that as obiter dicta.

>
> On 2014-02-06 18:47:31 -0500, Andrew Dunstan wrote:
>> +/*
>> + * for jsonb we always want the de-escaped value - that's what's in token
>> + */
>> +static void
>> +jsonb_in_scalar(void *state, char *token, JsonTokenType tokentype)
>> +{
>> + JsonbInState *_state = (JsonbInState *) state;
>> + JsonbValue v;
>> +
>> + v.size = sizeof(JEntry);
>> +
>> + switch (tokentype)
>> + {
>> +
>> + case JSON_TOKEN_STRING:
>> + v.type = jbvString;
>> + v.string.len = token ? checkStringLen(strlen(token)) : 0;
>> + v.string.val = token ? pnstrdup(token, v.string.len) : NULL;
>> + v.size += v.string.len;
>> + break;
>> + case JSON_TOKEN_NUMBER:
>> + v.type = jbvNumeric;
>> + v.numeric = DatumGetNumeric(DirectFunctionCall3(numeric_in, CStringGetDatum(token), 0, -1));
>> +
>> + v.size += VARSIZE_ANY(v.numeric) +sizeof(JEntry) /* alignment */ ;
> missing space.

Fixed.

>
> Why does + sizeof(JEntry) change anything about alignment? If it was
> aligned before, adding a statically sized value doesn't give any new
> guarantees about alignment?

Teodor, please comment.

>
>> +/*
>> + * jsonb type recv function
>> + *
>> + * the type is sent as text in binary mode, so this is almost the same
>> + * as the input function.
>> + */
>> +Datum
>> +jsonb_recv(PG_FUNCTION_ARGS)
>> +{
>> + StringInfo buf = (StringInfo) PG_GETARG_POINTER(0);
>> + text *result = cstring_to_text_with_len(buf->data, buf->len);
>> +
>> + return deserialize_json_text(result);
>> +}
> This is a bit absurd, we're receiving a string in a StringInfo buffer,
> just to copy it into text, and then in makeJsonLexContext() access the
> raw chars again.

I have fixed this so that we don't construct a text object just so we
can json parse a cstring.

>> +static void
>> +putEscapedValue(StringInfo out, JsonbValue *v)
>> +{
>> + switch (v->type)
>> + {
>> + case jbvNull:
>> + appendBinaryStringInfo(out, "null", 4);
>> + break;
>> + case jbvString:
>> + escape_json(out, pnstrdup(v->string.val, v->string.len));
>> + break;
>> + case jbvBool:
>> + if (v->boolean)
>> + appendBinaryStringInfo(out, "true", 4);
>> + else
>> + appendBinaryStringInfo(out, "false", 5);
>> + break;
>> + case jbvNumeric:
>> + appendStringInfoString(out, DatumGetCString(DirectFunctionCall1(numeric_out, PointerGetDatum(v->numeric))));
>> + break;
>> + default:
>> + elog(ERROR, "unknown jsonb scalar type");
>> + }
>> +}
> Hm, will the jbvNumeric always result in correct correct quoting?
> datum_to_json() does extra hangups for that case, any reason we don't
> need that here?

Yes, there is a reason we don't need it here. datum_to_json is
converting SQL numerics to json, and these might be strings such as
'Nan'. But we never store something in a jsonb numeric field unless it
came in as a json numeric format, which never needs quoting. The json
parser will never parse 'NaN' as a numeric value.

>> +char *
>> +JsonbToCString(StringInfo out, char *in, int estimated_len)
>> +{
> ...
>> + while (redo_switch || ((type = JsonbIteratorGet(&it, &v, false)) != 0))
>> + {
>> + redo_switch = false;
> Not sure if I see the advantage over the goto here. A comment explaining
> what the reason for the goto is wouldhave sufficed.

I think you're being pretty damn picky here. You whined about the goto,
I removed it, now you don't like that either. Personally I think this is
cleaner.

>
>> + case WJB_KEY:
>> + if (first == false)
>> + appendBinaryStringInfo(out, ", ", 2);
>> + first = true;
>> +
>> + putEscapedValue(out, &v);
>> + appendBinaryStringInfo(out, ": ", 2);
> putEscapedValue doesn't gurantee only strings are output, but
> datum_to_json does extra hangups for that case.

But the key here will always be a string. It's enforced by the JSON
rules. I suppose we could call escape_json directly here and save a
function call, but I don't agree that there is any problem here.

>
>> + type = JsonbIteratorGet(&it, &v, false);
>> + if (type == WJB_VALUE)
>> + {
>> + first = false;
>> + putEscapedValue(out, &v);
>> + }
>> + else
>> + {
>> + Assert(type == WJB_BEGIN_OBJECT || type == WJB_BEGIN_ARRAY);
>> + /*
>> + * We need to rerun current switch() due to put
>> + * in current place object which we just got
>> + * from iterator.
>> + */
> "due to put"?

I think that's due to the author not being a native English speaker.
I've tried to improve it a bit.

Teodor, please comment if you like.

>
>> +/*
>> + * jsonb type send function
>> + *
>> + * Just send jsonb as a string of text
>> + */
>> +Datum
>> +jsonb_send(PG_FUNCTION_ARGS)
>> +{
>> + Jsonb *jb = PG_GETARG_JSONB(0);
>> + StringInfoData buf;
>> + char *out;
>> +
>> + out = JsonbToCString(NULL, (JB_ISEMPTY(jb)) ? NULL : VARDATA(jb), VARSIZE(jb));
>> +
>> + pq_begintypsend(&buf);
>> + pq_sendtext(&buf, out, strlen(out));
>> + PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
>> +}
> Why aren't you using using the stringbuf passing JsonbToCString
> convention here to avoid the strlen()?

Fixed.

>
>> +/*
>> + * Compare two jbvString JsonbValue values, third argument
>> + * 'arg', if it's not null, should be a pointer to bool
>> + * value which will be set to true if strings are equal and
>> + * untouched otherwise.
>> + */
>> +int
>> +compareJsonbStringValue(const void *a, const void *b, void *arg)
>> +{
>> + const JsonbValue *va = a;
>> + const JsonbValue *vb = b;
>> + int res;
>> +
>> + Assert(va->type == jbvString);
>> + Assert(vb->type == jbvString);
>> +
>> + if (va->string.len == vb->string.len)
>> + {
>> + res = memcmp(va->string.val, vb->string.val, va->string.len);
>> + if (res == 0 && arg)
>> + *(bool *) arg = true;
> Should be NULL, not 0.

No, the compiler doesn't like that for int values.

>
>> +/*
>> + * qsort helper to compare JsonbPair values, third argument
>> + * arg will be trasferred as is to subsequent
> *transferred.
>

fixed.

>> +/*
>> + * some constant order of JsonbValue
>> + */
>> +int
>> +compareJsonbValue(JsonbValue *a, JsonbValue *b)
>> +{
> Called recursively, needs to check for stack depth.

fixed.

Teodor, please examine and comment on all comments below this point.

>
>> +JsonbValue *
>> +findUncompressedJsonbValueByValue(char *buffer, uint32 flags,
>> + uint32 *lowbound, JsonbValue *key)
>> +{
> Functions like this *REALLY* need documentation for their
> parameters. And of their actual purpose.
>
> What's actually the uncompressed bit here? Isn't it actually the
> contrary? This is navigating the compressed, non-tree form, no?
>
>> + if (flags & JB_FLAG_ARRAY & header)
>> + {
>> + JEntry *array = (JEntry *) (buffer + sizeof(header));
>> + char *data = (char *) (array + (header & JB_COUNT_MASK));
>> + int i;
>> + for (i = (lowbound) ? *lowbound : 0; i < (header & JB_COUNT_MASK); i++)
>> + {
>> + JEntry *e = array + i;
>> + else if (JBE_ISSTRING(*e) && key->type == jbvString)
>> + {
>> + if (key->string.len == JBE_LEN(*e) &&
>> + memcmp(key->string.val, data + JBE_OFF(*e),
>> + key->string.len) == 0)
>> + {
> So, here we have our own undocumented! indexing system. Grand.
>
>> + else if (flags & JB_FLAG_OBJECT & header)
>> + {
>> + JEntry *array = (JEntry *) (buffer + sizeof(header));
>> + char *data = (char *) (array + (header & JB_COUNT_MASK) * 2);
>> + uint32 stopLow = lowbound ? *lowbound : 0,
>> + stopHigh = (header & JB_COUNT_MASK),
>> + stopMiddle;
> I don't understand what the point of the lowbound logic could be here?
> If a key hasn't been found, it hasn't been found? Maybe the idea is to
> use it when testing containedness or somesuch? Wouldn't iterating over
> the keyspace be a better idea for that case?
>
>> + if (key->type != jbvString)
>> + return NULL;
> That's not allowed, right?
>
>> +/*
>> + * Get i-th value of array or hash. if i < 0 then it counts from
>> + * the end of array/hash. Note: returns pointer to statically
>> + * allocated JsonbValue.
>> + */
>> +JsonbValue *
>> +getJsonbValue(char *buffer, uint32 flags, int32 i)
>> +{
>> + uint32 header = *(uint32 *) buffer;
>> + static JsonbValue r;
> Really? And why on earth would static allocation be a good idea? Specify
> it on the caller's stack if need be. Or even return by value, today's
> calling convention will just allocate that on the caller's stack without
> copying.
> Accessing static data isn't even faster.
>
>> + if (JBE_ISSTRING(*e))
>> + {
>> + r.type = jbvString;
>> + r.string.val = data + JBE_OFF(*e);
>> + r.string.len = JBE_LEN(*e);
>> + r.size = sizeof(JEntry) + r.string.len;
>> + }
>> + else if (JBE_ISBOOL(*e))
>> + {
>> + r.type = jbvBool;
>> + r.boolean = (JBE_ISBOOL_TRUE(*e)) ? true : false;
>> + r.size = sizeof(JEntry);
>> + }
>> + else if (JBE_ISNUMERIC(*e))
>> + {
>> + r.type = jbvNumeric;
>> + r.numeric = (Numeric) (data + INTALIGN(JBE_OFF(*e)));
>> +
>> + r.size = 2 * sizeof(JEntry) + VARSIZE_ANY(r.numeric);
>> + }
>> + else if (JBE_ISNULL(*e))
>> + {
>> + r.type = jbvNull;
>> + r.size = sizeof(JEntry);
>> + }
>> + else
>> + {
>> + r.type = jbvBinary;
>> + r.binary.data = data + INTALIGN(JBE_OFF(*e));
>> + r.binary.len = JBE_LEN(*e) - (INTALIGN(JBE_OFF(*e)) - JBE_OFF(*e));
>> + r.size = r.binary.len + 2 * sizeof(JEntry);
>> + }
> This bit of code exists pretty similarly in several places, maybe consolitate?
>
>> +/****************************************************************************
>> + * Walk on tree representation of jsonb *
>> + ****************************************************************************/
>> +static void
>> +walkUncompressedJsonbDo(JsonbValue *v, walk_jsonb_cb cb, void *cb_arg, uint32 level)
>> +{
>> + int i;
> check stack limit.
>
>> +void
>> +walkUncompressedJsonb(JsonbValue *v, walk_jsonb_cb cb, void *cb_arg)
>> +{
>> + if (v)
>> + walkUncompressedJsonbDo(v, cb, cb_arg, 0);
>> +}
>> +
>> +/****************************************************************************
>> + * Iteration over binary jsonb *
>> + ****************************************************************************/
> This needs docs.
>
>> +static void
>> +parseBuffer(JsonbIterator *it, char *buffer)
>> +{
> Why invent completely independent naming conventions to the previous
> functions here?
>
>> +static bool
>> +formAnswer(JsonbIterator **it, JsonbValue *v, JEntry * e, bool skipNested)
>> +{
> Imaginatively undescriptive name. But if it were slightly more more
> abstracted away from JsonbIterator it could be the answer to my prayers
> above about removing redundant code.
>
>> +static JsonbIterator *
>> +up(JsonbIterator *it)
>> +{
> Not a good name.
>
>> +int
>> +JsonbIteratorGet(JsonbIterator **it, JsonbValue *v, bool skipNested)
>> +{
>> + int res;
> recursive, stack depth check.
>
>> + switch ((*it)->type | (*it)->state)
>> + {
>> + case JB_FLAG_ARRAY | jbi_start:
> I don't know, but I don't see the point in avoid if (), else if()
> ... constructs if it requires such dirty tricks.
>
>
>> +/****************************************************************************
>> + * Transformation from tree to binary representation of jsonb *
>> + ****************************************************************************/
>> +typedef struct CompressState
>> +{
>> + char *begin;
>> + char *ptr;
>> +
>> + struct
>> + {
>> + uint32 i;
>> + uint32 *header;
>> + JEntry *array;
>> + char *begin;
>> + } *levelstate, *lptr, *pptr;
>> +
>> + uint32 maxlevel;
>> +
>> +} CompressState;
>> +
>> +#define curLevelState state->lptr
>> +#define prevLevelState state->pptr
> brrr.
>
> I stopped looking at code at this point.
>
>> diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
>> index e1d8aae..50ddf50 100644
>> --- a/src/backend/utils/adt/jsonfuncs.c
>> +++ b/src/backend/utils/adt/jsonfuncs.c
> there's lots of whitespace/tab damage in this file. Check git log/diff
> --check or such.
>
> This is still a mess, sorry:
> * Large and important part continue to be undocumented. Especially in
> jsonb_support.c
> * Lots of naming inconsistencies.
> * There's no documentation about what compressed/uncompressed jsonbs
> are. The former is the ondisk representation, the latter the in-memory
> tree representation.
> * There's no non-code documentation about the on-disk format.
>
> Unfortunately I can't see how this patch could get ready in time for
> this CF. There's *lots* of work to be done. The code as is isn't going
> to be maintainable. Much of it obvious by simply scanning through the
> code, without even looking for higher level issues. And much of it has
> previously been pointed out, without getting real attention.
>
> That's not to speak of the nested hstore patch, which I didn't even
> start to look at. That's twice this patches size.
>
> Greetings,
>
> Andres Freund
>

Attachment Content-Type Size
jsonb-12.patch.gz application/x-gzip 32.4 KB

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 21:59:24
Message-ID: CAM3SWZSk4Bmg3GDF9PDhtj1ZcY8=fdLtquFuF7r0JafEKXF6ww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 1:23 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> + if (va->string.len == vb->string.len)
>>> + {
>>> + res = memcmp(va->string.val, vb->string.val,
>>> va->string.len);
>>> + if (res == 0 && arg)
>>> + *(bool *) arg = true;
>>
>> Should be NULL, not 0.
>
>
> No, the compiler doesn't like that for int values.

I'm confused. I just pulled from feodor/jsonb_and_hstore, and I do see
a compiler warning (because the code reads "res == NULL", unlike
above). It appears to have been that way in Git since last year. So,
maybe Andres meant that it *should* look like this?

--
Peter Geoghegan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 22:10:42
Message-ID: 530E6662.8060101@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/26/2014 04:59 PM, Peter Geoghegan wrote:
> On Wed, Feb 26, 2014 at 1:23 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>>> + if (va->string.len == vb->string.len)
>>>> + {
>>>> + res = memcmp(va->string.val, vb->string.val,
>>>> va->string.len);
>>>> + if (res == 0 && arg)
>>>> + *(bool *) arg = true;
>>> Should be NULL, not 0.
>>
>> No, the compiler doesn't like that for int values.
> I'm confused. I just pulled from feodor/jsonb_and_hstore, and I do see
> a compiler warning (because the code reads "res == NULL", unlike
> above). It appears to have been that way in Git since last year. So,
> maybe Andres meant that it *should* look like this?
>
>

argh!

I forgot to save a file.

Here's what I get if it's NULL:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -I../../../../src/include
-D_GNU_SOURCE -I/usr/include/libxml2 -c -o jsonb_support.o
jsonb_support.c -MMD -MP -MF .deps/jsonb_support.Po
jsonb_support.c: In function ‘compareJsonbStringValue’:
jsonb_support.c:137:11: warning: comparison between pointer and
integer [enabled by default]

With 0 there is no complaint.

new patch attached, change pushed to github.

cheers

andrew

Attachment Content-Type Size
jsonb-13.patch.gz application/x-gzip 32.4 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 22:45:41
Message-ID: 20140226224541.GF6718@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-26 16:23:12 -0500, Andrew Dunstan wrote:
> On 02/10/2014 09:11 PM, Andres Freund wrote:

> >Is it just me or is jsonapi.h not very well documented?
>
>
> What about it do you think is missing? In any case, it's hardly relevant to
> this patch, so I'll take that as obiter dicta.

It's relevant insofer because I tried to understand it, to understand
whether this patch's usage is sensible.

O n a quick reread of the header, what I am missing is:
* what's semstate in JsonSemAction? Private data?
* what's object_start and object_field_start? Presumably object vs
keypair? Why not use element as ifor the array?
* scalar_action is called for which types of tokens?
* what's exactly the meaning of the isnull parameter for ofield_action
and aelem_action?
* How is one supposed to actually access data in the callbacks, not
obvious for all the callbacks.
* are scalar callbacks triggered for object keys, object/array values?
...

> >>+static void
> >>+putEscapedValue(StringInfo out, JsonbValue *v)
> >>+{
> >>+ switch (v->type)
> >>+ {
> >>+ case jbvNull:
> >>+ appendBinaryStringInfo(out, "null", 4);
> >>+ break;
> >>+ case jbvString:
> >>+ escape_json(out, pnstrdup(v->string.val, v->string.len));
> >>+ break;
> >>+ case jbvBool:
> >>+ if (v->boolean)
> >>+ appendBinaryStringInfo(out, "true", 4);
> >>+ else
> >>+ appendBinaryStringInfo(out, "false", 5);
> >>+ break;
> >>+ case jbvNumeric:
> >>+ appendStringInfoString(out, DatumGetCString(DirectFunctionCall1(numeric_out, PointerGetDatum(v->numeric))));
> >>+ break;
> >>+ default:
> >>+ elog(ERROR, "unknown jsonb scalar type");
> >>+ }
> >>+}
> >Hm, will the jbvNumeric always result in correct correct quoting?
> >datum_to_json() does extra hangups for that case, any reason we don't
> >need that here?

> Yes, there is a reason we don't need it here. datum_to_json is converting
> SQL numerics to json, and these might be strings such as 'Nan'. But we never
> store something in a jsonb numeric field unless it came in as a json numeric
> format, which never needs quoting. The json parser will never parse 'NaN' as
> a numeric value.

Ah, yuck. Makes sense. Not your fault at all, but I do dislike json's
definition of numeric values.

> >>+char *
> >>+JsonbToCString(StringInfo out, char *in, int estimated_len)
> >>+{
> >...
> >>+ while (redo_switch || ((type = JsonbIteratorGet(&it, &v, false)) != 0))
> >>+ {
> >>+ redo_switch = false;
> >Not sure if I see the advantage over the goto here. A comment explaining
> >what the reason for the goto is wouldhave sufficed.
>
> I think you're being pretty damn picky here. You whined about the goto, I
> removed it, now you don't like that either. Personally I think this is
> cleaner.

Sorry, should perhaps have been a bit more precise in my disagreement
abou the goto version. I didn't dislike the goto itself, but that it was
a undocumented and unobvious change in control flow.

It's the reviewers job to be picky, I pretty damn sure don't expect you
to agree with all my points and I am perfectly fine if you disregard
several of them. I've just read through the patch quickly, so it's not
surprising if I misidentify some.

> >
> >>+ case WJB_KEY:
> >>+ if (first == false)
> >>+ appendBinaryStringInfo(out, ", ", 2);
> >>+ first = true;
> >>+
> >>+ putEscapedValue(out, &v);
> >>+ appendBinaryStringInfo(out, ": ", 2);
> >putEscapedValue doesn't gurantee only strings are output, but
> >datum_to_json does extra hangups for that case.
>
> But the key here will always be a string. It's enforced by the JSON rules. I
> suppose we could call escape_json directly here and save a function call,
> but I don't agree that there is any problem here.

Ah, yes, it will already have been converted to a string during the
initial conversion, right. /* json rules guarantee this is a string */
or something?

> >
> >>+ type = JsonbIteratorGet(&it, &v, false);
> >>+ if (type == WJB_VALUE)
> >>+ {
> >>+ first = false;
> >>+ putEscapedValue(out, &v);
> >>+ }
> >>+ else
> >>+ {
> >>+ Assert(type == WJB_BEGIN_OBJECT || type == WJB_BEGIN_ARRAY);
> >>+ /*
> >>+ * We need to rerun current switch() due to put
> >>+ * in current place object which we just got
> >>+ * from iterator.
> >>+ */
> >"due to put"?
>
>
> I think that's due to the author not being a native English speaker. I've
> tried to improve it a bit.

Oh, I perfectly understand that problem, believe me... I make many of
those myself, and I often don't see them in my own patches without them
being pointed out...
> >>+ if (va->string.len == vb->string.len)
> >>+ {
> >>+ res = memcmp(va->string.val, vb->string.val, va->string.len);
> >>+ if (res == 0 && arg)
> >>+ *(bool *) arg = true;
> >Should be NULL, not 0.
>
> No, the compiler doesn't like that for int values.

Yes, please disregard, I misread. I think I wanted actually to say that
the test for arg should be arg != NULL, because we don't usually do
pointer truth tests (which I personally find odd, but well).

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: "Erik Rijkers" <er(at)xs4all(dot)nl>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Peter Geoghegan" <pg(at)heroku(dot)com>, "Andres Freund" <andres(at)2ndquadrant(dot)com>, "Teodor Sigaev" <teodor(at)sigaev(dot)ru>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-26 22:48:26
Message-ID: 1f18bd66a5e0971070753a7b295d8aee.squirrel@webmail.xs4all.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, February 26, 2014 23:10, Andrew Dunstan wrote:
>
> new patch attached, change pushed to github.
>
> [jsonb-13.patch.gz]
>

This does not apply, see attached: src/backend/utils/adt/jsonfuncs.c.rej

Please ignore if this was not supposed to work together with the earlier nested-hstore-11.patch

github branch jsonb_and_hstore built fine; I'll use that instead.
( https://github.com/feodor/postgres.git )

thanks,

Erik Rijkers

patch -b -l -F 25 -p 1 < /home/aardvark/download/pgpatches/0094/nested_hstore/20140225/jsonb-13.patch
patching file doc/src/sgml/datatype.sgml
patching file doc/src/sgml/func.sgml
patching file src/backend/catalog/system_views.sql
Hunk #1 succeeded at 822 (offset 3 lines).
patching file src/backend/utils/adt/Makefile
patching file src/backend/utils/adt/json.c
patching file src/backend/utils/adt/jsonb.c
patching file src/backend/utils/adt/jsonb_support.c
patching file src/backend/utils/adt/jsonfuncs.c
Hunk #20 succeeded at 1419 with fuzz 2 (offset 4 lines).
Hunk #21 succeeded at 1691 (offset 4 lines).
Hunk #22 succeeded at 1843 (offset 4 lines).
Hunk #23 succeeded at 1940 (offset 4 lines).
Hunk #24 succeeded at 2013 (offset 4 lines).
Hunk #25 succeeded at 2035 (offset 4 lines).
Hunk #26 succeeded at 2054 (offset 4 lines).
Hunk #27 FAILED at 2090.
Hunk #28 succeeded at 2129 (offset 10 lines).
Hunk #29 succeeded at 2200 (offset 10 lines).
Hunk #30 succeeded at 2211 (offset 10 lines).
Hunk #31 succeeded at 2236 (offset 10 lines).
Hunk #32 succeeded at 2252 (offset 10 lines).
Hunk #33 succeeded at 2263 (offset 10 lines).
Hunk #34 succeeded at 2461 (offset 10 lines).
Hunk #35 succeeded at 2606 (offset 10 lines).
Hunk #36 succeeded at 2619 (offset 10 lines).
Hunk #37 FAILED at 2644.
Hunk #38 succeeded at 2692 (offset 14 lines).
Hunk #39 succeeded at 2702 (offset 14 lines).
Hunk #40 succeeded at 2730 (offset 14 lines).
2 out of 40 hunks FAILED -- saving rejects to file src/backend/utils/adt/jsonfuncs.c.rej
patching file src/include/catalog/pg_cast.h
patching file src/include/catalog/pg_operator.h
patching file src/include/catalog/pg_proc.h
patching file src/include/catalog/pg_type.h
patching file src/include/utils/json.h
patching file src/include/utils/jsonapi.h
patching file src/include/utils/jsonb.h
patching file src/test/regress/expected/jsonb.out
patching file src/test/regress/expected/jsonb_1.out
patching file src/test/regress/parallel_schedule
patching file src/test/regress/serial_schedule
patching file src/test/regress/sql/jsonb.sql

Attachment Content-Type Size
jsonfuncs.c.rej application/x-reject 2.0 KB

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Erik Rijkers <er(at)xs4all(dot)nl>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 00:12:05
Message-ID: 530E82D5.8070004@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/26/2014 05:48 PM, Erik Rijkers wrote:
> On Wed, February 26, 2014 23:10, Andrew Dunstan wrote:
>> new patch attached, change pushed to github.
>>
>> [jsonb-13.patch.gz]
>>
> This does not apply, see attached: src/backend/utils/adt/jsonfuncs.c.rej
>
> Please ignore if this was not supposed to work together with the earlier nested-hstore-11.patch
>
> github branch jsonb_and_hstore built fine; I'll use that instead.
> ( https://github.com/feodor/postgres.git )
>
>
>

Ugh, my master repo was 24 hours behind.

New patch attached..

Attachment Content-Type Size
jsonb-14.patch.gz application/x-gzip 32.5 KB

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 00:17:24
Message-ID: CAM3SWZT8sOiC_qWgLR0pRGDqWVgwjLAi59qnve-HD6fNvgbHjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 2:10 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> new patch attached, change pushed to github.

> + /* GUC variables */
> + static bool pretty_print_var = false;
> + #define SET_PRETTY_PRINT_VAR(x) ((pretty_print_var) ? \
> + ((x) | PrettyPrint) : (x))

I think that this is not a great idea. I think that we should do away
with the GUC, but keep the function hstore_print() so we can pretty
print that way. I don't believe that this falls afoul of the usual
obvious reasons for not varying the behavior of IO routines with a
GUC, since it only varies whitespace, but it is surely pretty
questionable to have this GUC's setting vary the output of hstore_out,
an IMMUTABLE function.

--
Peter Geoghegan


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 01:01:54
Message-ID: 20140227010154.GK4759@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund wrote:
> On 2014-02-26 16:23:12 -0500, Andrew Dunstan wrote:

> > >>+ if (va->string.len == vb->string.len)
> > >>+ {
> > >>+ res = memcmp(va->string.val, vb->string.val, va->string.len);
> > >>+ if (res == 0 && arg)
> > >>+ *(bool *) arg = true;
> > >Should be NULL, not 0.
> >
> > No, the compiler doesn't like that for int values.
>
> Yes, please disregard, I misread. I think I wanted actually to say that
> the test for arg should be arg != NULL, because we don't usually do
> pointer truth tests (which I personally find odd, but well).

Pointer validity tests seem to be mostly a matter of personal
preference. I know I sometimes use just "if (foo)" and other times "if
(foo != NULL)". Both idioms are used inconsistently all over the place.
We even have a PointerIsValid() macro.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 01:06:55
Message-ID: 20140227010655.GL4759@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Geoghegan wrote:
> On Wed, Feb 26, 2014 at 2:10 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> > new patch attached, change pushed to github.
>
> > + /* GUC variables */
> > + static bool pretty_print_var = false;
> > + #define SET_PRETTY_PRINT_VAR(x) ((pretty_print_var) ? \
> > + ((x) | PrettyPrint) : (x))
>
> I think that this is not a great idea. I think that we should do away
> with the GUC, but keep the function hstore_print() so we can pretty
> print that way. I don't believe that this falls afoul of the usual
> obvious reasons for not varying the behavior of IO routines with a
> GUC, since it only varies whitespace, but it is surely pretty
> questionable to have this GUC's setting vary the output of hstore_out,
> an IMMUTABLE function.

I don't see this in the submitted patch. What's going on?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 01:09:07
Message-ID: CAM3SWZQgntuwDtXn7nYE835rbVM8peYjSGhObQgOa0rgLY6aQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 5:06 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
>> I think that this is not a great idea. I think that we should do away
>> with the GUC, but keep the function hstore_print() so we can pretty
>> print that way. I don't believe that this falls afoul of the usual
>> obvious reasons for not varying the behavior of IO routines with a
>> GUC, since it only varies whitespace, but it is surely pretty
>> questionable to have this GUC's setting vary the output of hstore_out,
>> an IMMUTABLE function.
>
> I don't see this in the submitted patch. What's going on?

I'm working off the Github branch here, as of an hour ago, since I was
under the impression that the patches submitted are merely snapshots
of that (plus I happen to strongly prefer not dealing with patch files
for something this big). Which submitted patch?

--
Peter Geoghegan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 01:44:33
Message-ID: 530E9881.9030201@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/26/2014 08:09 PM, Peter Geoghegan wrote:
> On Wed, Feb 26, 2014 at 5:06 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
>>> I think that this is not a great idea. I think that we should do away
>>> with the GUC, but keep the function hstore_print() so we can pretty
>>> print that way. I don't believe that this falls afoul of the usual
>>> obvious reasons for not varying the behavior of IO routines with a
>>> GUC, since it only varies whitespace, but it is surely pretty
>>> questionable to have this GUC's setting vary the output of hstore_out,
>>> an IMMUTABLE function.
>> I don't see this in the submitted patch. What's going on?
> I'm working off the Github branch here, as of an hour ago, since I was
> under the impression that the patches submitted are merely snapshots
> of that (plus I happen to strongly prefer not dealing with patch files
> for something this big). Which submitted patch?
>
>

It's in the nested hstore patch. I've been splitting this into two
pieces. See
<http://www.postgresql.org/message-id/530D0646.8020407@dunslane.net> for
the latest hstore piece.

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 02:29:57
Message-ID: CA+TgmoaRH=B25NTb8JqrZtRgPsHk4_DmtkeV+TvX0n18heHSpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 3:45 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 02/26/2014 11:39 AM, Merlin Moncure wrote:
>> On Wed, Feb 26, 2014 at 12:05 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> On 02/26/2014 09:57 AM, Merlin Moncure wrote:
>>>> What is not going to be so clear for users (particularly without good
>>>> supporting documentation) is how things break down in terms of usage
>>>> between hstore and jsonb.
>>>
>>> Realistically? Once we get done with mapping the indexes and operators,
>>> users who are used to Hstore1 use Hstore2, and everyone else uses jsonb.
>>> jsonb is nothing other than a standardized syntax interface to hstore2,
>>> and most users will choose the syntax similar to what they already know
>>> over learning new stuff.
>>
>> The problem is that as of today, they are not done and AFAICT will not
>> be for 9.4.
>
> Well, we plan to push to have the indexes and operators available as an
> extension by the time that 9.4 comes out.

Why can't this whole thing be shipped as an extension? It might well
be more convenient to have the whole thing packaged as an extension
than to have parts of it in core and parts of it not in core.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 02:43:45
Message-ID: CAM3SWZQSkirorRpxOkRag_W4O4bXBaxE5Xdm4OJe3qaoj7j-TA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 6:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Why can't this whole thing be shipped as an extension? It might well
> be more convenient to have the whole thing packaged as an extension
> than to have parts of it in core and parts of it not in core.

That's a good question. I think having everything in contrib would
make it easier to resolve the disconnect between jsonb and hstore. As
things stand, there is a parallel set of functions and operators for
hstore and jsonb, with the former set much larger than the latter. I'm
not terribly happy with that.

--
Peter Geoghegan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 03:42:44
Message-ID: 530EB434.6060204@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/26/2014 09:43 PM, Peter Geoghegan wrote:
> On Wed, Feb 26, 2014 at 6:29 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Why can't this whole thing be shipped as an extension? It might well
>> be more convenient to have the whole thing packaged as an extension
>> than to have parts of it in core and parts of it not in core.
> That's a good question. I think having everything in contrib would
> make it easier to resolve the disconnect between jsonb and hstore. As
> things stand, there is a parallel set of functions and operators for
> hstore and jsonb, with the former set much larger than the latter. I'm
> not terribly happy with that.
>
>

The jsonb set will get larger as time goes on. I don't think either of
you are thinking very clearly about how we would do this. Extensions
can't call each other's code. So the whole notion we have here of
sharing the tree-ish data representation and a lot of the C API would go
out the window, unless you want to shoehorn jsonb into hstore. Frankly,
we'll look silly with json as a core type and the more capable jsonb not.

Not to mention that if at this stage people suddenly decide we should
change direction on a course that has been very publicly discussed over
quite a considerable period, and for which Teodor and I and others have
put in a great deal of work, I at least am going to be extremely annoyed
(note the characteristic Australian used of massive understatement.)

cheers

andrew


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 04:12:25
Message-ID: 20140227041225.GJ2921@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Andrew Dunstan (andrew(at)dunslane(dot)net) wrote:
> The jsonb set will get larger as time goes on. I don't think either
> of you are thinking very clearly about how we would do this.
> Extensions can't call each other's code.

Yeah, that was puzzling me too.

Agree with the rest of your comments as well.

Thanks,

Stephen


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 09:56:54
Message-ID: CAM3SWZSEdsrFscBv3_ono-BMVRZ0w1dC-JB_X5ryhGpkQY_gWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 7:42 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> The jsonb set will get larger as time goes on. I don't think either of you
> are thinking very clearly about how we would do this. Extensions can't call
> each other's code. So the whole notion we have here of sharing the tree-ish
> data representation and a lot of the C API would go out the window, unless
> you want to shoehorn jsonb into hstore. Frankly, we'll look silly with json
> as a core type and the more capable jsonb not.

When are you going to add more jsonb functions? ISTM that you have a
bunch of new ones right here (i.e. the hstore functions and
operators). Why not add those ones right now?

I don't understand why you'd consider it to be a matter of shoehorning
jsonb into hstore (and yes, that is what I was suggesting). jsonb is a
type with an implict cast to hstore, that is binary coercible both
ways. Oleg and Teodor had at one point considered having the ouput
format controlled entirely by a GUC, so there'd be no new jsonb type
at all. While I'm not asserting that you should definitely not
structure things this way (i.e. have substantial in-core changes), it
isn't obvious to me why this can't work as an extension, especially if
doing everything as part of an extension helps the implementation.
Please point out anything that I may have missed.

Speaking from a Heroku perspective, I know the company places a huge
value on jsonb. However, I believe it matters not a whit to adoption
whether or not it's an extension, except insofar as having it be an
extension helps the implementation effort (that is, that it helps
there be something to adopt), or hinders that effort.

--
Peter Geoghegan


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 11:19:46
Message-ID: 530F1F52.10509@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/26/2014 09:17 AM, Christophe Pettus wrote:
> On Feb 25, 2014, at 1:57 PM, Hannu Krosing <hannu(at)2ndQuadrant(dot)com> wrote:
>
>> It is not in any specs, but nevertheless all major imlementations do it and
>> some code depends on it.
> I have no doubt that some code depends on it, but "all major implementations" is
> too strong a statement. BSON, in particular, does not have stable field order.
First, BSON is not JSON :)

And I do not really see how the don't preserve the field order - the
structure
is pretty similar to tnetstrings, just binary concatenation of datums
with a bit
more types.

It is possible that some functions on BSON do not preserve it for some
reason ...

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 11:54:51
Message-ID: CA+TgmoYZj7f3WzkhVZ_6vp=twu7_KvZxnHtkJQ6J6BnkdNqVyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 26, 2014 at 10:42 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> Why can't this whole thing be shipped as an extension? It might well
>>> be more convenient to have the whole thing packaged as an extension
>>> than to have parts of it in core and parts of it not in core.
>>
>> That's a good question. I think having everything in contrib would
>> make it easier to resolve the disconnect between jsonb and hstore. As
>> things stand, there is a parallel set of functions and operators for
>> hstore and jsonb, with the former set much larger than the latter. I'm
>> not terribly happy with that.
>>
> The jsonb set will get larger as time goes on. I don't think either of you
> are thinking very clearly about how we would do this. Extensions can't call
> each other's code. So the whole notion we have here of sharing the tree-ish
> data representation and a lot of the C API would go out the window, unless
> you want to shoehorn jsonb into hstore. Frankly, we'll look silly with json
> as a core type and the more capable jsonb not.

It's not very clear to me why we think it's a good idea to share the
tree-ish representation between json and hstore. In deference to your
comments that this has been very publicly discussed over quite a
considerable period, I went back and tried to find the email in which
the drivers for that design decision were laid out. I can find no
such email; in fact, the first actual nested hstore patch I can find
is from January 13th and the first jsonb patch I can find is from
February 9th. Neither contains anything much more than the patch
itself, without anything at all describing the design, let alone
explaining why it was chosen. And although there are earlier mentions
of both nested hstore and jsonb, there's nothing that says, OK, this
is why we're doing it that way. Or if there is, I couldn't find it.

So I tried to tease it out from looking at the patches. As nearly as
I can tell, the reason for making jsonb use hstore's binary format is
because then we can build indexes on jsonbfield::hstore, and the
actual type conversion will be a no-op; and the reason for upgrading
hstore to allow nested keys is so that jsonb can map onto it. So from
where I sit this whole thing looks like a very complicated exercise to
try to reuse parts of the existing hstore opclasses until such time as
jsonb opclasses of its own. But if, as Josh postulates, those
opclasses are going to materialize within a matter of months, then the
whole need for these things to share the same binary format is going
to go away before 9.4 is even out the door. That may not be a good
enough reason to tie these things together inextricably. Once jsonb
has its own opclasses, it can ship as a standalone data type without
needing to depend on hstore or anything else.

I may well be missing some other benefit here, so please feel free to
enlighten me.

> Not to mention that if at this stage people suddenly decide we should change
> direction on a course that has been very publicly discussed over quite a
> considerable period, and for which Teodor and I and others have put in a
> great deal of work, I at least am going to be extremely annoyed (note the
> characteristic Australian used of massive understatement.)

Unless I've missed some emails sent earlier than the dates noted
above, which is possible, the comments by myself and others on this
thread ought to be regarded as timely review. The basic problem here
is that this patch wasn't timely submitted, still doesn't seem to be
very done, and it's getting rather late. We therefore face the usual
problem of deciding whether to commit something that we might regret
later. If jsonb turns out to the wrong solution to the json problem,
will there be community support for adding a jsonc type next year? I
bet not. You may think this is most definitely the right direction to
go and you may even be right, but our ability to maneuver and back out
of things goes down to nearly zero once a release goes out the door,
so I think it's entirely appropriate to question whether we're
charting the best possible course. But I certainly understand the
annoyance.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: David E(dot) Wheeler <david(at)justatheory(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Geoghegan <pg(at)heroku(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 17:37:46
Message-ID: 262DA211-38DF-410F-8BE7-FD0CDC814B89@justatheory.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Feb 27, 2014, at 3:54 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> It's not very clear to me why we think it's a good idea to share the
> tree-ish representation between json and hstore. In deference to your
> comments that this has been very publicly discussed over quite a
> considerable period, I went back and tried to find the email in which
> the drivers for that design decision were laid out. I can find no
> such email; in fact, the first actual nested hstore patch I can find
> is from January 13th and the first jsonb patch I can find is from
> February 9th. Neither contains anything much more than the patch
> itself, without anything at all describing the design, let alone
> explaining why it was chosen. And although there are earlier mentions
> of both nested hstore and jsonb, there's nothing that says, OK, this
> is why we're doing it that way. Or if there is, I couldn't find it.

FWIW, It was discussed quite a bit in meatspace, at the PGCon unconference last spring.

> Unless I've missed some emails sent earlier than the dates noted
> above, which is possible, the comments by myself and others on this
> thread ought to be regarded as timely review. The basic problem here
> is that this patch wasn't timely submitted, still doesn't seem to be
> very done, and it's getting rather late.

The hstore patch landed in the Nov/Dec patch fest, sent to the list on Nov 12. The discussion that led to the decision to implement jsonb was carried out for the week after that. Here’s the thread:

http://www.postgresql.org/message-id/528274F3.3060403@sigaev.ru

There was also quite a bit of discussion that week in the “additional json functionality” thread.

http://www.postgresql.org/message-id/528274D0.7070709@dunslane.net

I submitted a review of hstore2, adding documentation, on Dec 20. Andrew got the patch updated with jsonb type, per discussion, and based on a first cut by Teodor, in January, I forget when. v7 was sent to the list on Jan 29. So while some stuff has been added a bit late, it was based on discussion and the example of hstore's code.

I think you might have missed quite a bit of the earlier discussion because it was in an hstore thread, not a JSON or JSONB thread.

> We therefore face the usual
> problem of deciding whether to commit something that we might regret
> later. If jsonb turns out to the wrong solution to the json problem,
> will there be community support for adding a jsonc type next year? I
> bet not.

Bit of a red herring, that. You could make that argument about just about *any* data type. I realize it's more loaded for object data types, but personally I have a hard time imagining something other than a text-based type or a binary type. There was disagreement as to whether the binary type should replace the text type, and the consensus of the discussion was to have both. (And then we had 10,000 messages bike-sheadding the name of the binary type, naturally.)

> You may think this is most definitely the right direction to
> go and you may even be right, but our ability to maneuver and back out
> of things goes down to nearly zero once a release goes out the door,
> so I think it's entirely appropriate to question whether we're
> charting the best possible course. But I certainly understand the
> annoyance.

Like the hstore type, the jsonb type has a version bit, so if we decide to change its representation to make it more efficient in the future, we will be able to do so without having to introduce a new type. Maybe someday we will want a completely different JSON implementation based on genetic mappings or quantum superpositions or something, but I would not hold up the ability to improve the speed of accessing values, let alone full path indexing via GIN indexing, because we might want to do something different in the future. Besides, hstore has proved itself pretty well over time, so I think it’s pretty safe to adopt its implementation to make an awesome jsonb type.

Best,

David


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 19:11:09
Message-ID: 530F8DCD.40401@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/27/2014 01:56 AM, Peter Geoghegan wrote:
> I don't understand why you'd consider it to be a matter of shoehorning
> jsonb into hstore (and yes, that is what I was suggesting).

Because the course Andrew is following is the one which *this list*
decided on in CF3, no matter that people who participated in that
discussion seem to have collective amnesia. There was a considerable
amount of effort involved in implementing things this way, so if Hackers
suddenly want to retroactively change a collective decision, I think
they should be prepared to pitch in and help implement the changed plan.

One of the issues there is that, due to how we handle types, a type
which has been available as an extension can never ever become a core
type because it breaks upgrading, per the discussion about hstore2. For
better or for worse, we chose to make json-text a core type when it was
introduced (and XML before it, although that was before CREATE
EXTENSION). This means that, if we have jsonb as an extension, we'll
eventually be in the position where the recommended json type with all
the features is an extension, whereas the legacy json type is in core.

However, we had this discussion already in November-December, which
resulted in the current patch. Now you and Robert want to change the
rules on Andrew, which means Andrew is ready to quit, and we go another
year without JSON indexing.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 20:06:33
Message-ID: 530F9AC9.3080807@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/26/2014 05:45 PM, Andres Freund wrote:
> On 2014-02-26 16:23:12 -0500, Andrew Dunstan wrote:
>> On 02/10/2014 09:11 PM, Andres Freund wrote:
>>> Is it just me or is jsonapi.h not very well documented?
>>
>> What about it do you think is missing? In any case, it's hardly relevant to
>> this patch, so I'll take that as obiter dicta.
> It's relevant insofer because I tried to understand it, to understand
> whether this patch's usage is sensible.
>
> O n a quick reread of the header, what I am missing is:
> * what's semstate in JsonSemAction? Private data?
> * what's object_start and object_field_start? Presumably object vs
> keypair? Why not use element as ifor the array?
> * scalar_action is called for which types of tokens?
> * what's exactly the meaning of the isnull parameter for ofield_action
> and aelem_action?
> * How is one supposed to actually access data in the callbacks, not
> obvious for all the callbacks.
> * are scalar callbacks triggered for object keys, object/array values?
> ...

You realize that this API dates from 9.3 and has been used in numerous
extensions, right? So the names are pretty well fixed, for good or ill.

semstate is private data. This is at least implied:

* parse_json will parse the string in the lex calling the
* action functions in sem at the appropriate points. It is
* up to them to keep what state they need in semstate. If they
* need access to the state of the lexer, then its pointer
* should be passed to them as a member of whatever semstate
* points to.

object_start is called, as its name suggests, at the start of on object.
object_field_start is called at the start of a key/value pair.

isnull is true iff the value in question is a json null.

scalar action as not called for object keys, but is called for scalar
object values or array elements, in fact for any value that's not an
object or array (i.e. for a (non-key) string, number, true, false, null).

You access json fragments by pulling them from the lexical object.
jsonfuncs.c is chock full of examples.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 21:28:34
Message-ID: CAHyXU0w-VGHw8V-0Vn8hoYwYVATi3wU3+thUA6-UV=cQar6ZEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 1:11 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> However, we had this discussion already in November-December, which
> resulted in the current patch. Now you and Robert want to change the
> rules on Andrew, which means Andrew is ready to quit, and we go another
> year without JSON indexing.

How we got here is not the point. All that matters is what's going to
happen from here. Here are the facts as I see them:

1) we've worked ourselves into a situation where we're simultaneously
developing two APIs that do essentially exactly the same thing (hstore
and jsonb). Text json is not the problem and is irrelevant to the
discussion.

2) The decision to do that was made a long time ago. I complained
loudly as my mousy no-programming-only-griping voice would allow here:
http://postgresql.1045698.n5.nabble.com/JSON-Function-Bike-Shedding-tp5744932p5746152.html.
The decision was made (and Robert cast one of the deciding votes in
support of that decision) to bifurcate hstore/json. I firmly believe
that was a mistake but there's no point in revisiting it. Done is
done.

3) In it's current state jsonb is not very useful and we have to
recognize that; it optimizes text json but OTOH covers, maybe 30-40%
of what hstore offers. In particular, it's missing manipulation and
GIST/GIN. The stuff it does offer however is how Andrew, Josh and
others perceive the API will be used and I defer to them with the
special exception of deserialization (the mirror of to_json) which is
currently broken or near-useless in all three types. Andrew
recognized that and has suggested a fix; even then to me it only
matters to the extent that the API is clean and forward compatible.

Here are the options on the table:
a) Push everything to 9.5 and introduce out of core hstore2/jsonb
extensions to meet market demand. Speaking practically, 'out of core'
translates to "Can't be used" to most industrial IT shops. I hate
this option but recognize it's the only choice if the code isn't ready
in time.

b) Accept hstore2 but push jsonb on the premise they should be married
in some way or that jsonb simply isn't ready. I'm not a fan of this
option either unless Andrew specifically thinks it's a good idea. The
stuff that is there seems to work pretty well (again, except
deserialization which I haven't tested recently) and the jsonb
patterns that are in place have some precedent in terms of the text
json type.

c) Accept hstore2 and jsonb as in-core extensions (assuming code
worthiness). Since extensions can't call into each other (this really
ought to be solved at some point) this means a lot of code copy/pasto.
The main advantage here is that it reduces the penalty of failure
and avoids pollution of the public schema. I did not find the
rationale upthread that there was a stigma to in-core extensions in
any way convincing. In fact I'd go further and suggest that we really
ought to have a project policy to have all non-SQL standard functions,
operators and types as extensions from here on out. Each in-core type
introduction after having introduced the extension system has left me
scratching my head.

d) The status quo. This essentially means we'll have to liberally
document how things are (to avoid confusing our hapless users) and
take Andrew at his word that a separate extension will materialize
making jsonb more broadly useful. The main concern here is that the
market will vote with their feet and adopt hstore API style broadly,
sticking us with a bunch of marginally used functions in the public
namespace to support forever.

My personal preference is c) but am perfectly ok with d), particularly
if there was more visibility into the long term planning. Good
documentation will help either way and that's why I signed up for it.

merlin


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-27 21:49:16
Message-ID: CAM3SWZR2mWUNFoQdWQmEsJsvaEBqq6jhfCM1Wevwc7r=tPFuRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 11:11 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Because the course Andrew is following is the one which *this list*
> decided on in CF3, no matter that people who participated in that
> discussion seem to have collective amnesia. There was a considerable
> amount of effort involved in implementing things this way, so if Hackers
> suddenly want to retroactively change a collective decision, I think
> they should be prepared to pitch in and help implement the changed plan.

I think you've completely misunderstood my remarks. For the most part
I agree that there are advantages to having hstore and jsonb share the
same tree representation, and this may be where Robert and I differ.
My concern is: Having gone to that considerable amount of effort, why
on earth does jsonb not get the benefit of the hstore stuff
immediately, since it's virtually the same thing? What is it that
we're actually being asked to wait for?

Let me be more concrete about what my concern is right now:

postgres=# select '{"foo":{"bar":"yellow"}}'::jsonb || '{}'::jsonb;
?column?
--------------------------
"foo"=>{"bar"=>"yellow"}
(1 row)

I put in jsonb, but got out hstore, for this totally innocent use of
the concatenation operator. Now, maybe the answer here is that we
require people to cast for this kind of thing while using jsonb. The
problems I see with that are:

1. It's pretty ugly, in a way that people that care about jsonb are
particularly unlikely to find acceptable. When you mix in GIN/GiST
compatibility to the mix, it gets uglier.

2. Don't we already have a much simpler way of casting from hstore to json?

> One of the issues there is that, due to how we handle types, a type
> which has been available as an extension can never ever become a core
> type because it breaks upgrading, per the discussion about hstore2. For
> better or for worse, we chose to make json-text a core type when it was
> introduced (and XML before it, although that was before CREATE
> EXTENSION). This means that, if we have jsonb as an extension, we'll
> eventually be in the position where the recommended json type with all
> the features is an extension, whereas the legacy json type is in core.

I take issue with characterizing the original json type as legacy
(it's json, not anything else - jsonb isn't quite json, much like
BSON), but leaving that aside: So? I mean, really: what are the
practical consequences of packing everything as an extension? I can
see some benefits to doing it, but like Robert I have a harder time
seeing a cost.

To be clear: I would really like for jsonb to have parity with hstore.
I don't understand how you can argue for it being unfortunate that the
original json may occupy a privileged position as a core type over
jsonb on the one hand, while not also taking issue with jsonb clearly
playing second fiddle to hstore. Wasn't the whole point of their
sharing a binary representation that that didn't have to happen?

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 00:27:57
Message-ID: CAM3SWZSuf=9GPrcXHCcVBtnmk+rc_mq=p_qeuEkbYFvytS=dXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 3:54 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> So I tried to tease it out from looking at the patches. As nearly as
> I can tell, the reason for making jsonb use hstore's binary format is
> because then we can build indexes on jsonbfield::hstore, and the
> actual type conversion will be a no-op; and the reason for upgrading
> hstore to allow nested keys is so that jsonb can map onto it.

I think that a typed, nested hstore has considerable independent
value, and would have had just the same value 10 years ago, before
JSON existed. I'm told that broadly speaking most people would prefer
the interface to speak JSON, and I'd like to give people what they
want, but that's as far as it goes. While I see problems with some
aspects of the patches as implemented, I think that the reason that
the two types share a binary format is that they're basically the same
thing. It might be that certain facets of the nested hstore
implementation reflect a need to accommodate jsonb, but there are no
ones that I'm currently aware of that I find at all objectionable.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 01:31:29
Message-ID: CAM3SWZSLybxywH6p2pGhHFGZMzkHqBWkfr83mrzQVsoyqFB9xw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 1:28 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> 3) In it's current state jsonb is not very useful and we have to
> recognize that; it optimizes text json but OTOH covers, maybe 30-40%
> of what hstore offers. In particular, it's missing manipulation and
> GIST/GIN. The stuff it does offer however is how Andrew, Josh and
> others perceive the API will be used and I defer to them with the
> special exception of deserialization (the mirror of to_json) which is
> currently broken or near-useless in all three types. Andrew
> recognized that and has suggested a fix; even then to me it only
> matters to the extent that the API is clean and forward compatible.

It's missing manipulation (in the sense that the implicit cast
sometimes produces surprising results, in particular for operators
that return hstore), but it isn't really missing GiST/GIN support as
compared to hstore, AFAICT:

postgres=# select * from foo;
i
-------------------------------
{"foo": {"bar": "yellow"}}
{"foozzz": {"bar": "orange"}}
{"foozzz": {"bar": "orange"}}
(3 rows)

postgres=# select * from foo where i ? 'foo';
i
----------------------------
{"foo": {"bar": "yellow"}}
(1 row)

postgres=# explain analyze select * from foo where i ? 'foo';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on foo (cost=12.00..16.01 rows=1 width=32) (actual
time=0.051..0.051 rows=1 loops=1)
Recheck Cond: ((i)::hstore ? 'foo'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on hidxb (cost=0.00..12.00 rows=1 width=0)
(actual time=0.041..0.041 rows=1 loops=1)
Index Cond: ((i)::hstore ? 'foo'::text)
Planning time: 0.172 ms
Total runtime: 0.128 ms
(7 rows)

Now, it's confusing that it has to go through hstore, perhaps, but
that's hardly all that bad in and of itself. It may be a matter of
reconsidering how to make the two work together. Certainly, queries
like the following fail, because the parser thinks the rhs string is
an hstore literal, not a jsonb literal:

postgres=# select * from foo where i @> '{"foo":4}';
ERROR: 42601: bad hstore representation
LINE 1: select * from foo where i @> '{"foo":4}';
^
DETAIL: syntax error, unexpected STRING_P, expecting '}' or ',' at end of input
LOCATION: hstore_yyerror, hstore_scan.l:172

Other than that, I'm not sure in what sense you consider that jsonb is
"missing GIN/GiST". If you mean that it doesn't have some of the
capabilities that I believe are planned for the VODKA infrastructure
[1], which one might hope to have immediately available to index this
new nested structure, that is hardly a criticism of jsonb in
particular.

[1] http://www.pgcon.org/2014/schedule/events/696.en.html

--
Peter Geoghegan


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 01:54:05
Message-ID: 530FEC3D.2080709@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/27/2014 01:28 PM, Merlin Moncure wrote:
> How we got here is not the point. All that matters is what's going to
> happen from here. Here are the facts as I see them:

Well, it certainly matters if we want it in this release.

As far as I can tell, moving jsonb to contrib basically requires
rewriting a bunch of code, without actually fixing any of the bugs which
have been discussed in the more technical reviews. I'm really unclear
what, at this point, moving jsonb to /contrib would improve.

On 02/27/2014 04:27 PM, Peter Geoghegan wrote:
> I think that a typed, nested hstore has considerable independent
> value, and would have had just the same value 10 years ago, before
> JSON existed. I'm told that broadly speaking most people would prefer
> the interface to speak JSON, and I'd like to give people what they
> want, but that's as far as it goes. While I see problems with some
> aspects of the patches as implemented, I think that the reason that
> the two types share a binary format is that they're basically the same
> thing. It might be that certain facets of the nested hstore
> implementation reflect a need to accommodate jsonb, but there are no
> ones that I'm currently aware of that I find at all objectionable.

We discussed this with Oleg & Teodor at pgCon 2013. From the
perspective of several of us, we were mystified as to why hstore2 has
it's own syntax at all; that is, why not just implement the JSONish
syntax? Their answer was to provide a smooth upgrade path to existing
hstore users, which makes sense. This was also the reason for not
making hstore a core type.

But again ... we discussed all of this at pgCon and in
November-December. It's not like the people on this thread now weren't
around for both of those discussions.

And it's not just that "broadly speaking most people would prefer
the interface to speak JSON"; it's that a JSONish interface for indexed
heirachical data is a Big Feature which will drive adoption among web
developers, and hstore2 without JSON support simply is not. At trade
shows and developer conferences, I get more questions about PostgreSQL's
JSON support than I do for any new feature since streaming replication.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 01:55:03
Message-ID: 2F46BB9F-E44A-4412-9AFA-4B6D25834A29@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 27, 2014, at 5:31 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> Now, it's confusing that it has to go through hstore, perhaps, but
> that's hardly all that bad in and of itself.

Yes, it is. It strikes me as irrational to have jsonb depend on hstore. Let's be honest with ourselves: if we were starting over, we wouldn't start by creating our own proprietary hierarchical type and then making the hierarchical type everyone else uses depend on it. hstore exists because json didn't. But json does now, and we shouldn't create a jsonb dependency on hstore.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 02:02:36
Message-ID: 530FEE3C.2060103@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/28/2014 09:54 AM, Josh Berkus wrote:
> On 02/27/2014 01:28 PM, Merlin Moncure wrote:
>> How we got here is not the point. All that matters is what's going to
>> happen from here. Here are the facts as I see them:
>
> Well, it certainly matters if we want it in this release.
>
> As far as I can tell, moving jsonb to contrib basically requires
> rewriting a bunch of code, without actually fixing any of the bugs which
> have been discussed in the more technical reviews. I'm really unclear
> what, at this point, moving jsonb to /contrib would improve.

It's also make it a lot harder to use in other extensions, something
that's already an issue with hstore.

It should be a core type.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 02:04:10
Message-ID: CAM3SWZQsYbp=sQ3K9dP_f5=hRgym+Xc2cx+pnu5vXpJKmwfWSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 6:02 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> It's also make it a lot harder to use in other extensions, something
> that's already an issue with hstore.

What do you mean?

--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 02:08:50
Message-ID: 20140228020850.GS2921@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> On Thu, Feb 27, 2014 at 6:02 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> > It's also make it a lot harder to use in other extensions, something
> > that's already an issue with hstore.
>
> What do you mean?

Extensions can't depend on other extensions directly- hence you can't
write an extension that depends on hstore, which sucks. It'd be
preferrable to not have that issue w/ json/jsonb/whatever.

Yes, it'd be nice to solve that problem, but I don't see it happening in
the next few weeks...

Thanks,

Stephen


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 02:17:53
Message-ID: CAM3SWZR7XJP_iRwsBTETyFDZq6ZsvzEGVpHd8vwj8fRtBzJE5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 6:08 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> On Thu, Feb 27, 2014 at 6:02 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
>> > It's also make it a lot harder to use in other extensions, something
>> > that's already an issue with hstore.
>>
>> What do you mean?
>
> Extensions can't depend on other extensions directly- hence you can't
> write an extension that depends on hstore, which sucks. It'd be
> preferrable to not have that issue w/ json/jsonb/whatever.

I think it depends of what you mean by "depend". The earthdistance
extension "requires" 'cube', for example, "a data type cube for
representing multidimensional cubes". Although I am aware of the
lengths that drivers like psycopg2 go to to support hstore because
it's an extension, which is undesirable.

--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 02:27:01
Message-ID: 20140228022700.GV2921@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> On Thu, Feb 27, 2014 at 6:08 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Extensions can't depend on other extensions directly- hence you can't
> > write an extension that depends on hstore, which sucks. It'd be
> > preferrable to not have that issue w/ json/jsonb/whatever.
>
> I think it depends of what you mean by "depend". The earthdistance
> extension "requires" 'cube', for example, "a data type cube for
> representing multidimensional cubes". Although I am aware of the
> lengths that drivers like psycopg2 go to to support hstore because
> it's an extension, which is undesirable.

What earthdistance does is simply use the 'cube' data type- that's quite
different from needing to be able to make calls from one .so into the
other .so directly. With earthdistance/cube, everything goes through
PG.

Thanks,

Stephen


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 03:09:22
Message-ID: 530FFDE2.6020905@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2/27/14, 2:11 PM, Josh Berkus wrote:
> This means that, if we have jsonb as an extension, we'll
> eventually be in the position where the recommended json type with all
> the features is an extension, whereas the legacy json type is in core.

Well that wouldn't be a new situation. Compare geometry types vs
postgis, inet vs ip4(r). It's not bad being an extension. You can
iterate faster and don't have to discuss so much. ;-)


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 03:09:52
Message-ID: CAM3SWZT9O-sSjSxDYQNi7LLEzFMJATCTSoHLmn5Ep=OEtX7s2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 5:54 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> And it's not just that "broadly speaking most people would prefer
> the interface to speak JSON"; it's that a JSONish interface for indexed
> heirachical data is a Big Feature which will drive adoption among web
> developers, and hstore2 without JSON support simply is not. At trade
> shows and developer conferences, I get more questions about PostgreSQL's
> JSON support than I do for any new feature since streaming replication.

I work for Heroku; believe me, I get it. I'd go along with abandoning
nested hstore as a user-visible thing if I thought it bought jsonb
something and I thought we could, but I have doubts about that.

I understand why the nested hstore approach was taken. It isn't that
desirable to maintain something like a jsonb in parallel, while also
having the old key/value, untyped hstore. They are still fairly
similar as these things go. Robert said something about re-using op
classes rather than waiting for new op classes to be developed, but
why do we need to wait? These ones look like they work fine - what
will be better about the ones we develop later that justifies their
independent existence? Why should we believe that they won't just be
copied and pasted? The major problem is that conceptually, hstore
"owns" them (which is at least in part due to old hstore code rather
than nested hstore code), and so we need a better way to make that
work. We need some commonality and variability analysis, because
duplicating large amounts of hstore isn't very appealing.

> As far as I can tell, moving jsonb to contrib basically requires
> rewriting a bunch of code, without actually fixing any of the bugs which
> have been discussed in the more technical reviews. I'm really unclear
> what, at this point, moving jsonb to /contrib would improve.

These are all of the additions to core, excluding regression tests and docs:

***SNIP***
src/backend/catalog/system_views.sql | 8 +
src/backend/utils/adt/Makefile | 2 +-
src/backend/utils/adt/json.c | 44 ++--
src/backend/utils/adt/jsonb.c | 455 ++++++++++++++++++++++++++++++++
src/backend/utils/adt/jsonb_support.c | 1268
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
src/backend/utils/adt/jsonfuncs.c | 1159
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
src/include/catalog/pg_cast.h | 4 +
src/include/catalog/pg_operator.h | 12 +
src/include/catalog/pg_proc.h | 44 +++-
src/include/catalog/pg_type.h | 6 +
src/include/funcapi.h | 9 +
src/include/utils/json.h | 15 ++
src/include/utils/jsonapi.h | 8 +-
src/include/utils/jsonb.h | 245 +++++++++++++++++
**SNIP**

It's not immediately obvious to me why moving that into contrib
requires much work at all (relatively speaking), especially since
that's where much of it came from to begin with, although I grant that
I don't grok the patch.

Here is the line of reasoning that suggests to me that putting jsonb
in contrib is useful:

* It is not desirable to maintain some amount of common code between
hstore (as it exists today) and jsonb. This is of course a question of
degree (not an absolute), so feel free to call me out on the details
here, but I'm of the distinct impression that jsonb doesn't have that
much of an independent existence from hstore - what you could loosely
call "the jsonb parts" includes in no small part historic hstore code,
and not just new nested hstore code (that could reasonably be broken
out if we decided to jettison nested hstore as a user-visible thing
and concentrated on jsonb alone, as you would have us do). In other
words, Oleg and Teodor built nested hstore on hstore because of
practical considerations, and not just because they were attached to
hstore's perl-like syntax. They didn't start from scratch because that
was harder, or didn't make sense.

* We can't throw hstore users under the bus. It has to stay in contrib
for various reasons.

* It hardly makes any sense to have an in-core jsonb if it comes with
no batteries included. You need to install hstore for this jsonb
implementation to be of *any* use anyway. When you don't have the
extension installed, expect some really confusing error messages when
you go to create a GIN index. jsonb is no use on its own; why not just
make it all or nothing?

Another way of resolving this tension might be to push a lot more of
hstore into core than is presently proposed, but that seems like a
more difficult solution with little to no upside.
--
Peter Geoghegan


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 03:10:22
Message-ID: 530FFE1E.7060302@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2/26/14, 10:42 PM, Andrew Dunstan wrote:
> Extensions can't call each other's code.

That's not necessarily so.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Craig Ringer <craig(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 03:15:44
Message-ID: 530FFF60.4030003@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2/27/14, 9:08 PM, Stephen Frost wrote:
> Extensions can't depend on other extensions directly- hence you can't
> write an extension that depends on hstore, which sucks.

Sure they can, see transforms.

(Or if you disagree, download that patch and demo it, because I'd like
to know. ;-) )


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 03:27:55
Message-ID: 5310023B.9080900@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/27/2014 10:09 PM, Peter Geoghegan wrote:

> * It hardly makes any sense to have an in-core jsonb if it comes with
> no batteries included. You need to install hstore for this jsonb
> implementation to be of *any* use anyway.

This is complete nonsense. Right out of the box today a considerable
number of the json operations are likely to be considerable faster.

cheers

andrew


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 04:04:00
Message-ID: CAM3SWZT7yD+mNYfPdPYBCi_dsZfJ5FC6ZoNOoOadr_u82X54YA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 7:27 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> On 02/27/2014 10:09 PM, Peter Geoghegan wrote:
>
>> * It hardly makes any sense to have an in-core jsonb if it comes with
>> no batteries included. You need to install hstore for this jsonb
>> implementation to be of *any* use anyway.
>
>
>
> This is complete nonsense. Right out of the box today a considerable number
> of the json operations are likely to be considerable faster.

We need the hstore operator classes to have something interesting.
That's what those people at trade shows and developer conferences that
Josh refers to actually care about. But in any case, even that's kind
of beside the point.

I'm hearing a lot about how important jsonb is, but not much on how to
make the simple jsonb cases that are currently broken (as illustrated
by my earlier examples [1], [2]) work. Surely you'd agree that those
are problematic. We need a better solution than an implicit cast. What
do you propose? I think we might be able to fix at least some things
with judicious use of function overloading, or we could if it didn't
seem incongruous to have to do so given the role of the hstore module
in the extant patch.

[1] http://www.postgresql.org/message-id/CAM3SWZR2mWUNFoQdWQmEsJsvaEBqq6jhfCM1Wevwc7r=tPFuRw@mail.gmail.com

[2] http://www.postgresql.org/message-id/CAM3SWZSLybxywH6p2pGhHFGZMzkHqBWkfr83mrzQVsoyqFB9xw@mail.gmail.com
--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 04:05:33
Message-ID: 20140228040533.GA2921@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter,

* Peter Eisentraut (peter_e(at)gmx(dot)net) wrote:
> On 2/27/14, 9:08 PM, Stephen Frost wrote:
> > Extensions can't depend on other extensions directly- hence you can't
> > write an extension that depends on hstore, which sucks.
>
> Sure they can, see transforms.
>
> (Or if you disagree, download that patch and demo it, because I'd like
> to know. ;-) )

The issue is if there's a direct reference from one extension to another
extension- we're talking C level function call here. If the extensions
aren't loaded in the correct order then you'll run into problems. I've
not tried to work out getting one to actually link to the other, so
they're pulled in together, but it doesn't strike me as great answer
either. Then there's the questions around versioning, etc...

Presumably, using shared_preload_libraries would work to get the .so's
loaded in the right order, but it doesn't strike me as appropriate to
require that.

And, for my 2c, I'd like to see jsonb as a built-in type *anyway*. Even
if it's possible to fight with things and make inter-extension
dependency work, it's not trivial and would likely discourage new
developers trying to use it.

Thanks,

Stephen


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 04:07:14
Message-ID: CAM3SWZSBe=BQ-ZTwbrDZtTC8qC0RXJm3hwEqQmRgaGXVG2qxkA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 8:05 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> And, for my 2c, I'd like to see jsonb as a built-in type *anyway*. Even
> if it's possible to fight with things and make inter-extension
> dependency work, it's not trivial and would likely discourage new
> developers trying to use it.

I'm not advocating authoring two extensions. I am tentatively
suggesting that we look at one extension for everything. That may well
be the least worst thing.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 04:08:09
Message-ID: CAM3SWZTpPxYz_HwMJq9Q0CFjO7o+vBN4WZp5eEJNFpL997PtHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 8:07 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Thu, Feb 27, 2014 at 8:05 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> And, for my 2c, I'd like to see jsonb as a built-in type *anyway*. Even
>> if it's possible to fight with things and make inter-extension
>> dependency work, it's not trivial and would likely discourage new
>> developers trying to use it.
>
> I'm not advocating authoring two extensions. I am tentatively
> suggesting that we look at one extension for everything. That may well
> be the least worst thing.

(Not that it's clear that you imagined I was, but I note it all the same).

--
Peter Geoghegan


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 04:21:30
Message-ID: 53100ECA.1090307@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/27/2014 05:54 PM, Josh Berkus wrote:
>

>
> And it's not just that "broadly speaking most people would prefer
> the interface to speak JSON"; it's that a JSONish interface for indexed
> heirachical data is a Big Feature which will drive adoption among web
> developers, and hstore2 without JSON support simply is not. At trade
> shows and developer conferences, I get more questions about PostgreSQL's
> JSON support than I do for any new feature since streaming replication.
>

Just to back this up. This is not anecdotal. I have multiple customers
performing very large development projects right now. Every single one
of them is interested in the pros/cons of using PostgreSQL and JSON.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
a rose in the deeps of my heart. - W.B. Yeats


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 04:23:51
Message-ID: 67260E8D-E229-48FC-A6F1-4A1FCBF59908@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 27, 2014, at 8:04 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> I'm hearing a lot about how important jsonb is, but not much on how to
> make the simple jsonb cases that are currently broken (as illustrated
> by my earlier examples [1], [2]) work.

Surely, the answer is to define a jsonb || jsonb (and likely the other combinatorics of json and jsonb), along with the appropriate GIN and GiST interfaces for jsonb. Why would that not work?

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 04:31:16
Message-ID: CAM3SWZTzic3XRLTM424KGYSwYhP2gVNJ4PLJwfCYpBwSwK3sPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 8:23 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
> On Feb 27, 2014, at 8:04 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>
>> I'm hearing a lot about how important jsonb is, but not much on how to
>> make the simple jsonb cases that are currently broken (as illustrated
>> by my earlier examples [1], [2]) work.
>
> Surely, the answer is to define a jsonb || jsonb (and likely the other combinatorics of json and jsonb), along with the appropriate GIN and GiST interfaces for jsonb. Why would that not work?

I'm not the one opposed to putting jsonb stuff in the hstore module!

--
Peter Geoghegan


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 04:43:10
Message-ID: F1CF141E-0238-482A-BDDC-82FEA10E8B1C@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 27, 2014, at 8:31 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> On Thu, Feb 27, 2014 at 8:23 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
>> Surely, the answer is to define a jsonb || jsonb (and likely the other combinatorics of json and jsonb), along with the appropriate GIN and GiST interfaces for jsonb. Why would that not work?
>
> I'm not the one opposed to putting jsonb stuff in the hstore module!

My proposal is that we break the dependencies of jsonb (at least, at the user-visible level) on hstore2, thus allowing it in core successfully. jsonb || jsonb returning hstore seems like a bug to me, not a feature we should be supporting.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 05:12:04
Message-ID: 53101AA4.70201@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/28/2014 12:43 PM, Christophe Pettus wrote:
> My proposal is that we break the dependencies of jsonb (at least, at the user-visible level) on hstore2, thus allowing it in core successfully. jsonb || jsonb returning hstore seems like a bug to me, not a feature we should be supporting.

Urgh, really?

That's not something I'd be excited to be stuck with into the future.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 05:28:52
Message-ID: CAM3SWZTMx=Q_nmJN5LEUPZwBRxTL0h=3OyZfkSW_o2UM7Qht9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 8:43 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
>> I'm not the one opposed to putting jsonb stuff in the hstore module!
>
> My proposal is that we break the dependencies of jsonb (at least, at the user-visible level) on
> hstore2, thus allowing it in core successfully. jsonb || jsonb returning hstore seems like a bug
> to me, not a feature we should be supporting.

Of course it's a bug.

The only problem with that is now you have to move the implementation
of ||, plus a bunch of other hstore operators into core. That seems
like a more difficult direction to move in from a practical
perspective, and I'm not sure that you won't hit a snag elsewhere. But
you must do this in order to make what you describe work; obviously
you can't break jsonb's dependency on hstore if users must have hstore
installed to get a || operator. In short, jsonb and hstore are tied at
the hip (which I don't think is unreasonable), and if you insist on
having one in core, you almost need to have both there (with hstore
proper perhaps just consisting of stub functions and io routines).

I don't understand the aversion to putting jsonb in the hstore
extension. What's wrong with having the code live in an extension,
really? I suppose that putting it in core would be slightly preferable
given the strategic importance of jsonb, but it's not something that
I'd weigh too highly. Right now, I'm much more concerned about finding
*some* way of integrating jsonb that is broadly acceptable.

--
Peter Geoghegan


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Craig Ringer <craig(at)2ndQuadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 05:28:59
Message-ID: 971DA843-E21B-49E4-B98C-397B6403B90B@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 27, 2014, at 9:12 PM, Craig Ringer <craig(at)2ndQuadrant(dot)com> wrote:

> On 02/28/2014 12:43 PM, Christophe Pettus wrote:
>> My proposal is that we break the dependencies of jsonb (at least, at the user-visible level) on hstore2, thus allowing it in core successfully. jsonb || jsonb returning hstore seems like a bug to me, not a feature we should be supporting.
>
> Urgh, really?
>
> That's not something I'd be excited to be stuck with into the future.

The reason that we're even here is that there's no jsonb || jsonb operator (or the other operators that one would expect).

If you try || without the hstore, you get an error, of course:

postgres=# select '{"foo":{"bar":"yellow"}}'::jsonb || '{}'::jsonb;
ERROR: operator does not exist: jsonb || jsonb
LINE 1: select '{"foo":{"bar":"yellow"}}'::jsonb || '{}'::jsonb;
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.

The reason it works with hstore installed is that there's an implicit cast from hstore to jsonb:

postgres=# create extension hstore;
CREATE EXTENSION
postgres=# select '{"foo":{"bar":"yellow"}}'::jsonb || '{}'::jsonb;
?column?
--------------------------
"foo"=>{"bar"=>"yellow"}
(1 row)

--

But I think we're piling broken on broken here. Just creating an appropriate jsonb || jsonb operator solves this problem. That seems the clear route forward.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 05:35:29
Message-ID: E99DC525-D05C-4EA7-A4CD-97E7E72EE603@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 27, 2014, at 9:28 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> The only problem with that is now you have to move the implementation
> of ||, plus a bunch of other hstore operators into core. That seems
> like a more difficult direction to move in from a practical
> perspective, and I'm not sure that you won't hit a snag elsewhere.

Implementing operators for new types in PostgreSQL is pretty well-trod ground. I really don't know what snags we might hit.

> I suppose that putting it in core would be slightly preferable
> given the strategic importance of jsonb, but it's not something that
> I'd weigh too highly.

I'm completely unsure how to parse the idea that something is strategically important but we shouldn't put it in core. If json was important enough to make it into core, jsonb certainly is.

Honestly, I really don't understand the resistance to putting jsonb in core. There are missing operators, yes; that's a very straight-forward hole to plug.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 05:59:44
Message-ID: CAM3SWZTyWBgAL4XA8sG_14c8fSdR0G7EgDQDEej_PAsfyEzjHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 9:35 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
>> The only problem with that is now you have to move the implementation
>> of ||, plus a bunch of other hstore operators into core. That seems
>> like a more difficult direction to move in from a practical
>> perspective, and I'm not sure that you won't hit a snag elsewhere.
>
> Implementing operators for new types in PostgreSQL is pretty well-trod ground. I really don't know what snags we might hit.

I don't find that very reassuring.

>> I suppose that putting it in core would be slightly preferable
>> given the strategic importance of jsonb, but it's not something that
>> I'd weigh too highly.
> I'm completely unsure how to parse the idea that something is strategically important but we shouldn't put it in core. If json was important enough to make it into core, jsonb certainly is.

That is completely orthogonal to everything I've said. To be clear:
I'm not suggesting that we don't put jsonb in core because it's not
important enough - it has nothing to do with that whatsoever - and
besides, I don't understand why an extension is seen as not befitting
of a more important feature.

> Honestly, I really don't understand the resistance to putting jsonb in core. There are missing operators, yes; that's a very straight-forward hole to plug.

You are basically suggesting putting all of hstore in core, because
jsonb and hstore are approximately the same thing. That seem quite a
bit more controversial than putting everything in the hstore
extension. I doubt that you can reasonably take any half measure
between those two extremes, and one seems a lot less controversial
than the other. This patch already seems controversial enough to me.
It's as simple as that.

--
Peter Geoghegan


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 06:12:02
Message-ID: 20140228061202.GA15628@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-27 22:10:22 -0500, Peter Eisentraut wrote:
> On 2/26/14, 10:42 PM, Andrew Dunstan wrote:
> > Extensions can't call each other's code.
>
> That's not necessarily so.

I don't think we have portable infrastructure to it properly yet,
without a detour via the fmgr. If I am wrong, what's the infrastructure?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 07:02:21
Message-ID: C2417A54-0937-4069-862D-CC7A369EDF1C@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 27, 2014, at 9:59 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> I don't find that very reassuring.

Obviously, we have to try it, and that will decide it.

> I don't understand why an extension is seen as not befitting
> of a more important feature.

contrib/ is considered a secondary set of features; I routinely get pushback from clients about using hstore because it's not in core, and they are thus suspicious of it. The educational project required to change that far exceeds any technical work we are talking about here.. There's a very large presentational difference between having a feature in contrib/ and in core, at the minimum, setting aside the technical issues (such as the extensions-calling-extensions problem).

We have an existence proof of this already: if there was absolutely no difference between having things being in contrib/ and being in core, full text search would still be in contrib/.

> You are basically suggesting putting all of hstore in core, because
> jsonb and hstore are approximately the same thing. That seem quite a
> bit more controversial than putting everything in the hstore
> extension.

Well, "controversy" is just a way of saying there are people who don't like the idea, and I get that. But I don't see the basis for the dislike.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 07:10:29
Message-ID: 53103665.3050208@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/28/2014 09:02 AM, Christophe Pettus wrote:
> contrib/ is considered a secondary set of features; I routinely get pushback from clients about using hstore because it's not in core, and they are thus suspicious of it. The educational project required to change that far exceeds any technical work we are talking about here.. There's a very large presentational difference between having a feature in contrib/ and in core, at the minimum, setting aside the technical issues (such as the extensions-calling-extensions problem).
>
> We have an existence proof of this already: if there was absolutely no difference between having things being in contrib/ and being in core, full text search would still be in contrib/.

Although presentation was probably the main motivation for moving
full-text search into core, there was good technical reasons for that
too. Full-text search in contrib had a bunch of catalog-like tables to
store the dictionaries etc, and cumbersome functions to manipulate them.
When it was moved into core, we created new SQL commands for that stuff,
which is much clearer. The json doesn't have that; it would be well
suited to be an extension from technical point of view.

(This is not an opinion statement on what I think we should do. I
haven't been following the discussion, so I'm going to just whine
afterwards ;-) )

- Heikki


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 07:15:23
Message-ID: CAM3SWZT=ZGssasyWrNBegaSCPY_ym++h4Yd0GTTLtabE-jJncA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 11:02 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
> On Feb 27, 2014, at 9:59 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> I don't find that very reassuring.
>
> Obviously, we have to try it, and that will decide it.

I don't think that's obvious at all. Anyone is free to spend their
time however they please, but personally I don't think that that's a
wise use of anyone's time.

> contrib/ is considered a secondary set of features; I routinely get pushback from clients about using hstore because it's not in core, and they are thus suspicious of it. The educational project required to change that far exceeds any technical work we are talking about here.. There's a very large presentational difference between having a feature in contrib/ and in core, at the minimum, setting aside the technical issues (such as the extensions-calling-extensions problem).

There are no technical issues of any real consequence in this specific instance.

> We have an existence proof of this already: if there was absolutely no difference between having things being in contrib/ and being in core, full text search would still be in contrib/.

I never said there was no difference, and whatever difference exists
varies considerably, as Heikki points out. I myself want to move
pg_stat_statements to core, for example, for exactly one very specific
reason: so that I can reserve a small amount of shared memory by
default so that it can be enabled without a restart at short notice.

>> You are basically suggesting putting all of hstore in core, because
>> jsonb and hstore are approximately the same thing. That seem quite a
>> bit more controversial than putting everything in the hstore
>> extension.
>
> Well, "controversy" is just a way of saying there are people who don't like the idea, and I get that. But I don't see the basis for the dislike.

Yes, people who have the ability to block the feature entirely. I am
attempting to build consensus by reaching a compromise that weighs
everyone's concerns.

--
Peter Geoghegan


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 07:36:07
Message-ID: 44605179-FAFA-41A6-B46E-AD67E0824846@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 27, 2014, at 11:15 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> I don't think that's obvious at all. Anyone is free to spend their
> time however they please, but personally I don't think that that's a
> wise use of anyone's time.

I believe you are misunderstanding me. If there are actual technical problems or snags to migrating jsonb into core with full operator and index support, then the way we find out is to do the implementation, unless you know of a specific technical holdup already.

> There are no technical issues of any real consequence in this specific instance.

There was no technical reason that json couldn't have been an extension, either, but there were very compelling presentational reasons to have it in core. jsonb has exactly the same presentational issues.

> Yes, people who have the ability to block the feature entirely. I am
> attempting to build consensus by reaching a compromise that weighs
> everyone's concerns.

The thing I still haven't heard is why jsonb in core is a bad idea, except that it is too much code. Is that the objection?

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 07:54:47
Message-ID: CAM3SWZSsM0LVrmz2jX=qK4i183F7OBxVjQX7Ejf+pmqAGrJywg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 11:36 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
> There was no technical reason that json couldn't have been an extension, either, but there were very compelling presentational reasons to have it in core. jsonb has exactly the same presentational issues.

There were also no compelling reasons why json should have been an
extension. The two situations are not at all comparable. In any case,
no author of this patch has proposed any solution to the casting
problems described with using the jsonb type with the new (and
existing) hstore operators. For that reason, I won't comment further
on this until I hear a more concrete proposal.

>> Yes, people who have the ability to block the feature entirely. I am
>> attempting to build consensus by reaching a compromise that weighs
>> everyone's concerns.
>
> The thing I still haven't heard is why jsonb in core is a bad idea, except that it is too much code. Is that the objection?

I suspect that it's going to be considered odd to have code in core
that considers compatibility with earlier versions of hstore, back
when it was an extension, with calling stub functions, for one thing.
Having hstore be almost but not quite in core may be seen as a
contortion. Is that really the conversation you'd prefer to have at
this late stage? In any case, as I say, if that's the patch that
Andres or Oleg or Teodor really want to submit, then by all means let
them submit it. I maintain that the *current* state of affairs, where
jsonb isn't sure if it's in core or is an extension will not fly.

--
Peter Geoghegan


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 08:01:25
Message-ID: 20140228080125.GG15628@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-27 23:54:47 -0800, Peter Geoghegan wrote:
> In any case, as I say, if that's the patch that Andres or Oleg or
> Teodor really want to submit, then by all means let them submit it.

Just to make that clear, I am not one of the authors, I just did a
couple of light review passes.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 08:03:48
Message-ID: CAM3SWZT5VtB=3kgDrj3-vWzrZ360ZyjR6-teUQ4rgx_eeo9X0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 12:01 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-02-27 23:54:47 -0800, Peter Geoghegan wrote:
>> In any case, as I say, if that's the patch that Andres or Oleg or
>> Teodor really want to submit, then by all means let them submit it.
>
> Just to make that clear, I am not one of the authors, I just did a
> couple of light review passes.

Sorry, that was a typo. I meant Andrew.

--
Peter Geoghegan


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 08:12:28
Message-ID: 20140228081228.GH15628@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-27 15:06:33 -0500, Andrew Dunstan wrote:
> You realize that this API dates from 9.3 and has been used in numerous
> extensions, right? So the names are pretty well fixed, for good or ill.

Sure. Doesn't prevent adding a couple more comments tho. I've only
noticed this because I opened the header as a reference when reading
your patch. Anyway, do something based on that feedback or not, your
choice ;)

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Thom Brown <thom(at)linux(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 12:19:32
Message-ID: CAA-aLv7OXt9f6wjgo1EH2sj5vNv3gx_bf6+1zdnAw4ZZuF6e_g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28 February 2014 08:12, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On 2014-02-27 15:06:33 -0500, Andrew Dunstan wrote:
> > You realize that this API dates from 9.3 and has been used in numerous
> > extensions, right? So the names are pretty well fixed, for good or ill.
>
> Sure. Doesn't prevent adding a couple more comments tho. I've only
> noticed this because I opened the header as a reference when reading
> your patch. Anyway, do something based on that feedback or not, your
> choice ;)
>

Can I ask why I can do this:

SELECT review %> 'product'->'title' as product_title
FROM rating;

But I can't do this:

SELECT review->'product'->'title' as product_title
FROM rating;

ERROR: operator does not exist: hstore -> hstore
LINE 1: explain select review -> 'product'::hstore ->'title' as prod...

Yet I can do this:

SELECT review::json->'product'->'title' as product_title
FROM rating;

Yours oblivious,
--
Thom


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 13:01:08
Message-ID: 53108894.5020703@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/28/2014 07:19 AM, Thom Brown wrote:
> On 28 February 2014 08:12, Andres Freund <andres(at)2ndquadrant(dot)com
> <mailto:andres(at)2ndquadrant(dot)com>> wrote:
>
> On 2014-02-27 15:06:33 -0500, Andrew Dunstan wrote:
> > You realize that this API dates from 9.3 and has been used in
> numerous
> > extensions, right? So the names are pretty well fixed, for good
> or ill.
>
> Sure. Doesn't prevent adding a couple more comments tho. I've only
> noticed this because I opened the header as a reference when reading
> your patch. Anyway, do something based on that feedback or not, your
> choice ;)
>
>
> Can I ask why I can do this:
>
> SELECT review %> 'product'->'title' as product_title
> FROM rating;
>
> But I can't do this:
>
> SELECT review->'product'->'title' as product_title
> FROM rating;
>
> ERROR: operator does not exist: hstore -> hstore
> LINE 1: explain select review -> 'product'::hstore ->'title' as prod...
>
> Yet I can do this:
>
> SELECT review::json->'product'->'title' as product_title
> FROM rating;
>
>

I don't think this complaint has anything to do with the text you
quoted, so you've kinda hijacked the thread slightly.

But anyway, I think we've seen enough of these to conclude that the
casts from hstore to jsonb and back should not be implicit. I am fairly
confident that changing that would fix your complaint and the similar
one that Peter Geoghegan had.

cheers

andrew


From: Thom Brown <thom(at)linux(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 13:02:57
Message-ID: CAA-aLv4qAz2H5dcUrP8N-YksqjrE-B2Z10arxjHrWa7yS0OhOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28 February 2014 13:01, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

>
> On 02/28/2014 07:19 AM, Thom Brown wrote:
>
> On 28 February 2014 08:12, Andres Freund <andres(at)2ndquadrant(dot)com <mailto:
>> andres(at)2ndquadrant(dot)com>> wrote:
>>
>> On 2014-02-27 15:06:33 -0500, Andrew Dunstan wrote:
>> > You realize that this API dates from 9.3 and has been used in
>> numerous
>> > extensions, right? So the names are pretty well fixed, for good
>> or ill.
>>
>> Sure. Doesn't prevent adding a couple more comments tho. I've only
>> noticed this because I opened the header as a reference when reading
>> your patch. Anyway, do something based on that feedback or not, your
>> choice ;)
>>
>>
>> Can I ask why I can do this:
>>
>> SELECT review %> 'product'->'title' as product_title
>> FROM rating;
>>
>> But I can't do this:
>>
>> SELECT review->'product'->'title' as product_title
>> FROM rating;
>>
>> ERROR: operator does not exist: hstore -> hstore
>> LINE 1: explain select review -> 'product'::hstore ->'title' as prod...
>>
>> Yet I can do this:
>>
>> SELECT review::json->'product'->'title' as product_title
>> FROM rating;
>>
>>
>>
> I don't think this complaint has anything to do with the text you quoted,
> so you've kinda hijacked the thread slightly.
>

Apologies. I'd just given the patches my first test-drive and replied to
the last message on the thread.

> But anyway, I think we've seen enough of these to conclude that the casts
> from hstore to jsonb and back should not be implicit. I am fairly confident
> that changing that would fix your complaint and the similar one that Peter
> Geoghegan had.

Thanks.
--
Thom


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 14:23:31
Message-ID: 20140228142330.GH2921@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> On Thu, Feb 27, 2014 at 8:07 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> > I'm not advocating authoring two extensions. I am tentatively
> > suggesting that we look at one extension for everything. That may well
> > be the least worst thing.
>
> (Not that it's clear that you imagined I was, but I note it all the same).

Thanks for that clarification- it was useful (for me anyway). For my
2c, while I agree that it would work, I'd still rather see this get into
core for reasons mentioned elsewhere but which I'll echo here- JSON has
become the de-facto data interexchange format on a rather massive scale-
it's become what XML was trying to be, in many ways by being simpler.

While I agree that the comparison to FTS isn't entirely fair, I also
feel that we should still be considering adding new types to core and
not try to push everything out as extensions. To add on to that- I feel
we still have a ways to go before our extension support will be really
*good* (which I certainly hope it to be some day) and I'd rather we not
force that on to the mobs of installations out there who will want
jsonb.

Thanks again,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 14:27:35
Message-ID: CA+Tgmoauw4JTAS94gQzKNx4WCGSYWqHPfsb6O=oGCFvLk4f3+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 27, 2014 at 8:55 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
> On Feb 27, 2014, at 5:31 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> Now, it's confusing that it has to go through hstore, perhaps, but
>> that's hardly all that bad in and of itself.
>
> Yes, it is. It strikes me as irrational to have jsonb depend on hstore. Let's be honest with ourselves: if we were starting over, we wouldn't start by creating our own proprietary hierarchical type and then making the hierarchical type everyone else uses depend on it. hstore exists because json didn't. But json does now, and we shouldn't create a jsonb dependency on hstore.

Right. I think this is one of the smartest things that anyone has
said on this thread. I don't have any objection to the idea of
enhancing hstore to support hierarchical data; I completely understand
the appeal of such a change. Nor do I have any objection to the idea
of a binary-json type in core (or out of core); there are obvious uses
for such a thing.

But what's happened here is not the sum of those two admirable
proposals. hstore has been augmented not only to support hierarchical
data but also with a notion of typed data that matches that of JSON
(except that I think the hstore and jsonb patches may have slightly
different notions as to what constitutes a valid number). The
internal format for jsonb has been contrived to match the
upward-compatible format designed for JSON. And thus jsonb depends on
hstore for the functionality that it isn't able to provide for itself.

Taken individually, none of those decisions seem crazy, but taken
together it's pretty weird. Instead of inventing a new type (jsonb)
designed from the ground up to do what we want, we're, well, we're
doing what Christophe says: creating our own proprietary hierarchical
type and then making the hierarchical type everyone else uses depend
on it. Described in those terms, it's hard for me to believe that
anyone here thinks that's not a strange thing to do.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 14:50:00
Message-ID: 5310A218.2080604@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/28/2014 09:27 AM, Robert Haas wrote:
> On Thu, Feb 27, 2014 at 8:55 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
>> On Feb 27, 2014, at 5:31 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>>> Now, it's confusing that it has to go through hstore, perhaps, but
>>> that's hardly all that bad in and of itself.
>> Yes, it is. It strikes me as irrational to have jsonb depend on hstore. Let's be honest with ourselves: if we were starting over, we wouldn't start by creating our own proprietary hierarchical type and then making the hierarchical type everyone else uses depend on it. hstore exists because json didn't. But json does now, and we shouldn't create a jsonb dependency on hstore.
> Right. I think this is one of the smartest things that anyone has
> said on this thread. I don't have any objection to the idea of
> enhancing hstore to support hierarchical data; I completely understand
> the appeal of such a change. Nor do I have any objection to the idea
> of a binary-json type in core (or out of core); there are obvious uses
> for such a thing.
>
> But what's happened here is not the sum of those two admirable
> proposals. hstore has been augmented not only to support hierarchical
> data but also with a notion of typed data that matches that of JSON
> (except that I think the hstore and jsonb patches may have slightly
> different notions as to what constitutes a valid number). The
> internal format for jsonb has been contrived to match the
> upward-compatible format designed for JSON. And thus jsonb depends on
> hstore for the functionality that it isn't able to provide for itself.
>
> Taken individually, none of those decisions seem crazy, but taken
> together it's pretty weird. Instead of inventing a new type (jsonb)
> designed from the ground up to do what we want, we're, well, we're
> doing what Christophe says: creating our own proprietary hierarchical
> type and then making the hierarchical type everyone else uses depend
> on it. Described in those terms, it's hard for me to believe that
> anyone here thinks that's not a strange thing to do.
>

Well, the time to make these sorts of decisions would have been back in
November. The direction was clear then, if you were paying attention.
But right from the time this came up at pgcon the idea was to leverage
Oleg and Teodor's work. Nobody found it strange then. I'm rather
confused about why it's suddenly strange now.

What you're essentially arguing for is the invention of TWO binary
treeish things, without any argument I have yet seen advanced about why
the first one we have is good enough for hstore but not good enough for
jsonb.

Frankly, it looks to me like you have turned Christophe's argument on
its head, or completely misunderstood his point.

cheers

andrew


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 14:57:31
Message-ID: 20140228145731.GJ2921@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> Taken individually, none of those decisions seem crazy, but taken
> together it's pretty weird. Instead of inventing a new type (jsonb)
> designed from the ground up to do what we want, we're, well, we're
> doing what Christophe says: creating our own proprietary hierarchical
> type and then making the hierarchical type everyone else uses depend
> on it. Described in those terms, it's hard for me to believe that
> anyone here thinks that's not a strange thing to do.

I was taking a slightly different perspective on it, though the devil is
almost certainly in the details. I'll be the first to admit that I've
not looked in detail at the patch either and so I've been trying to
avoid commenting on implementation specifics, but I was seeing this from
the perspective that we are building a single hierarchical typed data
store and then providing two interfaces to it. The way we're getting
there is a little awkward, in hindsight, and we'd like to have backwards
compatibility for one of the interfaces (and its on-disk storage), but
I'm not entirely sure that we'd actually end up in a different place
when we reach the end of this road.

Had we implemented jsonb first and then added hstore to it, would much
be different from the result we're getting here beyond the names of the
functions and the backwards-compatibility for the older on-disk format?
Are we really paying a high cost to support that older format?

The specific issues mentioned on this thread look more like bugs to be
addressed or additional operators which need to be implemented for
jsonb (imv, that should really be done for 9.4, but we have this
deadline looming...) along with perhaps dropping the implicit cast
between json and hstore (is there really a need for it..?).

Thanks,

Stephen


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 15:39:08
Message-ID: 5310AD9C.5060808@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/27/2014 11:02 PM, Christophe Pettus wrote:
>
>
> On Feb 27, 2014, at 9:59 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>
>> I don't find that very reassuring.
>
> Obviously, we have to try it, and that will decide it.
>
>> I don't understand why an extension is seen as not befitting
>> of a more important feature.
>
> contrib/ is considered a secondary set of features; I routinely get pushback from clients about using hstore because it's not in core, and they are thus suspicious of it. The educational project required to change that far exceeds any technical work we are talking about here.. There's a very large presentational difference between having a feature in contrib/ and in core, at the minimum, setting aside the technical issues (such as the extensions-calling-extensions problem).
>
> We have an existence proof of this already: if there was absolutely no difference between having things being in contrib/ and being in core, full text search would still be in contrib/.

This is an old and currently false argument. It is true that once upon a
time, contrib was a banished heart, weeping for the attention of a true
prince. Now? Not so much. She is a full on passion flower with the
princes of all the kingdoms wanting her attention.

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
a rose in the deeps of my heart. - W.B. Yeats


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 16:13:26
Message-ID: CAHyXU0xXiM+0LOkOsbZ6eyDJgqFdD=9Lgh_1FawLqRj+OV2fig@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 8:57 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> The specific issues mentioned on this thread look more like bugs to be
> addressed or additional operators which need to be implemented for
> jsonb (imv, that should really be done for 9.4, but we have this
> deadline looming...) along with perhaps dropping the implicit cast
> between json and hstore (is there really a need for it..?).

Bugs/bad behaviors should be addressed (which AFAICT are mostly if not
all due to implicit casts). "Missing" operators OTOH are should not
hold up the patch, particuarly when the you have the option of an
explicit cast to hstore if you really want them.

Notwithstanding some of the commentary above, some of jsonb features
(in particular, the operators) are quite useful and should find
regular usage (json has them also, but jsonb removes the performance
penalty). The upshot is that with the current patch you have to do a
lot of casting to get 100% feature coverage and that future
improvements to jsonb will remove the necessity of that. Also the
hstore type will be required to do anything approximating the nosql
pattern.

I don't think the extension issue is a deal breaker either way. While
I have a preference for extensions generally, this is nothing personal
to jsonb. And if we can't come to a consensus on that point the patch
should be accepted on precedent (json being in core).

merlin


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 16:49:48
Message-ID: 5310BE2C.90402@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>> + v.size += VARSIZE_ANY(v.numeric) +sizeof(JEntry) /* alignment */ ;
>> Why does + sizeof(JEntry) change anything about alignment? If it was
>> aligned before, adding a statically sized value doesn't give any new
>> guarantees about alignment?
> Teodor, please comment.

Because numeric type will be copied into jsonb value. And we need to keep
alignment inside jsonb value. The same is true for nested jsonb (array or object).

>>> + type = JsonbIteratorGet(&it, &v, false);
>>> + if (type == WJB_VALUE)
>>> + {
>>> + first = false;
>>> + putEscapedValue(out, &v);
>>> + }
>>> + else
>>> + {
>>> + Assert(type == WJB_BEGIN_OBJECT || type ==
>>> WJB_BEGIN_ARRAY);
>>> + /*
>>> + * We need to rerun current switch() due to put
>>> + * in current place object which we just got
>>> + * from iterator.
>>> + */
>> "due to put"?
>
>
> I think that's due to the author not being a native English speaker. I've tried
> to improve it a bit.
>
> Teodor, please comment if you like.

Pls, fix my English. I mean if we got first element of array/object and it isn't
a scalar value we should do actions pointed by case
WJB_BEGIN_OBJECT/WJB_BEGIN_ARRAY in the same switch without calling iterator.

> Teodor, please examine and comment on all comments below this point.

>>> +JsonbValue *
>>> +findUncompressedJsonbValueByValue(char *buffer, uint32 flags,
>>> + uint32 *lowbound, JsonbValue *key)
>>> +{
>> Functions like this *REALLY* need documentation for their
>> parameters. And of their actual purpose.
>>
>> What's actually the uncompressed bit here? Isn't it actually the
>> contrary? This is navigating the compressed, non-tree form, no?

Functions returns value in JsonbValue form (uncompressed, not just a pointer).
For object it performs search by key and returns corresponding value, for array
- returns matched value. If lowbound is not null then it will be set into
array/object position of found value. And search will be started from *lowbound
position in array/object. That allows some optimizations.

Buffer is a pointer to header of jsonb value. After pointer there is an array of
JEntry and following list of values of array or key and values of object. This
is internal representation for jsonb or hstore without varlena header.
Nested array/object have the same representation. Flags points desired search -
in array or object. For example, If buffer contains array and flags has only
JB_FLAG_OBJECT then function returns NULL.

>>> + else if (flags & JB_FLAG_OBJECT & header)
>>> + {
>>> + JEntry *array = (JEntry *) (buffer + sizeof(header));
>>> + char *data = (char *) (array + (header & JB_COUNT_MASK) * 2);
>>> + uint32 stopLow = lowbound ? *lowbound : 0,
>>> + stopHigh = (header & JB_COUNT_MASK),
>>> + stopMiddle;
>> I don't understand what the point of the lowbound logic could be here?
>> If a key hasn't been found, it hasn't been found? Maybe the idea is to
>> use it when testing containedness or somesuch? Wouldn't iterating over
>> the keyspace be a better idea for that case?
If we has keys (a,b,c,d,e,f,g) and need to search keys e and f, then for second
search we could do in in subset of keys (f,g), we don't need to search in full
set of keys. The idea was introduced in hstoreFindKey() in hstore V2.

>>
>>> + if (key->type != jbvString)
>>> + return NULL;
>> That's not allowed, right?

Right. it should be an Assert or ERROR.

>>
>>> +/*
>>> + * Get i-th value of array or hash. if i < 0 then it counts from
>>> + * the end of array/hash. Note: returns pointer to statically
>>> + * allocated JsonbValue.
>>> + */
>>> +JsonbValue *
>>> +getJsonbValue(char *buffer, uint32 flags, int32 i)
>>> +{
>>> + uint32 header = *(uint32 *) buffer;
>>> + static JsonbValue r;
>> Really? And why on earth would static allocation be a good idea? Specify
>> it on the caller's stack if need be. Or even return by value, today's
>> calling convention will just allocate that on the caller's stack without
>> copying.
>> Accessing static data isn't even faster.

Just to prevent multiple palloc(). Could be changed, I don't insist. I saw
problems with a lot of small allocations but didn't see such problems with
static allocations.

>>
>>> + if (JBE_ISSTRING(*e))
>>> + {
>>> + r.type = jbvString;
>>> + r.string.val = data + JBE_OFF(*e);
>>> + r.string.len = JBE_LEN(*e);
>>> + r.size = sizeof(JEntry) + r.string.len;
>>> + }
>>> + else if (JBE_ISBOOL(*e))
>>> + {
>>> + r.type = jbvBool;
>>> + r.boolean = (JBE_ISBOOL_TRUE(*e)) ? true : false;
>>> + r.size = sizeof(JEntry);
>>> + }
>>> + else if (JBE_ISNUMERIC(*e))
>>> + {
>>> + r.type = jbvNumeric;
>>> + r.numeric = (Numeric) (data + INTALIGN(JBE_OFF(*e)));
>>> +
>>> + r.size = 2 * sizeof(JEntry) + VARSIZE_ANY(r.numeric);
>>> + }
>>> + else if (JBE_ISNULL(*e))
>>> + {
>>> + r.type = jbvNull;
>>> + r.size = sizeof(JEntry);
>>> + }
>>> + else
>>> + {
>>> + r.type = jbvBinary;
>>> + r.binary.data = data + INTALIGN(JBE_OFF(*e));
>>> + r.binary.len = JBE_LEN(*e) - (INTALIGN(JBE_OFF(*e)) - JBE_OFF(*e));
>>> + r.size = r.binary.len + 2 * sizeof(JEntry);
>>> + }
>> This bit of code exists pretty similarly in several places, maybe consolitate?
findUncompressedJsonbValueByValue(), getJsonbValue() and formAnswer(). But one
has inconvenient difference with skipNested flag. Ok, will fix.

>>
>>> +/****************************************************************************
>>> + * Walk on tree representation of
>>> jsonb *
>>> + ****************************************************************************/
>>> +static void
>>> +walkUncompressedJsonbDo(JsonbValue *v, walk_jsonb_cb cb, void *cb_arg,
>>> uint32 level)
>>> +{
>>> + int i;
>> check stack limit.
>>
>>> +void
>>> +walkUncompressedJsonb(JsonbValue *v, walk_jsonb_cb cb, void *cb_arg)
>>> +{
>>> + if (v)
>>> + walkUncompressedJsonbDo(v, cb, cb_arg, 0);
>>> +}
>>> +
>>> +/****************************************************************************
>>> + * Iteration over binary
>>> jsonb *
>>> + ****************************************************************************/
>> This needs docs.
>>
>>> +static void
>>> +parseBuffer(JsonbIterator *it, char *buffer)
>>> +{
>> Why invent completely independent naming conventions to the previous
>> functions here?
Suggest it.

>>
>>> +static bool
>>> +formAnswer(JsonbIterator **it, JsonbValue *v, JEntry * e, bool skipNested)
>>> +{
>> Imaginatively undescriptive name. But if it were slightly more more
>> abstracted away from JsonbIterator it could be the answer to my prayers
>> above about removing redundant code.
formAnswerOfJsonbIteratorGet()?

>>
>>> +static JsonbIterator *
>>> +up(JsonbIterator *it)
>>> +{
>> Not a good name.
...

>>
>>> +int
>>> +JsonbIteratorGet(JsonbIterator **it, JsonbValue *v, bool skipNested)
>>> +{
>>> + int res;
>> recursive, stack depth check.
fixed.

>>
>>> + switch ((*it)->type | (*it)->state)
>>> + {
>>> + case JB_FLAG_ARRAY | jbi_start:
>> I don't know, but I don't see the point in avoid if (), else if()
>> ... constructs if it requires such dirty tricks.
Will be:
if ((*it)->type == JB_FLAG_ARRAY && (*it)->state == jbi_start)

A bit slower and I don't feel that switch is more worse. But I don't insist.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 17:03:14
Message-ID: EE303BFE-886F-4DB7-B6D5-E69D30AFDF50@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 28, 2014, at 6:27 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Taken individually, none of those decisions seem crazy, but taken
> together it's pretty weird. Instead of inventing a new type (jsonb)
> designed from the ground up to do what we want, we're, well, we're
> doing what Christophe says: creating our own proprietary hierarchical
> type and then making the hierarchical type everyone else uses depend
> on it. Described in those terms, it's hard for me to believe that
> anyone here thinks that's not a strange thing to do.

A lot of it is that we're getting really tied up in knots about terminology. Because of the history of the project, it's being approached as "jsonb depends on hstore2", rather than, "We need a binary format, BSON won't cut it, but hstore2 is creating one, so let's use the same for both to avoid duplication of effort."

Put that last way, it's a more sensible decision. My specific concern was "Well, if you want binary json, install hstore" is a very strange presentation to give to customers. Many of the user-facing objections can be solved just by removing the implicit cast from jsonb to hstore, and the remaining operators (if they don't make it into this patch) can be added over time.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 17:55:31
Message-ID: 5310CD93.1050000@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/28/2014 07:39 AM, Joshua D. Drake wrote:
>
> This is an old and currently false argument. It is true that once upon a
> time, contrib was a banished heart, weeping for the attention of a true
> prince. Now? Not so much. She is a full on passion flower with the
> princes of all the kingdoms wanting her attention.

That's one of the more colorful metaphors ever posted on this list. I
don't think we've had language like that since Hitoshi stopped being
active. ;-)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Thom Brown <thom(at)linux(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 18:43:13
Message-ID: CAM3SWZTer_1LstyZ2Exw1+2nbMMMUnFxuV8qtj3uZnn6uFmn8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 5:01 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> But anyway, I think we've seen enough of these to conclude that the casts
> from hstore to jsonb and back should not be implicit. I am fairly confident
> that changing that would fix your complaint and the similar one that Peter
> Geoghegan had.

Yes, it will, but I think that that will create more problems than it
will solve (which is not to suggest that an implicit cast is the right
thing). That will require that any non-trivial usage of jsonb requires
copious casting, where nested hstore does not. The hstore module
hardly contains some nice extras that a minority of jsonb users will
be interested in. It contains among other basic things, operator
classes required to index jsonb. All of my examples will still not
work, plus a bunch of cases that currently do work reasonably well.
There'll just be a different error message.

--
Peter Geoghegan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Thom Brown <thom(at)linux(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 19:00:32
Message-ID: 15614.1393614032@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Geoghegan <pg(at)heroku(dot)com> writes:
> On Fri, Feb 28, 2014 at 5:01 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> But anyway, I think we've seen enough of these to conclude that the casts
>> from hstore to jsonb and back should not be implicit. I am fairly confident
>> that changing that would fix your complaint and the similar one that Peter
>> Geoghegan had.

> Yes, it will, but I think that that will create more problems than it
> will solve (which is not to suggest that an implicit cast is the right
> thing). That will require that any non-trivial usage of jsonb requires
> copious casting, where nested hstore does not. The hstore module
> hardly contains some nice extras that a minority of jsonb users will
> be interested in. It contains among other basic things, operator
> classes required to index jsonb. All of my examples will still not
> work, plus a bunch of cases that currently do work reasonably well.
> There'll just be a different error message.

We should have learned by now that implicit casts are generally pretty
dangerous things. I think putting in implicit casts as a band-aid for
missing functionality is a horrid idea that we'll regret for a long
time to come. I gather from upthread comments that the patch currently
actually creates implicit casts in *both* directions? That's doubly
horrid/dangerous.

The more I read in this thread, the more I think that jsonb simply
isn't ready. We should put it off to 9.5 so that we can have a
complete implementation without so many rough edges. I'm afraid that
if we ship it as-is, backwards compatibility considerations are going
to prevent us from filing down the rough edges in future.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 19:12:46
Message-ID: 5310DFAE.7010607@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/28/2014 06:27 AM, Robert Haas wrote:
> Taken individually, none of those decisions seem crazy, but taken
> together it's pretty weird. Instead of inventing a new type (jsonb)
> designed from the ground up to do what we want, we're, well, we're
> doing what Christophe says: creating our own proprietary hierarchical
> type and then making the hierarchical type everyone else uses depend
> on it. Described in those terms, it's hard for me to believe that
> anyone here thinks that's not a strange thing to do.

It certainly seems like a strange thing to do to *me*. However, Oleg
and Teodor were doing the heavy lifting on the heirarchical type -- we
wouldn't even be talking about jsonb without it -- and they were very
negative to JSON. As with many things, this reminds me of a story.

When the BART subway system was being built for the Bay Area in 1970,
the tunnel was planned to go straight under a particular hardware store
in Berkeley. The hardware store owner was convinced that the
construction would destroy his building and his business, and hit the
state with lawsuit after lawsuit to stop the construction. Eventually,
CALtrans caved and added a curve in that section of the BART tunnel to
go around the location of the hardware store. Forty years later, the
hardware store owner is dead, the hardware store is gone (was gone, in
fact, by 1978), but the curve is still there. And that curve forces
BART trains to slow down by 25mph in a spot which is fairly central to
the whole BART system, thus reducing the overall max capacity of the
entire subway system by 10-15%, and thus making thousands of people a
day late for work for the past 40 years.

I think Robert and Christophe are right: we're building a Berkeley BART
Curve. I think there's two courses of action from here which make sense:

A) We move *all* of the important HStore libraries and operators into
core, and make the hstore extension of them just a mapping of what are
essentially jsonb operators to the hstore type (Christophe's suggestion).

B) We make hstore/jsonb a single extension with two types and all of the
requisite operators etc. (Peter's suggestion).

Reasons for (A):

* In-core jsonb would have strongly enhanced adoption value
* jsonb is liable to become one of our most-used types and it would be
strange for it not to be in core
* binary JSON would "just work" for web developers
* the only reason nested hstore is an extension is because hstore was an
extension and we need an upgrade path
* This is essentially the decision we collectively made in November, for
fairly well-argued reasons, and what Andrew has spent 3 months implementing.

Reasons against (A):

* Having a core type and an extension share code is strange.
* Implicit casts between a core type and an extension could cause issues.

Reasons for (B):

* Conceptually simpler.
* Makes a certain degree of bugginess/unfinishedness more acceptable.

Reasons against (B):

* Users would get tripped up by "first, install the postgresql-contrib
package, then do CREATE EXTENSION hstore"
* As cited, many sysadmins block the install of the -contrib package.
* Performance issues for psycopg2 and other drivers which need to look
up type information on each call.
* This requires larger changes to the existing patch, which likely means
missing the bus for 9.4 (and you've seen my blog about that)

Here's the point in particular which makes me very hesitant to endorse
(B) as a solution:

* Once created as an extension, there is no path to ever making jsonb a
core type.

... as long as we have no way to ever move types between extensions and
core, any decision we make on where a type belongs is permanent. This
is one of the things which Andrew's proposal of a Type Registry last
year was intended to solve, but -hackers soundly rejected that proposal,
so we're currently stuck in the proverbial polluted estuary.

My cause, as everyone knows, is adoption. Given that, I'm pretty
strongly in favor of proposal (A); I think a jsonb type which "just
works" will drive twice the adoption that one you have to remember to
install does.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Greg Stark <stark(at)mit(dot)edu>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 19:19:20
Message-ID: CAM-w4HMEk3NAtErba+_v46Zvz=-Q6Zo=jFUnTcGu3UrteZij=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 7:12 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> * As cited, many sysadmins block the install of the -contrib package.

Of course the more you put things in core the more you make this logic
sound reasonable.

--
greg


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Thom Brown <thom(at)linux(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 19:45:29
Message-ID: 5310E759.4060708@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/28/2014 02:00 PM, Tom Lane wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> writes:
>> On Fri, Feb 28, 2014 at 5:01 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> But anyway, I think we've seen enough of these to conclude that the casts
>>> from hstore to jsonb and back should not be implicit. I am fairly confident
>>> that changing that would fix your complaint and the similar one that Peter
>>> Geoghegan had.
>> Yes, it will, but I think that that will create more problems than it
>> will solve (which is not to suggest that an implicit cast is the right
>> thing). That will require that any non-trivial usage of jsonb requires
>> copious casting, where nested hstore does not. The hstore module
>> hardly contains some nice extras that a minority of jsonb users will
>> be interested in. It contains among other basic things, operator
>> classes required to index jsonb. All of my examples will still not
>> work, plus a bunch of cases that currently do work reasonably well.
>> There'll just be a different error message.
> We should have learned by now that implicit casts are generally pretty
> dangerous things. I think putting in implicit casts as a band-aid for
> missing functionality is a horrid idea that we'll regret for a long
> time to come. I gather from upthread comments that the patch currently
> actually creates implicit casts in *both* directions? That's doubly
> horrid/dangerous.

I agree. I have removed them in my current tree.

>
> The more I read in this thread, the more I think that jsonb simply
> isn't ready. We should put it off to 9.5 so that we can have a
> complete implementation without so many rough edges. I'm afraid that
> if we ship it as-is, backwards compatibility considerations are going
> to prevent us from filing down the rough edges in future.
>
>

Well, the jsonb portion of this is arguably the most ready, certainly
it's had a lot more on-list review.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 19:46:18
Message-ID: 16721.1393616778@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> ... This requires larger changes to the existing patch, which likely means
> missing the bus for 9.4 (and you've seen my blog about that)

Yeah. I realize you're gung-ho about getting jsonb into 9.4 in some
form, and I recognize that getting better JSON support is important.
But I wonder how carefully you've thought about the damage it'll do
if what ships in 9.4 is a weird, hard-to-use mishmash. I'd much
rather see us take the time to get it right than to ship something
that's basically a kluge. And having a core type that depends on
an extension for critical functionality is certainly nothing but a
kluge. As an example, you're arguing that some sysadmins won't permit
installation of contrib modules. (Let's pass over the question of
how true or sane that is.) If they won't allow hstore to be installed,
and jsonb is crippled in consequence, where does that put us for
adoption purposes? I'd argue that it's worse than not shipping jsonb
yet at all.

regards, tom lane


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 19:57:15
Message-ID: CAM3SWZRJT3foTGG10ctPFWCqrnTpPY2nXf6=KjsumJ-6oV37Zg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 11:12 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> I think Robert and Christophe are right: we're building a Berkeley BART
> Curve. I think there's two courses of action from here which make sense:
>
> A) We move *all* of the important HStore libraries and operators into
> core, and make the hstore extension of them just a mapping of what are
> essentially jsonb operators to the hstore type (Christophe's suggestion).
>
> B) We make hstore/jsonb a single extension with two types and all of the
> requisite operators etc. (Peter's suggestion).

I agree with that dichotomy. I pointed this out a couple of times
already. I think the only reasonable way to deal with the casting
problems are to have parallel sets of operators and functions for
each, and to do that you really need one of those two things.

> Reasons against (B):
>
> * This requires larger changes to the existing patch, which likely means
> missing the bus for 9.4 (and you've seen my blog about that)

This seems very dubious. I highly doubt it. A big part of the reason
why I favor (B) is because I think just the opposite. Tom's remarks
just now are consistent with that.

> My cause, as everyone knows, is adoption. Given that, I'm pretty
> strongly in favor of proposal (A); I think a jsonb type which "just
> works" will drive twice the adoption that one you have to remember to
> install does.

I don't think that's true. I used to work as a consultant, and I had a
number of fairly conservative clients. I don't ever recall there being
a restriction on installing a contrib package. If indeed any DBA does
operate under such a restrictive regime, then that's probably not the
kind of user that this feature is for anyway.

--
Peter Geoghegan


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Thom Brown <thom(at)linux(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 20:01:13
Message-ID: 20140228200113.GA7874@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-02-28 14:45:29 -0500, Andrew Dunstan wrote:
> Well, the jsonb portion of this is arguably the most ready, certainly it's
> had a lot more on-list review.

Having crossread both patches I tend to agree with this. I don't think
it's unrealistic to get jsonb committable, but the hstore bits are
another story.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 20:01:39
Message-ID: 5310EB23.2060807@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/28/2014 02:46 PM, Tom Lane wrote:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> ... This requires larger changes to the existing patch, which likely means
>> missing the bus for 9.4 (and you've seen my blog about that)
> Yeah. I realize you're gung-ho about getting jsonb into 9.4 in some
> form, and I recognize that getting better JSON support is important.
> But I wonder how carefully you've thought about the damage it'll do
> if what ships in 9.4 is a weird, hard-to-use mishmash. I'd much
> rather see us take the time to get it right than to ship something
> that's basically a kluge. And having a core type that depends on
> an extension for critical functionality is certainly nothing but a
> kluge. As an example, you're arguing that some sysadmins won't permit
> installation of contrib modules. (Let's pass over the question of
> how true or sane that is.) If they won't allow hstore to be installed,
> and jsonb is crippled in consequence, where does that put us for
> adoption purposes? I'd argue that it's worse than not shipping jsonb
> yet at all.
>
>

That hasn't been the way we've done things in the past. We're frequently
incremental. New features sometimes take several releases to mature.
Taking an example from close by, this will be the third release with
Json, and it's got a bunch of spiffy new stuff, but there's at least one
more round to go (what Merlin calls Manipulation functions), which I'm
rather hopeing someone other than me will see fit to implement.

As for what Peter suggests, I just can't bring myself to do anything
that would require people to say "Oh, you want jsonb? You have to load
hstore." It would be plain embarrassing.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 20:19:41
Message-ID: 17642.1393618781@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> That hasn't been the way we've done things in the past. We're frequently
> incremental. New features sometimes take several releases to mature.

That's perfectly fair. What I don't want to see is a user-visible
dependency from jsonb to hstore. I think that'll be a mess that will
take years to undo. I'd rather say "sorry, that functionality isn't
there yet for jsonb" than have such a dependency.

Maybe we're in violent agreement.

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Thom Brown <thom(at)linux(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 20:29:24
Message-ID: CAHyXU0xcaHqHCUdWkY3vzKN0cR7Q6TpFvx7=4Wv3n7=Tch58CA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 1:45 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Well, the jsonb portion of this is arguably the most ready, certainly it's
> had a lot more on-list review.

That is definitely true. Also, the jsonb type does not introduce any
new patterns that are not already covered by json -- it just does some
things better/faster (and, in a couple of cases, a bit differently) so
there's a safe harbor. The implicit casts snuck in after the review
started -- that was a mistake obviously (but mostly with hstore). The
side argument of 'to extension or not' is just that. Make a decision
and commit this thing.

merlin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 21:03:02
Message-ID: 5310F986.3040505@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/28/2014 11:19 AM, Greg Stark wrote:
> On Fri, Feb 28, 2014 at 7:12 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> * As cited, many sysadmins block the install of the -contrib package.
>
> Of course the more you put things in core the more you make this logic
> sound reasonable.

Touche'!

However, the problems with admins not wanting to install -contrib aren't
really about what's in or not in -contrib. They're about:

a) it's another package
b) they don't understand what's in it
c) it's called "contrib" which implies that these are
untested/unreviewed scripts, or somehow relates to hacking on Postgres,
both of which were true historically
d) there's some wierd/unstable dependancies for certain contrib modules
(UUID in particular)
e) some vendors don't make contrib available because of the encryption
thing (pgcrypto)

All of the above are worth fixing, but we don't have a proposal on the
table to do so.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 21:25:34
Message-ID: 5310FECE.6020404@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/28/2014 03:19 PM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> That hasn't been the way we've done things in the past. We're frequently
>> incremental. New features sometimes take several releases to mature.
> That's perfectly fair. What I don't want to see is a user-visible
> dependency from jsonb to hstore. I think that'll be a mess that will
> take years to undo. I'd rather say "sorry, that functionality isn't
> there yet for jsonb" than have such a dependency.
>
> Maybe we're in violent agreement.
>
>

Maybe we are.

There's actually no real dependency. In fact, the dependency is the
other way. The jsonb patches I have been posting could be committed and
pass every regression test and we'd have useful better performance for
some operations. Every json function has an analog in jsonb except the
generator functions (to_json and friends), and they use the same parser
so they accept exactly the same input. The only "dependency" is if you
want to be able to use some advanced indexing and other functionality,
for which we don't currently have jsonb equivalents of the new hstore
operators, because we ran out of time. Then you can get this
functionality by casting the data to hstore (assuming we also have
nested-hstore committed) and using its operators. But that's no more a
dependency than it is for any other type for which you can leverage this
functionality (e.g. any record type).

cheers

andrew


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 21:31:45
Message-ID: D9311AD2-BEB2-4A3A-AEE0-16EE2933D882@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 28, 2014, at 1:03 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> However, the problems with admins not wanting to install -contrib aren't
> really about what's in or not in -contrib.

I'll also mention that an increasingly large number of people are running PostgreSQL in an environment where they don't get to pick what packages are installed on their server (RDS, for example). Telling them that something is in -contrib can very well be telling them "You can't have it" in those cases.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 21:34:29
Message-ID: CAM3SWZQovWuE9A+7ZNgY9eW9R2ZvAgfAO3gEVB4EaVddtj05Kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 1:31 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
> I'll also mention that an increasingly large number of people are running PostgreSQL in an environment where they don't get to pick what packages are installed on their server (RDS, for example). Telling them that something is in -contrib can very well be telling them "You can't have it" in those cases.

Amazon RDS Postgres has hstore.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 21:35:31
Message-ID: CAM3SWZRskhLCqLvQxUWLd5b_vyfOsFY+u8ssnA9g+nBCjB0eYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 1:25 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> The only "dependency" is if you want to be able to use some advanced
> indexing and other functionality, for which we don't currently have jsonb
> equivalents of the new hstore operators, because we ran out of time. Then
> you can get this functionality by casting the data to hstore (assuming we
> also have nested-hstore committed) and using its operators. But that's no
> more a dependency than it is for any other type for which you can leverage
> this functionality (e.g. any record type).

I don't think hstore-style indexing is "advanced"; it's the main
reason for there being a jsonb, in my view. Anyway, this is where I
have a hard time understanding what you intend for jsonb for 9.4. You
ran out of time for writing jsonb operator classes, and so you can use
the hstore ones, which work fine. But, if we didn't run out of time,
how would the jsonb operator classes differ from the hstore ones? Is
there something inferior about the hstore operator class as compared
to a hypothetical jsonb operator class, other than the superficial
need to cast?

--
Peter Geoghegan


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 21:38:06
Message-ID: E474B628-E7C1-4BA0-A376-65DCD1135CCD@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 28, 2014, at 1:35 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> I don't think hstore-style indexing is "advanced"; it's the main
> reason for there being a jsonb, in my view.

jsonb is significantly faster than json even without indexing; there are plenty of reasons to have jsonb even if we don't initially have indexing operations for it.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 21:39:23
Message-ID: 18765A60-4DF1-4CD7-BB85-D45662B34372@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 28, 2014, at 1:34 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> Amazon RDS Postgres has hstore.

Just observing that putting something in -contrib does not mean every installation can automatically adopt it.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 22:12:48
Message-ID: CAM3SWZSAGjQM8szL3XOetUNHQd5fcbQzo4L2B5yXjXjEeAuFtg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 1:38 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
> jsonb is significantly faster than json even without indexing; there are plenty of reasons to have jsonb even if we don't initially have indexing operations for it.

That may be true, although I think that that's still very disappointing.

In order to make a rational decision to do the work incrementally, we
need to know what we're putting off until 9.5. AFAICT, we have these
operator classes that work fine with jsonb for the purposes of
hstore-style indexing (the hstore operator classes). Wasn't that the
point? When there is a dedicated set of jsonb operator classes, what
will be different about them, other than the fact that they won't be
hstore operator classes? A decision predicated on waiting for those to
come in 9.5 must consider what we're actually waiting for, and right
now that seems very hazy.

--
Peter Geoghegan


From: Christophe Pettus <xof(at)thebuild(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 22:40:27
Message-ID: 81A5AEBB-7E74-463F-8E06-0275F2188228@thebuild.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 28, 2014, at 2:12 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> AFAICT, we have these
> operator classes that work fine with jsonb for the purposes of
> hstore-style indexing (the hstore operator classes).

That assumes that it is acceptable that jsonb be packaged in the hstore extension. To put it mildly, there's no consensus on that point; indeed, I think there's consensus that's a non-starter.

--
-- Christophe Pettus
xof(at)thebuild(dot)com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-02-28 22:41:22
Message-ID: CAM3SWZTXOgRNo4dKGp80=TkM89yEeUQ3tn2NGzSLEWiQmPugng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 2:40 PM, Christophe Pettus <xof(at)thebuild(dot)com> wrote:
> On Feb 28, 2014, at 2:12 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>
>> AFAICT, we have these
>> operator classes that work fine with jsonb for the purposes of
>> hstore-style indexing (the hstore operator classes).
>
> That assumes that it is acceptable that jsonb be packaged in the hstore extension. To put it mildly, there's no consensus on that point; indeed, I think there's consensus that's a non-starter.

No, it assumes nothing at all. It's a very simple question.

--
Peter Geoghegan


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Thom Brown <thom(at)linux(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-03 14:57:59
Message-ID: CAHyXU0xUvwhcf8CWuu2PZ0S_jmN4mZ6VfNM+9fkjasA3sFzNuw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 2:01 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-02-28 14:45:29 -0500, Andrew Dunstan wrote:
>> Well, the jsonb portion of this is arguably the most ready, certainly it's
>> had a lot more on-list review.
>
> Having crossread both patches I tend to agree with this. I don't think
> it's unrealistic to get jsonb committable, but the hstore bits are
> another story.

hm, do you have any specific concerns/objections about hstore?

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, Thom Brown <thom(at)linux(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-03 15:00:14
Message-ID: 20140303150014.GF23352@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-03 08:57:59 -0600, Merlin Moncure wrote:
> On Fri, Feb 28, 2014 at 2:01 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-02-28 14:45:29 -0500, Andrew Dunstan wrote:
> >> Well, the jsonb portion of this is arguably the most ready, certainly it's
> >> had a lot more on-list review.
> >
> > Having crossread both patches I tend to agree with this. I don't think
> > it's unrealistic to get jsonb committable, but the hstore bits are
> > another story.
>
> hm, do you have any specific concerns/objections about hstore?

I've listed a fair number in various emails, some have been addressed
since I think. But just take a look at the patch, at least last when I
looked, it was simply far from ready. And it's quite a bit of code, so
it's not something that can be addressed within 5min.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-03 15:17:12
Message-ID: CAF4Au4wz+x1UsutUiuqe4oY5ODvbMuvJG+ktVnEZ6EFwO5G9KQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres,

you can always look at our development repository:
https://github.com/feodor/postgres/tree/hstore - hstore only,
https://github.com/feodor/postgres/tree/jsonb_and_hstore - hstore with jsonb

Since we were concentrated on the jsonb_and_hstore branch we usually
wait Andrew, who publish patch. You last issues were addressed in
both branches.

Oleg

PS.

We are not native-english and may not well inderstand your criticism
well, but please try to be a little bit polite. We are working
together and our common goal is to make postgres better. Your notes
are very important for quality of postgres, but sometimes you drive us
...

On Mon, Mar 3, 2014 at 7:00 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-03-03 08:57:59 -0600, Merlin Moncure wrote:
>> On Fri, Feb 28, 2014 at 2:01 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > On 2014-02-28 14:45:29 -0500, Andrew Dunstan wrote:
>> >> Well, the jsonb portion of this is arguably the most ready, certainly it's
>> >> had a lot more on-list review.
>> >
>> > Having crossread both patches I tend to agree with this. I don't think
>> > it's unrealistic to get jsonb committable, but the hstore bits are
>> > another story.
>>
>> hm, do you have any specific concerns/objections about hstore?
>
> I've listed a fair number in various emails, some have been addressed
> since I think. But just take a look at the patch, at least last when I
> looked, it was simply far from ready. And it's quite a bit of code, so
> it's not something that can be addressed within 5min.
>
> Greetings,
>
> Andres Freund
>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Oleg Bartunov <obartunov(at)gmail(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-03 15:22:08
Message-ID: 20140303152208.GG23352@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Oleg,

On 2014-03-03 19:17:12 +0400, Oleg Bartunov wrote:
> Since we were concentrated on the jsonb_and_hstore branch we usually
> wait Andrew, who publish patch. You last issues were addressed in
> both branches.

I'll try to have look sometime soon.

> We are not native-english and may not well inderstand your criticism
> well, but please try to be a little bit polite. We are working
> together and our common goal is to make postgres better. Your notes
> are very important for quality of postgres, but sometimes you drive us
> ...

I am sorry if I came over as impolite. I just tried to point at things I
thought needed improvement, and imo there were quite some. A patch
needing polishing isn't something that carries shame, blame or
anything. It's just a state a patch can be in.

Greetings,

Andres Freund

PS: Not a native speaker either...

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-03 15:25:01
Message-ID: CAF4Au4ya=w9-rAhfy99Wfyx++btLVNeFp6kHMgHrGuxA-dc70Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 7:22 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Hi Oleg,
>
> On 2014-03-03 19:17:12 +0400, Oleg Bartunov wrote:
>> Since we were concentrated on the jsonb_and_hstore branch we usually
>> wait Andrew, who publish patch. You last issues were addressed in
>> both branches.
>
> I'll try to have look sometime soon.
>
>> We are not native-english and may not well inderstand your criticism
>> well, but please try to be a little bit polite. We are working
>> together and our common goal is to make postgres better. Your notes
>> are very important for quality of postgres, but sometimes you drive us
>> ...
>
> I am sorry if I came over as impolite. I just tried to point at things I
> thought needed improvement, and imo there were quite some. A patch
> needing polishing isn't something that carries shame, blame or
> anything. It's just a state a patch can be in.

We have not so much time to go deep onto 100th messages threads and sometimes
just lost directions.

>
> Greetings,
>
> Andres Freund
>
> PS: Not a native speaker either...

That's explain all :)

>
> --
> Andres Freund http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services


From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: obartunov(at)gmail(dot)com, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-03 19:23:19
Message-ID: 5314D6A7.1070500@archidevsys.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/03/14 04:25, Oleg Bartunov wrote:
> On Mon, Mar 3, 2014 at 7:22 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
[...]
>
> PS: Not a native speaker either...
> That's explain all :)
>
[...]

I AM a native English speaker born in England - though if you read some
of my postings where I've been particularly careless, you well assume
otherwise!

My Chinese wife sometimes corrects my English, and from time to time she
is right!

To the extent that I've read the postings of non-native English speakers
like Oleg & Andres, I've not noticed any difficulty understanding what
they meant - other than technical issues that would also be the same for
me from gifted native English speakers!

Cheers,
Gavin


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 00:50:51
Message-ID: CAM3SWZQbkaM5EEsO=4BV6JE1OghODWqguaYf7qcL2DqDOavAFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 28, 2014 at 2:12 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> In order to make a rational decision to do the work incrementally, we
> need to know what we're putting off until 9.5. AFAICT, we have these
> operator classes that work fine with jsonb for the purposes of
> hstore-style indexing (the hstore operator classes). Wasn't that the
> point? When there is a dedicated set of jsonb operator classes, what
> will be different about them, other than the fact that they won't be
> hstore operator classes? A decision predicated on waiting for those to
> come in 9.5 must consider what we're actually waiting for, and right
> now that seems very hazy.

I really would like an answer to this question. Even if I totally
agreed with Andrew's assessment of the relative importance of having
jsonb be an in-core type, versus having some more advanced indexing
capabilities right away, this is still a very salient question.

I appreciate that the "put jsonb in hstore extension to get indexing
right away" trade-off is counter-intuitive, and it may even be that
there is an "everybody wins" third way that sees us factor out code
that is common to both jsonb and hstore as it exists today (although
I'm not optimistic about that). I would like to emphasize that if you
want to defer working on hstore-style jsonb operator classes for one
release, I don't necessarily have a problem with that. But, I must
insist on an answer here, from either you or Oleg or Teodor (it isn't
apparent that Oleg and Teodor concur with you on what's important):
what did we run out of time for? What will be different about the
jsonb operator classes that you're asking us to wait for in a future
release?

I understand that there are ambitious plans for a VODKA-am that will
support indexing operations on nested structures that are a lot more
advanced than those enabled by the hstore operator classes included in
these patches. However, surely these hstore operator classes have
independent value, or represent incremental progress?

--
Peter Geoghegan


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 00:57:09
Message-ID: 531524E5.5080005@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/03/2014 04:50 PM, Peter Geoghegan wrote:
> I understand that there are ambitious plans for a VODKA-am that will
> support indexing operations on nested structures that are a lot more
> advanced than those enabled by the hstore operator classes included in
> these patches. However, surely these hstore operator classes have
> independent value, or represent incremental progress?

Primary value is that in theory the hstore2 opclasses are available
*now*, as opposed to a year from now.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 01:07:14
Message-ID: CAM3SWZQHvV7ZEa3_g3KaqovwqDKuTP=GqnsvNx=8ikwOywVwsg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 4:57 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 03/03/2014 04:50 PM, Peter Geoghegan wrote:
>> I understand that there are ambitious plans for a VODKA-am that will
>> support indexing operations on nested structures that are a lot more
>> advanced than those enabled by the hstore operator classes included in
>> these patches. However, surely these hstore operator classes have
>> independent value, or represent incremental progress?
>
> Primary value is that in theory the hstore2 opclasses are available
> *now*, as opposed to a year from now.

Well, yes, that's right. Although we cannot assume that VODKA will get
into 9.5 - it's a big project. Nor is it obvious to me that a
VODKA-ized set of operator classes would not bring with them exactly
the same dilemma as we now face vis-a-vis hstore code reuse and GIN
operator classes. So it seems reasonable to me to suppose that VODKA
should not influence our decision here. Please correct me if I'm
mistaken.

--
Peter Geoghegan


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 01:09:19
Message-ID: 531527BF.3000006@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/03/2014 05:07 PM, Peter Geoghegan wrote:
>> Primary value is that in theory the hstore2 opclasses are available
>> *now*, as opposed to a year from now.
>
> Well, yes, that's right. Although we cannot assume that VODKA will get
> into 9.5 - it's a big project. Nor is it obvious to me that a
> VODKA-ized set of operator classes would not bring with them exactly
> the same dilemma as we now face vis-a-vis hstore code reuse and GIN
> operator classes. So it seems reasonable to me to suppose that VODKA
> should not influence our decision here. Please correct me if I'm
> mistaken.

I would agree with you.

Andrew was onsite with a client over the weekend, which is why you
haven't heard from him on this thread.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 02:17:10
Message-ID: CAM3SWZTZ35Za_1BxjQa2uLN2DOqSHK449yGB-tsn_d6UwQRDZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 5:09 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 03/03/2014 05:07 PM, Peter Geoghegan wrote:
>>> Primary value is that in theory the hstore2 opclasses are available
>>> *now*, as opposed to a year from now.
>>
>> Well, yes, that's right. Although we cannot assume that VODKA will get
>> into 9.5 - it's a big project. Nor is it obvious to me that a
>> VODKA-ized set of operator classes would not bring with them exactly
>> the same dilemma as we now face vis-a-vis hstore code reuse and GIN
>> operator classes. So it seems reasonable to me to suppose that VODKA
>> should not influence our decision here. Please correct me if I'm
>> mistaken.
>
> I would agree with you.

Good. Hopefully you also mean that you recognize the dilemma referred
to above - that the hstore code reuse made a certain amount of sense,
and that more than likely the best way forward is to work out a way to
make it work. I'm not immediately all that concerned about what the
least worst way of doing that is (I just favor putting everything in
hstore on practical grounds). Another way to solve this problem might
be to simply live with the code duplication between core and hstore on
the grounds that hstore will eventually be deprecated as people switch
to jsonb over time (so under that regime nothing new would ever be
added to hstore, which we'd eventually remove from contrib entirely,
while putting everything new here in core). I don't favor that
approach, but it wouldn't be totally unreasonable, based on the
importance that is attached to jsonb, and based on what I'd estimate
to be the actual amount of code redundancy that that would create
(assuming that hstore gets no new functions and operators, since an
awful lot of the hstore-local code after this patch is applied is new
to hstore). I wouldn't stand in the way of this approach.

In my view the most important thing right now is that before anything
is committed, at the very least there needs to be a strategy around
getting hstore-style GIN indexing in jsonb. I cannot understand how
you can have an operator class today that works fine for hstore-style
indexing of jsonb (as far as that goes), but that that code is out of
bounds just because it's nominally (mostly new) hstore code, and you
cannot figure out a way of making that work that is acceptable from a
code maintenance perspective. If you cannot figure that out in a few
days, why should you ever be able to figure it out? We need to bite
the bullet here, whatever that might actually entail. Can we agree on
that much?

--
Peter Geoghegan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 02:54:16
Message-ID: 53154058.3080002@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/03/2014 07:50 PM, Peter Geoghegan wrote:
> On Fri, Feb 28, 2014 at 2:12 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> In order to make a rational decision to do the work incrementally, we
>> need to know what we're putting off until 9.5. AFAICT, we have these
>> operator classes that work fine with jsonb for the purposes of
>> hstore-style indexing (the hstore operator classes). Wasn't that the
>> point? When there is a dedicated set of jsonb operator classes, what
>> will be different about them, other than the fact that they won't be
>> hstore operator classes? A decision predicated on waiting for those to
>> come in 9.5 must consider what we're actually waiting for, and right
>> now that seems very hazy.
> I really would like an answer to this question. Even if I totally
> agreed with Andrew's assessment of the relative importance of having
> jsonb be an in-core type, versus having some more advanced indexing
> capabilities right away, this is still a very salient question.

(Taking a break from some frantic customer work)

My aim for 9.4, given constraints of both the development cycle and my
time budget, has been to get jsonb to a point where it has equivalent
functionality to json, so that nobody is forced to say "well I'll have
to use json because it lacks function x." For the processing functions,
i.e. those that don't generate json from non-json, this should be true
with what's proposed. The jsonb processing functions should be about as
fast as, or in some cases significantly faster than, their json
equivalents. Parsing text input takes a little longer (surprisingly
little, actually), and reserializing takes significantly longer - I
haven't had a chance to look and see if we can improve that. Both of
these are more or less expected results.

For 9.5 I would hope that we have at least the equivalent of the
proposed hstore classes. But that's really just a start. Frankly, I
think we need to think a lot harder about how we want to be able to
index this sort of data. The proposed hstore operators appear to me to
be at best just scratching the surface of that. I'd like to be able to
index jsonb's #> and #>> operators, for example. Unanchored subpath
operators could be an area that's interesting to implement and index.

I also would like to see some basic insert/update/delete/merge operators
for json/jsonb - that's an area I wanted to work on for this lease but
wasn't able to arrange.

Note that future developments is a major topic of my pgcon talk, and I'm
hoping that we can get some good discussion going there.

cheers

andrew


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 02:59:37
Message-ID: 53154199.7040905@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/03/2014 06:17 PM, Peter Geoghegan wrote:
> Good. Hopefully you also mean that you recognize the dilemma referred
> to above - that the hstore code reuse made a certain amount of sense,
> and that more than likely the best way forward is to work out a way to
> make it work. I'm not immediately all that concerned about what the
> least worst way of doing that is (I just favor putting everything in
> hstore on practical grounds).

Well, I don't see how this is "on practical grounds" at this point.
Whether we keep jsonb in core or not, we have an equal amount of work
ahead of us. That's why I said the single-extension approach was
"conceptually simpler" rather than "actually simpler". It's easier to
understand, not necessarily easier to implement at this point.

Also, please recognize that the current implementation was what we
collectively decided on three months ago, and what Andrew worked pretty
hard to implement based on that collective decision. So if we're going
to change course, we need a specific reason to change course, not just
"it seems like a better idea now" or "I wasn't paying attention then".

The "one extension to rule them all" approach also has the issue of
Naming Things, but I think that could be solved with a symlink or two.

> Another way to solve this problem might
> be to simply live with the code duplication between core and hstore on

What code duplication?

> the grounds that hstore will eventually be deprecated as people switch
> to jsonb over time (so under that regime nothing new would ever be
> added to hstore, which we'd eventually remove from contrib entirely,
> while putting everything new here in core). I don't favor that
> approach, but it wouldn't be totally unreasonable, based on the
> importance that is attached to jsonb, and based on what I'd estimate
> to be the actual amount of code redundancy that that would create
> (assuming that hstore gets no new functions and operators, since an
> awful lot of the hstore-local code after this patch is applied is new
> to hstore). I wouldn't stand in the way of this approach.

Realistically, hstore will never go away. I'll bet you a round or two
of pints that, if we get both hstore2 and jsonb, within 2 years the
users of jsonb will be an order of magnitude more numerous that then
users of hstore, but folks out there have apps already built against
hstore1 and they're going to keep on the hstore path.

In the discussion you haven't apparently caught up on yet, we did
discuss moving *hstore* into core to make this whole thing easier.
However, that fell afoul of the fact that we currently have no mechanism
to move types between extensions and core without breaking upgrades for
everyone. So the only reason hstore is still an extension is because of
backwards-compatibility.

> In my view the most important thing right now is that before anything
> is committed, at the very least there needs to be a strategy around
> getting hstore-style GIN indexing in jsonb.

I think it's a good idea to have a strategy.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 03:39:45
Message-ID: CAM3SWZQCkKgQKd8xjaG9sswyOdJzJV8Bxtcd_-9ZseteSPkcNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 6:54 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> My aim for 9.4, given constraints of both the development cycle and my time
> budget, has been to get jsonb to a point where it has equivalent
> functionality to json, so that nobody is forced to say "well I'll have to
> use json because it lacks function x." For the processing functions, i.e.
> those that don't generate json from non-json, this should be true with
> what's proposed. The jsonb processing functions should be about as fast as,
> or in some cases significantly faster than, their json equivalents. Parsing
> text input takes a little longer (surprisingly little, actually), and
> reserializing takes significantly longer - I haven't had a chance to look
> and see if we can improve that. Both of these are more or less expected
> results.

Okay, that's fine. I'm sure that jsonb has some value without
hstore-style indexing. That isn't really in question. What is in
question is why you would choose to give up on those capabilities.

> For 9.5 I would hope that we have at least the equivalent of the proposed
> hstore classes.

But the equivalent code to the proposed hstore operator classes is
*exactly the same* C code. The two types are fully binary coercible in
the patch, so why delay? Why is that additional step appreciably
riskier than adopting jsonb? I don't see why the functions associated
with the operators that comprise, say, the gin_hstore_ops operator
class represent much additional risk, assuming that jsonb is itself in
good shape. For example, the new hstore_contains() looks fairly
innocuous compared to much of the code you are apparently intent on
including in the first cut at jsonb. Have I missed something? Why are
those operators riskier than the operators you are intent on
including?

If it is true that you think that's a significant additional risk, a
risk too far, then it makes sense that you'd defer doing this. I would
like to know why that is, though, since I don't see it. Speaking of
missing operator classes, I'm pretty sure that it's ipso facto
unacceptable that there is no default btree operator class for the
type jsonb:

[local]/postgres=# \d+ bar
Table "public.bar"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-------+-----------+----------+--------------+-------------
i | jsonb | | extended | |
Has OIDs: no

[local]/postgres=# select * from bar order by i;
ERROR: 42883: could not identify an ordering operator for type jsonb
LINE 1: select * from bar order by i;
^
HINT: Use an explicit ordering operator or modify the query.
LOCATION: get_sort_group_operators, parse_oper.c:221
Time: 6.424 ms
[local]/postgres=# select distinct i from bar;
ERROR: 42883: could not identify an equality operator for type jsonb
LINE 1: select distinct i from bar;
^
LOCATION: get_sort_group_operators, parse_oper.c:226
Time: 6.457 ms

> But that's really just a start. Frankly, I think we need to
> think a lot harder about how we want to be able to index this sort of data.
> The proposed hstore operators appear to me to be at best just scratching the
> surface of that. I'd like to be able to index jsonb's #> and #>> operators,
> for example. Unanchored subpath operators could be an area that's
> interesting to implement and index.

I'm sure that's true, but it's not our immediate concern. We need to
think very hard about it to get everything we want, but we also need
to think somewhat harder about it in order to get even a basic jsonb
type committed.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 04:01:50
Message-ID: CAM3SWZSctwY2Xz1BTxG08OkpKEE48Zu0c+P9DfVV8J=py5eW0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 7:39 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> But that's really just a start. Frankly, I think we need to
>> think a lot harder about how we want to be able to index this sort of data.
>> The proposed hstore operators appear to me to be at best just scratching the
>> surface of that. I'd like to be able to index jsonb's #> and #>> operators,
>> for example. Unanchored subpath operators could be an area that's
>> interesting to implement and index.
>
> I'm sure that's true, but it's not our immediate concern. We need to
> think very hard about it to get everything we want, but we also need
> to think somewhat harder about it in order to get even a basic jsonb
> type committed.

By the way, I think it would be fine to defer adding many of the new
hstore operators and functions until 9.5 (as hstore infrastructure, or
in-core jsonb infrastructure, or anything else), if you felt you had
to, provided that you included just those sufficient to create jsonb
operator classes (plus the operator classes themselves, of course).
There is absolutely no question about having to do this for
B-Tree...why not go a couple of operator classes further?

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 04:20:33
Message-ID: CAM3SWZRF6tLde-Z=icJd5OWySQ3HXi9400BRsugd6vrYit99Xg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 6:59 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> Also, please recognize that the current implementation was what we
> collectively decided on three months ago, and what Andrew worked pretty
> hard to implement based on that collective decision. So if we're going
> to change course, we need a specific reason to change course, not just
> "it seems like a better idea now" or "I wasn't paying attention then".

I'm pretty sure it doesn't work like that. But if it does, what
exactly am I insisting on that is inconsistent with that consensus? In
what way are we changing course? I think I'm being eminently flexible.
I don't want a jsonb type that is broken, as for example by not having
a default B-Tree operator class. Why don't you let me get on with it?

--
Peter Geoghegan


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 04:59:02
Message-ID: 53155D96.30102@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/03/2014 10:39 PM, Peter Geoghegan wrote:
> On Mon, Mar 3, 2014 at 6:54 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> My aim for 9.4, given constraints of both the development cycle and my time
>> budget, has been to get jsonb to a point where it has equivalent
>> functionality to json, so that nobody is forced to say "well I'll have to
>> use json because it lacks function x." For the processing functions, i.e.
>> those that don't generate json from non-json, this should be true with
>> what's proposed. The jsonb processing functions should be about as fast as,
>> or in some cases significantly faster than, their json equivalents. Parsing
>> text input takes a little longer (surprisingly little, actually), and
>> reserializing takes significantly longer - I haven't had a chance to look
>> and see if we can improve that. Both of these are more or less expected
>> results.
> Okay, that's fine. I'm sure that jsonb has some value without
> hstore-style indexing. That isn't really in question. What is in
> question is why you would choose to give up on those capabilities.

Who has given up?

I did as much as I could given the time constraints I mentioned. That's
the way Postgres works. People do what they can.

>
>> For 9.5 I would hope that we have at least the equivalent of the proposed
>> hstore classes.
> But the equivalent code to the proposed hstore operator classes is
> *exactly the same* C code. The two types are fully binary coercible in
> the patch, so why delay? Why is that additional step appreciably
> riskier than adopting jsonb? I don't see why the functions associated
> with the operators that comprise, say, the gin_hstore_ops operator
> class represent much additional risk, assuming that jsonb is itself in
> good shape. For example, the new hstore_contains() looks fairly
> innocuous compared to much of the code you are apparently intent on
> including in the first cut at jsonb. Have I missed something? Why are
> those operators riskier than the operators you are intent on
> including?

You are really jumping at conclusions as to what's in my head,
conclusions that are not justified by anything I have said.

Who said they were riskier? I certainly didn't.

Of course the operators would be the same. We could have them today, by
migrating the exisiting code into core and making the hstore operators
use that code instead. I could probably do it in about a day (if I had a
day to spare). I was actually rather expecting that they would have been
put there for the jsonb type when Teodor moved some code so we could
have a jsonb type. But since he didn't we find ourselves where we are today.

If that's what it will take to get agreement I will try to make it happen.

>
> If it is true that you think that's a significant additional risk, a
> risk too far, then it makes sense that you'd defer doing this. I would
> like to know why that is, though, since I don't see it.

I don't, as I said. This whole line of speculation has me quite puzzled.

> Speaking of
> missing operator classes, I'm pretty sure that it's ipso facto
> unacceptable that there is no default btree operator class for the
> type jsonb:
>
> [local]/postgres=# \d+ bar
> Table "public.bar"
> Column | Type | Modifiers | Storage | Stats target | Description
> --------+-------+-----------+----------+--------------+-------------
> i | jsonb | | extended | |
> Has OIDs: no
>
> [local]/postgres=# select * from bar order by i;
> ERROR: 42883: could not identify an ordering operator for type jsonb
> LINE 1: select * from bar order by i;
> ^
> HINT: Use an explicit ordering operator or modify the query.
> LOCATION: get_sort_group_operators, parse_oper.c:221
> Time: 6.424 ms
> [local]/postgres=# select distinct i from bar;
> ERROR: 42883: could not identify an equality operator for type jsonb
> LINE 1: select distinct i from bar;
> ^
> LOCATION: get_sort_group_operators, parse_oper.c:226
> Time: 6.457 ms
>

Well, the trouble is that the only one that would make sense is one that
did in effect "order by i::json", since it would be weird to have these
different. That might make the ordering slow, but would be easy enough
to add.

>> But that's really just a start. Frankly, I think we need to
>> think a lot harder about how we want to be able to index this sort of data.
>> The proposed hstore operators appear to me to be at best just scratching the
>> surface of that. I'd like to be able to index jsonb's #> and #>> operators,
>> for example. Unanchored subpath operators could be an area that's
>> interesting to implement and index.
> I'm sure that's true, but it's not our immediate concern. We need to
> think very hard about it to get everything we want, but we also need
> to think somewhat harder about it in order to get even a basic jsonb
> type committed.
>

I think you need to be more accepting of the fact that Postgres
development is frequently incremental. Nothing that's been proposed
would prevent future development of the type AFAICT. Enums took us two
or three releases to get to where we are. Arrays took longer. Even a
smallish feature like CSV import is still being tweaked about seven
releases after it was introduced.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 05:05:06
Message-ID: 53155F02.8030105@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/03/2014 11:20 PM, Peter Geoghegan wrote:
> On Mon, Mar 3, 2014 at 6:59 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> Also, please recognize that the current implementation was what we
>> collectively decided on three months ago, and what Andrew worked pretty
>> hard to implement based on that collective decision. So if we're going
>> to change course, we need a specific reason to change course, not just
>> "it seems like a better idea now" or "I wasn't paying attention then".
> I'm pretty sure it doesn't work like that. But if it does, what
> exactly am I insisting on that is inconsistent with that consensus? In
> what way are we changing course? I think I'm being eminently flexible.
> I don't want a jsonb type that is broken, as for example by not having
> a default B-Tree operator class. Why don't you let me get on with it?
>

You're welcome to submit any code you like. We haven't been secret about
where the code lives. Nobody is stopping you.

What you're not welcome to do, from my POV, is move jsonb into the
hstore extension. I strenuously object to any such plan.

cheers

andrew


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 05:06:09
Message-ID: CAM3SWZTJkMKKQ7rh1hZLNk391YFdJaVLWjnF+XmSJ0pZGCUKog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 9:05 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> What you're not welcome to do, from my POV, is move jsonb into the hstore
> extension. I strenuously object to any such plan.

We both know that that isn't really the point of contention at all.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 05:21:13
Message-ID: CAM3SWZRss5pQXDE7eVPbwSsv1sdG58Ox_46YwbET57WtgcLCWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 8:59 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> Okay, that's fine. I'm sure that jsonb has some value without
>> hstore-style indexing. That isn't really in question. What is in
>> question is why you would choose to give up on those capabilities.
>
>
>
> Who has given up?
>
> I did as much as I could given the time constraints I mentioned. That's the
> way Postgres works. People do what they can.

It would be such a small amount of additional effort to add those
operators sufficient to make those operator classes work, relative to
the huge benefits. The code already exists, there isn't terribly much
of it, and is isn't scarier than what you already have. I cannot
fathom how you could choose to not do so, as long as you wanted a
jsonb type in core. It is just not sensible.

>> But the equivalent code to the proposed hstore operator classes is
>> *exactly the same* C code. The two types are fully binary coercible in
>> the patch, so why delay? Why is that additional step appreciably
>> riskier than adopting jsonb? I don't see why the functions associated
>> with the operators that comprise, say, the gin_hstore_ops operator
>> class represent much additional risk, assuming that jsonb is itself in
>> good shape. For example, the new hstore_contains() looks fairly
>> innocuous compared to much of the code you are apparently intent on
>> including in the first cut at jsonb. Have I missed something? Why are
>> those operators riskier than the operators you are intent on
>> including?
>
>
>
> You are really jumping at conclusions as to what's in my head, conclusions
> that are not justified by anything I have said.

Right - they are conclusions justified by what you have not said, and
my attempt to fill in the gaps.

> Who said they were riskier? I certainly didn't.
>
> Of course the operators would be the same. We could have them today, by
> migrating the exisiting code into core and making the hstore operators use
> that code instead. I could probably do it in about a day (if I had a day to
> spare). I was actually rather expecting that they would have been put there
> for the jsonb type when Teodor moved some code so we could have a jsonb
> type. But since he didn't we find ourselves where we are today.
>
> If that's what it will take to get agreement I will try to make it happen.

I'll do it if you really are that strapped for time.

> Well, the trouble is that the only one that would make sense is one that did
> in effect "order by i::json", since it would be weird to have these
> different. That might make the ordering slow, but would be easy enough to
> add.

No, it would not be weird to have those be different. In some cases it
would be totally mandatory, as for example when there is a variable
lc_numeric setting that affects the format of numeric. Only text
equality is equivalent to a binary string comparison.

> I think you need to be more accepting of the fact that Postgres development
> is frequently incremental. Nothing that's been proposed would prevent future
> development of the type AFAICT. Enums took us two or three releases to get
> to where we are. Arrays took longer. Even a smallish feature like CSV import
> is still being tweaked about seven releases after it was introduced.

My objection is that the cost/benefit analysis behind the idea of
excluding the hstore operator classes seem to make no sense.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: obartunov(at)gmail(dot)com
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 08:38:39
Message-ID: CAM3SWZRP32qHjKhT9aU-H=QZ2DjUpZ1RQTw5EC+i+PcEwi_fBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Oleg,

On Mon, Mar 3, 2014 at 7:17 AM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:
> you can always look at our development repository:

I think I found a bug:

[local]/postgres=# \d+ bar
Table "public.bar"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-------+-----------+----------+--------------+-------------
i | jsonb | | extended | |
Indexes:
"f" gin (i)
Has OIDs: no

[local]/postgres=# insert into bar values ('{
"firstName": "John",
"lastName": "Smith",
"age": 25,
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": 10021
},
"phoneNumbers": [
{
"type": "home",
"number": "212 555-1234"
},
{
"type": "fax",
"number": "646 555-4567"
}
]
}');
INSERT 0 1
Time: 7.635 ms
[local]/postgres=# select * from bar where i @> '{"age":25.0}'::jsonb;
i
---
(0 rows)

Time: 2.443 ms
[local]/postgres=# explain select * from bar where i @> '{"age":25.0}'::jsonb;
QUERY PLAN
-----------------------------------------------------------------
Bitmap Heap Scan on bar (cost=16.01..20.02 rows=1 width=32)
Recheck Cond: ((i)::hstore @> '"age"=>25.0'::hstore)
-> Bitmap Index Scan on f (cost=0.00..16.01 rows=1 width=0)
Index Cond: ((i)::hstore @> '"age"=>25.0'::hstore)
Planning time: 0.161 ms
(5 rows)

[local]/postgres=# set enable_bitmapscan = off;
SET
Time: 6.052 ms
[local]/postgres=# select * from bar where i @> '{"age":25.0}'::jsonb;
-[ RECORD 1 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
i | {"age": 25, "address": {"city": "New York", "state": "NY",
"postalCode": 10021, "streetAddress": "21 2nd Street"}, "lastName":
"Smith", "firstName": "John", "phoneNumbers": [{"type": "home",
"number": "212 555-1234"}, {"type": "fax", "number": "646 555-4567"}]}

Time: 6.479 ms
[local]/postgres=# explain select * from bar where i @> '{"age":25.0}'::jsonb;
QUERY PLAN
-----------------------------------------------------
Seq Scan on bar (cost=0.00..26.38 rows=1 width=32)
Filter: ((i)::hstore @> '"age"=>25.0'::hstore)
Planning time: 0.154 ms
(3 rows)

Time: 6.565 ms

--
Peter Geoghegan


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 09:30:25
Message-ID: CAF4Au4w47NgEpqoaDMXfLFs+RssMU=JYiR=DHzLzsZh8Qn2HDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thanks, looks like a bug.

On Tue, Mar 4, 2014 at 12:38 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Hi Oleg,
>
> On Mon, Mar 3, 2014 at 7:17 AM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:
>> you can always look at our development repository:
>
> I think I found a bug:
>
> [local]/postgres=# \d+ bar
> Table "public.bar"
> Column | Type | Modifiers | Storage | Stats target | Description
> --------+-------+-----------+----------+--------------+-------------
> i | jsonb | | extended | |
> Indexes:
> "f" gin (i)
> Has OIDs: no
>
> [local]/postgres=# insert into bar values ('{
> "firstName": "John",
> "lastName": "Smith",
> "age": 25,
> "address": {
> "streetAddress": "21 2nd Street",
> "city": "New York",
> "state": "NY",
> "postalCode": 10021
> },
> "phoneNumbers": [
> {
> "type": "home",
> "number": "212 555-1234"
> },
> {
> "type": "fax",
> "number": "646 555-4567"
> }
> ]
> }');
> INSERT 0 1
> Time: 7.635 ms
> [local]/postgres=# select * from bar where i @> '{"age":25.0}'::jsonb;
> i
> ---
> (0 rows)
>
> Time: 2.443 ms
> [local]/postgres=# explain select * from bar where i @> '{"age":25.0}'::jsonb;
> QUERY PLAN
> -----------------------------------------------------------------
> Bitmap Heap Scan on bar (cost=16.01..20.02 rows=1 width=32)
> Recheck Cond: ((i)::hstore @> '"age"=>25.0'::hstore)
> -> Bitmap Index Scan on f (cost=0.00..16.01 rows=1 width=0)
> Index Cond: ((i)::hstore @> '"age"=>25.0'::hstore)
> Planning time: 0.161 ms
> (5 rows)
>
> [local]/postgres=# set enable_bitmapscan = off;
> SET
> Time: 6.052 ms
> [local]/postgres=# select * from bar where i @> '{"age":25.0}'::jsonb;
> -[ RECORD 1 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> i | {"age": 25, "address": {"city": "New York", "state": "NY",
> "postalCode": 10021, "streetAddress": "21 2nd Street"}, "lastName":
> "Smith", "firstName": "John", "phoneNumbers": [{"type": "home",
> "number": "212 555-1234"}, {"type": "fax", "number": "646 555-4567"}]}
>
> Time: 6.479 ms
> [local]/postgres=# explain select * from bar where i @> '{"age":25.0}'::jsonb;
> QUERY PLAN
> -----------------------------------------------------
> Seq Scan on bar (cost=0.00..26.38 rows=1 width=32)
> Filter: ((i)::hstore @> '"age"=>25.0'::hstore)
> Planning time: 0.154 ms
> (3 rows)
>
> Time: 6.565 ms
>
> --
> Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: obartunov(at)gmail(dot)com
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 09:43:49
Message-ID: CAM3SWZQTuzjWiZQDzBiBajak+dsSVafZmhB-m2fvfE+17P8yaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 1:30 AM, Oleg Bartunov <obartunov(at)gmail(dot)com> wrote:
> Thanks, looks like a bug.

I guess this is down to the continued definition of gin_hstore_ops as
an opclass with text storage?:

+ CREATE OPERATOR CLASS gin_hstore_ops
+ DEFAULT FOR TYPE hstore USING gin
+ AS
+ OPERATOR 7 @>,
+ OPERATOR 9 ?(hstore,text),
+ OPERATOR 10 ?|(hstore,text[]),
+ OPERATOR 11 ?&(hstore,text[]),
+ FUNCTION 1 bttextcmp(text,text),
+ FUNCTION 2 gin_extract_hstore(internal, internal),
+ FUNCTION 3 gin_extract_hstore_query(internal,
internal, int2, internal, internal),
+ FUNCTION 4 gin_consistent_hstore(internal, int2,
internal, int4, internal, internal),
+ STORAGE text;

--
Peter Geoghegan


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Peter Geoghegan <pg(at)heroku(dot)com>, obartunov(at)gmail(dot)com
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 10:07:07
Message-ID: 5315A5CB.6050401@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I guess this is down to the continued definition of gin_hstore_ops as
> an opclass with text storage?:
No, type of this storage describes type of keys. For gin_hstore_ops each key and
each value will be stored as a text value. The root of problem is a JavaScript
or/and our numeric type. In JavaScript (which was a base for json type) you need
explicitly point type of compare to prevent unpredictable result.

select '25.0'::numeric = '25'::numeric;
?column?
----------
t
but
select '25.0'::numeric::text = '25'::numeric::text;
?column?
----------
f

and
select '{"a": 25}'::json->>'a' = '{"a": 25.0}'::json->>'a';
?column?
----------
f

In pointed example inserted value has age: 25 but searching jsonb value has
age:25.0.

>
> + CREATE OPERATOR CLASS gin_hstore_ops
> + DEFAULT FOR TYPE hstore USING gin
> + AS
> + OPERATOR 7 @>,
> + OPERATOR 9 ?(hstore,text),
> + OPERATOR 10 ?|(hstore,text[]),
> + OPERATOR 11 ?&(hstore,text[]),
> + FUNCTION 1 bttextcmp(text,text),
> + FUNCTION 2 gin_extract_hstore(internal, internal),
> + FUNCTION 3 gin_extract_hstore_query(internal,
> internal, int2, internal, internal),
> + FUNCTION 4 gin_consistent_hstore(internal, int2,
> internal, int4, internal, internal),
> + STORAGE text;
>
>

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Peter Geoghegan <pg(at)heroku(dot)com>, obartunov(at)gmail(dot)com
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 10:18:00
Message-ID: 5315A858.3060504@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> select '{"a": 25}'::json->>'a' = '{"a": 25.0}'::json->>'a';
> ?column?
> ----------
> f

Although for development version of hstore (not a current version)
# select 'a=> 25'::hstore = 'a=> 25.0'::hstore;
?column?
----------
t

That is because compareJsonbValue compares numeric values with a help of
numeric_cmp() instead of comparing text representation. This inconsistent will
be fixed.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: obartunov(at)gmail(dot)com, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 10:18:46
Message-ID: CAM3SWZQwE97S-U5FTjqvnk0nYnHHOa2GieT0h0s7u8x7+gNAxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 2:07 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> No, type of this storage describes type of keys. For gin_hstore_ops each key
> and each value will be stored as a text value. The root of problem is a
> JavaScript or/and our numeric type. In JavaScript (which was a base for json
> type) you need explicitly point type of compare to prevent unpredictable
> result.

That's what I meant, I think. But I'm not sure what you mean:

Native Chrome JavaScript.
Copyright (c) 2013 Google Inc
25 == 25
=> true
25 == 25.0
=> true

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: obartunov(at)gmail(dot)com, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 10:18:56
Message-ID: CAM3SWZTjbm9L-8XuFfE1abhxPhuEARjDW6QEn+op3qaK_JV1AA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 2:18 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> That is because compareJsonbValue compares numeric values with a help of
> numeric_cmp() instead of comparing text representation. This inconsistent
> will be fixed.

Cool.

--
Peter Geoghegan


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: obartunov(at)gmail(dot)com, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 10:21:01
Message-ID: CAM3SWZTn8L+ZJs=nbo5aw80Uzh__XPfOHEKOsgRV7uTT6s20kg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 2:18 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Tue, Mar 4, 2014 at 2:18 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>> That is because compareJsonbValue compares numeric values with a help of
>> numeric_cmp() instead of comparing text representation. This inconsistent
>> will be fixed.
>
> Cool.

Perhaps this is obvious, but: I expect that you intend to fix the
inconsistency by having everywhere use a native numeric comparison.

Thanks
--
Peter Geoghegan


From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 10:37:21
Message-ID: CAF4Au4wM212cXR2sOZPskDyz0iuZm42eJTxFONiqWf+5=DKkWw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I tried try.mongodb.com

> 25 == 25.0
true

On Tue, Mar 4, 2014 at 2:18 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Tue, Mar 4, 2014 at 2:18 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>> That is because compareJsonbValue compares numeric values with a help of
>> numeric_cmp() instead of comparing text representation. This inconsistent
>> will be fixed.
>
> Cool.
>
>
> --
> Peter Geoghegan


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: obartunov(at)gmail(dot)com, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 10:44:42
Message-ID: 5315AE9A.2030306@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Do we have function to trim right zeros in numeric?

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: obartunov(at)gmail(dot)com, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 11:07:15
Message-ID: CAM3SWZQVHj7f_7Thvc5+4ZB-N_WzaH+k=qQH0VZgLN5GNnUtkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 2:44 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> Do we have function to trim right zeros in numeric?

I'm not sure why you ask. I hope it isn't because you want to fix this
bug by making text comparisons in place of numeric comparisons work by
fixing the exact problem I reported, because there are other similar
problems, such as differences in lc_numeric settings that your
implementation cannot possibly account for. If that's not what you
meant, I think it's okay if there are apparent trailing zeroes output
under similar circumstances to the numeric type proper. Isn't this
kind of thing intentionally not described by the relevant spec anyway?
--
Peter Geoghegan


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: obartunov(at)gmail(dot)com, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 12:48:48
Message-ID: 5315CBB0.6000708@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On Tue, Mar 4, 2014 at 2:44 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>> Do we have function to trim right zeros in numeric?

Fixed, pushed to github
(https://github.com/feodor/postgres/tree/jsonb_and_hstore). Now it used
hash_numeric to index numeric value. As I can see, it provides needed trim and
doesn't depend on locale. Possible mismatch (the same hash value for different
numeric valye) will rechecked anyway - interested operations set recheck flag.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 13:49:57
Message-ID: CAHyXU0ygaCrWNoc6WKVix6ZZvqr=RmdeMbH5_vMpM_ZW+jk5jw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 4, 2014 at 6:48 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>> On Tue, Mar 4, 2014 at 2:44 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>>>
>>> Do we have function to trim right zeros in numeric?
>
>
> Fixed, pushed to github
> (https://github.com/feodor/postgres/tree/jsonb_and_hstore). Now it used
> hash_numeric to index numeric value. As I can see, it provides needed trim
> and doesn't depend on locale. Possible mismatch (the same hash value for
> different numeric valye) will rechecked anyway - interested operations set
> recheck flag.

huh. what it is the standard for equivalence? I guess we'd be
following javascript ===, right?
(http://dorey.github.io/JavaScript-Equality-Table/).

merlin


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 15:10:19
Message-ID: 5315ECDB.90803@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> huh. what it is the standard for equivalence? I guess we'd be
> following javascript ===, right?
> (http://dorey.github.io/JavaScript-Equality-Table/).

right.

But in your link I don't understand array (and object) equality rules. Hstore
(and jsonb) compare function believes that arrays are equal if each
corresponding elements of them are equal.

postgres=# select 'a=>[]'::hstore = 'a=>[]'::hstore;
?column?
----------
t
(1 row)

Time: 0,576 ms
postgres=# select 'a=>[0]'::hstore = 'a=>[0]'::hstore;
?column?
----------
t
(1 row)

Time: 0,663 ms
postgres=# select 'a=>[0]'::hstore = 'a=>[1]'::hstore;
?column?
----------
f

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-04 17:47:16
Message-ID: 531611A4.6080605@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/03/2014 09:06 PM, Peter Geoghegan wrote:
> On Mon, Mar 3, 2014 at 9:05 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>> What you're not welcome to do, from my POV, is move jsonb into the hstore
>> extension. I strenuously object to any such plan.
>
> We both know that that isn't really the point of contention at all.
>

Actually, I didn't know any such thing. Just a couple days ago, you
were arguing fairly strongly for moving jsonb to the hstore extension.
You weren't clear that you'd given up on that line of argument.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 14:39:57
Message-ID: 20140305143957.GD28321@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 06:59:37PM -0800, Josh Berkus wrote:
> Realistically, hstore will never go away. I'll bet you a round or two
> of pints that, if we get both hstore2 and jsonb, within 2 years the
> users of jsonb will be an order of magnitude more numerous that then
> users of hstore, but folks out there have apps already built against
> hstore1 and they're going to keep on the hstore path.
>
> In the discussion you haven't apparently caught up on yet, we did
> discuss moving *hstore* into core to make this whole thing easier.
> However, that fell afoul of the fact that we currently have no mechanism
> to move types between extensions and core without breaking upgrades for
> everyone. So the only reason hstore is still an extension is because of
> backwards-compatibility.

I have read last week's thread on this issue, and it certainly seems we
are in a non-ideal situation here.

The discussion centered around the split of JSONB in core and hstore in
contrib, the reason for some of these decisions, and what can make it
into PG 9.4.

I would like to take a different approach and explore what we
_eventually_ want, then back into what we have and what can be done for
9.4.

Basically, it seems we have heirchical hstore and JSONB which are
identical except for the input/output syntax. Many are confused how a
code split like that works long-term, and whether decisions made for 9.4
might limit future options.

There seems to be a basic tension that we can't move hstore into core,
must maintain backward-compatibility for hstore, and we want JSONB in
core. Long-term, having JSON in core and JSONB in contrib seems quite
odd.

So, I am going to ask a back-track question and ask why we can't move
hstore into core. Is this a problem with the oids of the hstore data
type and functions? Is this a pg_upgrade-only problem? Can this be
fixed?

Yes, I am ignoring what might be possible for 9.4, but I think these
questions must be asked if we are going to properly plan for post-9.4
changes.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 14:53:41
Message-ID: 53173A75.4080601@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/05/2014 09:39 AM, Bruce Momjian wrote:

>
> So, I am going to ask a back-track question and ask why we can't move
> hstore into core. Is this a problem with the oids of the hstore data
> type and functions? Is this a pg_upgrade-only problem? Can this be
> fixed?

Yes, pg_upgrade is the problem, and no, I can't see how it can be fixed.

Builtin types have Oids in a certain range. Non-builtin types have Oids
outside that range. If you have a clever way to get over that I'd be all
ears, but it seems to me insurmountable right now.

A year or two ago I made a suggestion to help avoid such problems in
future, but as Josh said it got shot down, and in any case it would not
have helped with existing types such as hstore.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 15:19:33
Message-ID: CAHyXU0yGcOmKe2rsAyRTC=FPsY6D8X-w9m0TC0ZZ7BEeHbwLtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 8:39 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> So, I am going to ask a back-track question and ask why we can't move
> hstore into core.

This is exactly the opposite of what should be happening. Now, jsonb
might make it into core because of the json precedent but the entire
purpose of the extension system is stop dumping everything in the
public namespace. Stuff 'in core' becomes locked in stone, forever,
because of backwards compatibility concerns, which are IMNSHO, a
bigger set of issues than even pg_upgrade related issues. Have we
gone through all the new hstore functions and made sure they don't
break existing applications? Putting things in core welds your only
escape hatch shut.

*All* non-sql standard types ought to be in extensions in an ideal world.

merlin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 15:24:40
Message-ID: 3123.1394033080@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 03/05/2014 09:39 AM, Bruce Momjian wrote:
>> So, I am going to ask a back-track question and ask why we can't move
>> hstore into core. Is this a problem with the oids of the hstore data
>> type and functions? Is this a pg_upgrade-only problem? Can this be
>> fixed?

> Yes, pg_upgrade is the problem, and no, I can't see how it can be fixed.

> Builtin types have Oids in a certain range. Non-builtin types have Oids
> outside that range. If you have a clever way to get over that I'd be all
> ears, but it seems to me insurmountable right now.

More to the point:

1. Built-in types have predetermined, fixed OIDs. Types made by
extensions do not, and almost certainly will have different OIDs in
different existing databases.

2. There's no easy way to change the OID of an existing type during
pg_upgrade, because it may be on-disk in (at least) array headers.

We could possibly get around #2, if we could think of a secure way
for array_out and sibling functions to identify the array type
without use of the embedded OID value. I don't know how we could
do that though, particularly in polymorphic-function contexts.

Also, there might be other cases besides arrays where we've embedded
type OIDs in on-disk data; anyone remember?

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 15:26:52
Message-ID: CAHyXU0z_RN=OohYZPtyJA-DzqCVgeAWLrNcrLurqiNkRbWXfWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 9:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 03/05/2014 09:39 AM, Bruce Momjian wrote:
>>> So, I am going to ask a back-track question and ask why we can't move
>>> hstore into core. Is this a problem with the oids of the hstore data
>>> type and functions? Is this a pg_upgrade-only problem? Can this be
>>> fixed?
>
>> Yes, pg_upgrade is the problem, and no, I can't see how it can be fixed.
>
>> Builtin types have Oids in a certain range. Non-builtin types have Oids
>> outside that range. If you have a clever way to get over that I'd be all
>> ears, but it seems to me insurmountable right now.
>
> More to the point:
>
> 1. Built-in types have predetermined, fixed OIDs. Types made by
> extensions do not, and almost certainly will have different OIDs in
> different existing databases.
>
> 2. There's no easy way to change the OID of an existing type during
> pg_upgrade, because it may be on-disk in (at least) array headers.
>
> We could possibly get around #2, if we could think of a secure way
> for array_out and sibling functions to identify the array type
> without use of the embedded OID value. I don't know how we could
> do that though, particularly in polymorphic-function contexts.
>
> Also, there might be other cases besides arrays where we've embedded
> type OIDs in on-disk data; anyone remember?

composite types.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 15:29:28
Message-ID: 531742D8.1020406@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/05/2014 10:24 AM, Tom Lane wrote:
>
> Also, there might be other cases besides arrays where we've embedded
> type OIDs in on-disk data; anyone remember?
>
>

Don't we do that in composites too?

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 15:30:22
Message-ID: 3170.1394033422@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Wed, Mar 5, 2014 at 9:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Also, there might be other cases besides arrays where we've embedded
>> type OIDs in on-disk data; anyone remember?

> composite types.

But that's only the composite type's own OID, no? So it's not really
a problem unless the type we wanted to move into (or out of) core was
itself composite.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 15:39:56
Message-ID: 5317454C.4020601@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/05/2014 10:30 AM, Tom Lane wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Wed, Mar 5, 2014 at 9:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Also, there might be other cases besides arrays where we've embedded
>>> type OIDs in on-disk data; anyone remember?
>> composite types.
> But that's only the composite type's own OID, no? So it's not really
> a problem unless the type we wanted to move into (or out of) core was
> itself composite.
>
>

Sure, although that's not entirely impossible to imagine. I admit it
seems less likely, and I could accept it as a restriction if we
conquered the general case.

cheers

andrew


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 15:52:08
Message-ID: 20140305155208.GE28321@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 09:19:33AM -0600, Merlin Moncure wrote:
> On Wed, Mar 5, 2014 at 8:39 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > So, I am going to ask a back-track question and ask why we can't move
> > hstore into core.
>
> This is exactly the opposite of what should be happening. Now, jsonb
> might make it into core because of the json precedent but the entire
> purpose of the extension system is stop dumping everything in the
> public namespace. Stuff 'in core' becomes locked in stone, forever,
> because of backwards compatibility concerns, which are IMNSHO, a
> bigger set of issues than even pg_upgrade related issues. Have we
> gone through all the new hstore functions and made sure they don't
> break existing applications? Putting things in core welds your only
> escape hatch shut.
>
> *All* non-sql standard types ought to be in extensions in an ideal world.

I have seen your opinion on this but there have been enough
counter-arguments that I am not ready to head in that direction.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 15:55:59
Message-ID: CA+TgmobywZjmDG_9MSf3V+P3v5EvoYBN4Pw0Lkr=Uvj69y+vuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 10:19 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Wed, Mar 5, 2014 at 8:39 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> So, I am going to ask a back-track question and ask why we can't move
>> hstore into core.
>
> This is exactly the opposite of what should be happening. Now, jsonb
> might make it into core because of the json precedent but the entire
> purpose of the extension system is stop dumping everything in the
> public namespace. Stuff 'in core' becomes locked in stone, forever,
> because of backwards compatibility concerns, which are IMNSHO, a
> bigger set of issues than even pg_upgrade related issues. Have we
> gone through all the new hstore functions and made sure they don't
> break existing applications? Putting things in core welds your only
> escape hatch shut.

I agree. What concerns me about jsonb is that it doesn't seem very
done. If we commit it to core and find out later that we've made some
mistakes we'd like to fix, it's going to be difficult and
controversial. If it goes on PGXN and turns out to have some
problems, then the people responsible for that extension can decide
whether and how to preserve backward compatibility, or somebody else
can write something completely different. On a theoretical level, I'd
absolutely rather have jsonb in core - not because it's in any way
theoretically necessary, but because JSON is popular and better
support for it will be good for PostgreSQL. But on a practical level
I'd rather not ship it in 9.4 than ship something we might later
regret.

And despite the assertions from various people here that these
decisions were all made a long time ago and it's way too late to
question them, I don't buy it. There's not a single email on this
mailing list clearly laying out the design that we've ended up with,
and I'm willing to wager any reasonable amount of money that if
someone had posted an email saying "hey, we're thinking about setting
things up so that jsonb and hstore have the same binary format, but
you can't index jsonb directly, you have to cast it to hstore, is
everyone OK with that?" someone would have written back and said "no,
that sounds nuts". The reason why input on that particular aspect of
the design was not forthcoming isn't because everyone was OK with it;
it's because it was never clearly spelled out. Maybe someone will say
that this was discussed at last year's PGCon unconference, but surely
everyone here knows that a discussion at an unconference 8 months ago
doesn't substitute for a discussion on-list.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:07:24
Message-ID: 3483.1394035644@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> And despite the assertions from various people here that these
> decisions were all made a long time ago and it's way too late to
> question them, I don't buy it. There's not a single email on this
> mailing list clearly laying out the design that we've ended up with,
> and I'm willing to wager any reasonable amount of money that if
> someone had posted an email saying "hey, we're thinking about setting
> things up so that jsonb and hstore have the same binary format, but
> you can't index jsonb directly, you have to cast it to hstore, is
> everyone OK with that?" someone would have written back and said "no,
> that sounds nuts". The reason why input on that particular aspect of
> the design was not forthcoming isn't because everyone was OK with it;
> it's because it was never clearly spelled out.

No, that was never the design (I trust). It's where we are today
because time ran out to complete jsonb for 9.4, and tossing the index
opclasses overboard was one of the last-minute compromises in order
to have something submittable.

I think it would be a completely defensible decision to postpone jsonb
to 9.5 on the grounds that it's not done enough. Now, Josh has laid out
arguments why we want jsonb in 9.4 even if it's incomplete. But ISTM
that those are fundamentally marketing arguments; on a purely technical
basis I think the decision would be to postpone. So it comes down
to how you weight marketing vs technical issues, which is something
that everyone is likely to see a little bit differently :-(

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:08:50
Message-ID: 20140305160850.GF28321@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 10:39:56AM -0500, Andrew Dunstan wrote:
>
> On 03/05/2014 10:30 AM, Tom Lane wrote:
> >Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> >>On Wed, Mar 5, 2014 at 9:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>>Also, there might be other cases besides arrays where we've embedded
> >>>type OIDs in on-disk data; anyone remember?
> >>composite types.
> >But that's only the composite type's own OID, no? So it's not really
> >a problem unless the type we wanted to move into (or out of) core was
> >itself composite.
> >
> >
>
>
> Sure, although that's not entirely impossible to imagine. I admit it
> seems less likely, and I could accept it as a restriction if we
> conquered the general case.

OK, so let's look at the general case. Here is what pg_upgrade
preserves:

* We control all assignments of pg_class.oid (and relfilenode) so toast
* oids are the same between old and new clusters. This is important
* because toast oids are stored as toast pointers in user tables.
*
* While pg_class.oid and pg_class.relfilenode are initially the same
* in a cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM
* FULL. In the new cluster, pg_class.oid and pg_class.relfilenode will
* be the same and will match the old pg_class.oid value. Because of
* this, old/new pg_class.relfilenode values will not match if CLUSTER,
* REINDEX, or VACUUM FULL have been performed in the old cluster.
*
* We control all assignments of pg_type.oid because these oids are stored
* in user composite type values.
*
* We control all assignments of pg_enum.oid because these oids are stored
* in user tables as enum values.
*
* We control all assignments of pg_authid.oid because these oids are stored
* in pg_largeobject_metadata.

It seems only pg_type.oid is an issue for hstore. We can easily modify
pg_dump --binary-upgrade mode to suppress the creation of the hstore
extension. That should allow user hstore columns to automatically map
to the new constant hstore oid. We can also modify pg_upgrade to scan
all the user tables for any use of hstore arrays and perhaps composite
types and tell the user they have to drop and upgrade those table
separately.

Again, I am not asking what can be done for 9.4 but what is our final
goal, though the pg_upgrade change are minimal as we have done such
adjustments in the past.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:10:23
Message-ID: CAHyXU0xA0uJ1CZy91vDiCsQ5Xoz2N599e4uGLxV-J2c0ARGBxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 9:52 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Wed, Mar 5, 2014 at 09:19:33AM -0600, Merlin Moncure wrote:
>> On Wed, Mar 5, 2014 at 8:39 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>> > So, I am going to ask a back-track question and ask why we can't move
>> > hstore into core.
>>
>> This is exactly the opposite of what should be happening. Now, jsonb
>> might make it into core because of the json precedent but the entire
>> purpose of the extension system is stop dumping everything in the
>> public namespace. Stuff 'in core' becomes locked in stone, forever,
>> because of backwards compatibility concerns, which are IMNSHO, a
>> bigger set of issues than even pg_upgrade related issues. Have we
>> gone through all the new hstore functions and made sure they don't
>> break existing applications? Putting things in core welds your only
>> escape hatch shut.
>>
>> *All* non-sql standard types ought to be in extensions in an ideal world.
>
> I have seen your opinion on this but there have been enough
> counter-arguments that I am not ready to head in that direction.

The only counter argument given is that this will prevent people from
being able to use extensions because they A: can't or won't install
contrib packages or B: are too stupid or lazy to type 'create
extension json'. Note I'm discussing 'in core extension vs in core
built in'. 'out of core extension' loosely translates to 'can't be
used by the vast majority of systems.

Most corporate IT departments (including mine) have a policy of only
installing packages through the operating system packaging to simplify
management of deploying updates. Really strict companies might not
even allow anything but packages supplied by a vendor like redhat
(which in practice keeps you some versions back from the latest).
Now, if some crappy hosting company blocks contrib I don't believe at
all that this should drive our project management decisions.

Postgresql is both a database and increasingly a development language
platform. Most good stacks have a system (cpan, npm, etgc) to manage
the scope of the installed runtime and it's a routine and expected
exercise to leverage that system.

merlin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:11:51
Message-ID: CA+TgmobAuCNG=iszd2k27Y02Z9r6uf5PeiE3Nv7cEi09iiOQmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 3, 2014 at 11:20 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Mon, Mar 3, 2014 at 6:59 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> Also, please recognize that the current implementation was what we
>> collectively decided on three months ago, and what Andrew worked pretty
>> hard to implement based on that collective decision. So if we're going
>> to change course, we need a specific reason to change course, not just
>> "it seems like a better idea now" or "I wasn't paying attention then".
>
> I'm pretty sure it doesn't work like that. But if it does, what
> exactly am I insisting on that is inconsistent with that consensus? In
> what way are we changing course? I think I'm being eminently flexible.
> I don't want a jsonb type that is broken, as for example by not having
> a default B-Tree operator class. Why don't you let me get on with it?

An excellent question. This thread has become mostly about whether
someone (like, say, me, or in this case Peter) is attempting to pull
the rug out from under a previously-agreed consensus path forward.
But despite my asking, nobody's been able to provide a pointer to any
previous discussion of the points under debate. That's because the
points that are *actually* being debated here were not discussed
previously. I recognize that Josh and Andrew would like to make that
the fault of the people who are now raising objections, but it doesn't
work like that. The fact that people were and are *generally* in
favor of jsonb and hstore doesn't mean they have to like the way that
the patches have turned out.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:16:01
Message-ID: 3576.1394036161@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> It seems only pg_type.oid is an issue for hstore. We can easily modify
> pg_dump --binary-upgrade mode to suppress the creation of the hstore
> extension. That should allow user hstore columns to automatically map
> to the new constant hstore oid. We can also modify pg_upgrade to scan
> all the user tables for any use of hstore arrays and perhaps composite
> types and tell the user they have to drop and upgrade those table
> separately.

Yeah, and that doesn't seem terribly acceptable. Unless you think the
field usage of hstore[] is nil; which maybe it is, I'm not sure what
the usage patterns are like. In general it would not be acceptable
at all to not be able to support migrations of array columns.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:19:30
Message-ID: 20140305161930.GH27273@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-03-05 10:10:23 -0600, Merlin Moncure wrote:
> On Wed, Mar 5, 2014 at 9:52 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > On Wed, Mar 5, 2014 at 09:19:33AM -0600, Merlin Moncure wrote:
> >> *All* non-sql standard types ought to be in extensions in an ideal world.
> >
> > I have seen your opinion on this but there have been enough
> > counter-arguments that I am not ready to head in that direction.
>
> The only counter argument given is that this will prevent people from
> being able to use extensions because they A: can't or won't install
> contrib packages or B: are too stupid or lazy to type 'create
> extension json'. Note I'm discussing 'in core extension vs in core
> built in'. 'out of core extension' loosely translates to 'can't be
> used by the vast majority of systems.

There's the absolutely significant issue that you cannot reasonably
write extensions that interact on a C level. You can't call from
extension to extension directly, but you can from extension to pg core
provided ones.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:24:21
Message-ID: 3633.1394036661@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> *All* non-sql standard types ought to be in extensions in an ideal world.

While there's certainly much to be said for the idea that jsonb should be
an extension, I don't think we have the technology to package it as a
*separate* extension; it'd have to be included in the hstore extension.
Which is weird, and quite a mixed message from the marketing standpoint.
If I understand Josh's vision of the future, he's expecting that hstore
will gradually die off in favor of jsonb, so we don't really want to
present the latter as the ugly stepchild.

Just out of curiosity, exactly what features are missing from jsonb
today that are available with hstore? How long would it take to
copy-and-paste all that code, if someone were to decide to do the
work instead of argue about it?

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:24:27
Message-ID: CAHyXU0y6fTP+pwNxsjDE+iDVL7w8rh-OsN7qbGunPFpSp3Wu4Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 10:19 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> There's the absolutely significant issue that you cannot reasonably
> write extensions that interact on a C level. You can't call from
> extension to extension directly, but you can from extension to pg core
> provided ones.

Certainly. Note I never said that the internal .so can't be in core
that both extensions interface with and perhaps wrap. It would be
nice to have a intra-extension call system worked out but that in no
way plays to the bigger issues at stake. This is all about management
of the public API; take a good skeptical look at the history of types
like xml, json, geo, money and others.

merlin


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:28:00
Message-ID: 20140305162800.GT12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
> *All* non-sql standard types ought to be in extensions in an ideal world.

While I appreciate that you'd like to see it that way, others don't
agree (I certainly don't), and that ship sailed quite a long time ago
regardless. I'm not advocating putting everything into core, but I
agreed with having json in core and further feel jsonb should be there
also. I'm not against having hstore either- and I *wish* we'd put ip4r
in and replace our existing inet types with it.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:28:25
Message-ID: CA+TgmoaF=6i8SsH3CjvEwRqA5hhjW0Bk_1fp1roTWek-BJccSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:07 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> And despite the assertions from various people here that these
>> decisions were all made a long time ago and it's way too late to
>> question them, I don't buy it. There's not a single email on this
>> mailing list clearly laying out the design that we've ended up with,
>> and I'm willing to wager any reasonable amount of money that if
>> someone had posted an email saying "hey, we're thinking about setting
>> things up so that jsonb and hstore have the same binary format, but
>> you can't index jsonb directly, you have to cast it to hstore, is
>> everyone OK with that?" someone would have written back and said "no,
>> that sounds nuts". The reason why input on that particular aspect of
>> the design was not forthcoming isn't because everyone was OK with it;
>> it's because it was never clearly spelled out.
>
> No, that was never the design (I trust). It's where we are today
> because time ran out to complete jsonb for 9.4, and tossing the index
> opclasses overboard was one of the last-minute compromises in order
> to have something submittable.

Well, what I was told when I started objecting to the current state of
affairs is that it was too late to "change course" now, which seemed
to me to imply that this was the idea all along. On the other hand,
Josh also said that there was a plan in the works to ship the missing
opclasses on PGXN before 9.4 hits shelves, which is more along the
lines of what you're saying. So, hey, I don't know.

> I think it would be a completely defensible decision to postpone jsonb
> to 9.5 on the grounds that it's not done enough. Now, Josh has laid out
> arguments why we want jsonb in 9.4 even if it's incomplete. But ISTM
> that those are fundamentally marketing arguments; on a purely technical
> basis I think the decision would be to postpone. So it comes down
> to how you weight marketing vs technical issues, which is something
> that everyone is likely to see a little bit differently :-(

I don't have any problem shipping incremental progress on important
features, but once we ship things that are visible at the SQL level
they get awfully hard to change, and my confidence that we won't want
to change this is not very high right now. To the extent that we have
a jsonb that is missing some features we will eventually want to have,
I don't care; that's incremental development at its finest. To the
extent that we have a jsonb that makes choices about what to store on
disk or expose at the SQL level that we may regret later, that's not
incremental development; that's just not being done. Anyone who
thinks that digging ourselves out of a backward-compatibility hole
will be painless enough to justify the marketing value of the feature
has most probably not had to do it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:30:06
Message-ID: CA+TgmobhJ8a0Fp6nEw_4xziwnMcx+QMq7k4hA1DoGx9RFZ1J9g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>> *All* non-sql standard types ought to be in extensions in an ideal world.
>
> While there's certainly much to be said for the idea that jsonb should be
> an extension, I don't think we have the technology to package it as a
> *separate* extension; it'd have to be included in the hstore extension.
> Which is weird, and quite a mixed message from the marketing standpoint.
> If I understand Josh's vision of the future, he's expecting that hstore
> will gradually die off in favor of jsonb, so we don't really want to
> present the latter as the ugly stepchild.
>
> Just out of curiosity, exactly what features are missing from jsonb
> today that are available with hstore? How long would it take to
> copy-and-paste all that code, if someone were to decide to do the
> work instead of argue about it?

I believe the main thing is the opclasses.

My information might be incomplete.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:34:10
Message-ID: 20140305163410.GU12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Just out of curiosity, exactly what features are missing from jsonb
> today that are available with hstore? How long would it take to
> copy-and-paste all that code, if someone were to decide to do the
> work instead of argue about it?

Somewhere upthread, Peter seemed to estimate it at a day, if I
understood correctly. If that's accurate, I'm certainly behind getting
it done and in and moving on. I'm sure no one particularly likes a
bunch of copy/pasteing of code, but if it would get us to the point of
having a really working jsonb that everyone is happy with, I'm all for
it.

It's not clear how much different it would be if we waited til 9.5
either- do we anticipate a lot of code changes beyond the copy/paste for
these?

Thanks,

Stephen


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:38:47
Message-ID: CAHyXU0wPLyxywG7J38dGVA=7xPj=+D8kXJmZ=XR7OHEGjN6m_g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 10:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>> *All* non-sql standard types ought to be in extensions in an ideal world.
>
> While there's certainly much to be said for the idea that jsonb should be
> an extension, I don't think we have the technology to package it as a
> *separate* extension; it'd have to be included in the hstore extension.

I disagree with that. The shared C bits can live inside the core
system and the SQL level hooks and extension specific behaviors could
live in an extension. AFAICT moving jsonb to extension is mostly a
function of migrating the hard coded SQL hooks out to an extension
(I'm probably oversimplifying though).

> Just out of curiosity, exactly what features are missing from jsonb
> today that are available with hstore? How long would it take to
> copy-and-paste all that code, if someone were to decide to do the
> work instead of argue about it?

Basically opclasses, operators (particularly search operators) and
functions/operators to manipulate the hstore in place. Personally I'm
not inclined to copy/paste the code. I'd also like to see this stuff
committed, and don't want to hold up the patch for that unless it's
determined for other reasons (and by other people) this is the only
reasonable path for 9.4.

merlin


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:43:31
Message-ID: 20140305164331.GV12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
> On Wed, Mar 5, 2014 at 10:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> >>>> *All* non-sql standard types ought to be in extensions in an ideal world.
> >
> > While there's certainly much to be said for the idea that jsonb should be
> > an extension, I don't think we have the technology to package it as a
> > *separate* extension; it'd have to be included in the hstore extension.
>
> I disagree with that. The shared C bits can live inside the core
> system and the SQL level hooks and extension specific behaviors could
> live in an extension. AFAICT moving jsonb to extension is mostly a
> function of migrating the hard coded SQL hooks out to an extension
> (I'm probably oversimplifying though).

Yeah, from what I gather you're suggesting, that's more-or-less "move it
all to core", except that all of the actual interface bits end up in an
extension that has to be installed to use what would have to already be
there. I don't see that as any kind of improvement.

Thanks,

Stephen


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:44:38
Message-ID: 20140305164438.GG28321@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:16:01AM -0500, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > It seems only pg_type.oid is an issue for hstore. We can easily modify
> > pg_dump --binary-upgrade mode to suppress the creation of the hstore
> > extension. That should allow user hstore columns to automatically map
> > to the new constant hstore oid. We can also modify pg_upgrade to scan
> > all the user tables for any use of hstore arrays and perhaps composite
> > types and tell the user they have to drop and upgrade those table
> > separately.
>
> Yeah, and that doesn't seem terribly acceptable. Unless you think the
> field usage of hstore[] is nil; which maybe it is, I'm not sure what
> the usage patterns are like. In general it would not be acceptable
> at all to not be able to support migrations of array columns.

It would prevent migration of _hstore_ array columns, which might be
acceptable. If we say pg_upgrade can never decline an upgrade, we
basically limit changes and increase the odds of needing a total
pg_upgrade-breaking release someday to re-adjust everything.

I basically think that a split between contrib and core for the
internally same data type just isn't sustainable.

Another conern is that it doesn't seem we are sure if we want JSONB in
core or contrib, at least based on some comments, so we should probably
decide that now, as I don't think the decision is going to be any easier
in the future. And as discussed above, moving something from contrib to
core has its own complexities.

I think we also have to break out how much of the feeling that JSONB is
not ready is because of problems with the core/contrib split, and how
much of it is because of the type itself. I am suggesting that
core/contrib split problems are not symptomatic of data type problems,
and if address/address the core/contrib split issue, the data type might
be just fine.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:49:11
Message-ID: 53175587.7060206@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/05/2014 11:34 AM, Stephen Frost wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>> Just out of curiosity, exactly what features are missing from jsonb
>> today that are available with hstore? How long would it take to
>> copy-and-paste all that code, if someone were to decide to do the
>> work instead of argue about it?
> Somewhere upthread, Peter seemed to estimate it at a day, if I
> understood correctly. If that's accurate, I'm certainly behind getting
> it done and in and moving on. I'm sure no one particularly likes a
> bunch of copy/pasteing of code, but if it would get us to the point of
> having a really working jsonb that everyone is happy with, I'm all for
> it.
>
> It's not clear how much different it would be if we waited til 9.5
> either- do we anticipate a lot of code changes beyond the copy/paste for
> these?
>
>

I think that was my estimate, but Peter did offer to do it. He certainly
asserted that the effort required would not be great. I'm all for taking
up his offer.

Incidentally, this would probably have been done quite weeks ago if
people had not objected to my doing any more on the feature. Of course
missing the GIN/GIST ops was not part of the design. Quite the contrary.

cheers

andrew


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:50:23
Message-ID: 20140305165023.GH28321@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:34:10AM -0500, Stephen Frost wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > Just out of curiosity, exactly what features are missing from jsonb
> > today that are available with hstore? How long would it take to
> > copy-and-paste all that code, if someone were to decide to do the
> > work instead of argue about it?
>
> Somewhere upthread, Peter seemed to estimate it at a day, if I
> understood correctly. If that's accurate, I'm certainly behind getting
> it done and in and moving on. I'm sure no one particularly likes a
> bunch of copy/pasteing of code, but if it would get us to the point of
> having a really working jsonb that everyone is happy with, I'm all for
> it.
>
> It's not clear how much different it would be if we waited til 9.5
> either- do we anticipate a lot of code changes beyond the copy/paste for
> these?

What _would_ be interesting is to move all the hstore code into core,
and have hstore contrib just call the hstore core parts. That way, you
have one copy of the code, it is shared with JSONB, but hstore remains
as an extension that you can change or remove later.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:53:28
Message-ID: 20140305165328.GI28321@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:11:51AM -0500, Robert Haas wrote:
> An excellent question. This thread has become mostly about whether
> someone (like, say, me, or in this case Peter) is attempting to pull
> the rug out from under a previously-agreed consensus path forward.
> But despite my asking, nobody's been able to provide a pointer to any
> previous discussion of the points under debate. That's because the
> points that are *actually* being debated here were not discussed
> previously. I recognize that Josh and Andrew would like to make that
> the fault of the people who are now raising objections, but it doesn't
> work like that. The fact that people were and are *generally* in
> favor of jsonb and hstore doesn't mean they have to like the way that
> the patches have turned out.

I am assuming much of this was discussed verbally, and many of us were
not present.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:53:31
Message-ID: 5317568B.90807@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/05/2014 11:44 AM, Bruce Momjian wrote:
> On Wed, Mar 5, 2014 at 11:16:01AM -0500, Tom Lane wrote:
>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>> It seems only pg_type.oid is an issue for hstore. We can easily modify
>>> pg_dump --binary-upgrade mode to suppress the creation of the hstore
>>> extension. That should allow user hstore columns to automatically map
>>> to the new constant hstore oid. We can also modify pg_upgrade to scan
>>> all the user tables for any use of hstore arrays and perhaps composite
>>> types and tell the user they have to drop and upgrade those table
>>> separately.
>> Yeah, and that doesn't seem terribly acceptable. Unless you think the
>> field usage of hstore[] is nil; which maybe it is, I'm not sure what
>> the usage patterns are like. In general it would not be acceptable
>> at all to not be able to support migrations of array columns.
> It would prevent migration of _hstore_ array columns, which might be
> acceptable. If we say pg_upgrade can never decline an upgrade, we
> basically limit changes and increase the odds of needing a total
> pg_upgrade-breaking release someday to re-adjust everything.
>
> I basically think that a split between contrib and core for the
> internally same data type just isn't sustainable.
>
> Another conern is that it doesn't seem we are sure if we want JSONB in
> core or contrib, at least based on some comments, so we should probably
> decide that now, as I don't think the decision is going to be any easier
> in the future. And as discussed above, moving something from contrib to
> core has its own complexities.
>
> I think we also have to break out how much of the feeling that JSONB is
> not ready is because of problems with the core/contrib split, and how
> much of it is because of the type itself. I am suggesting that
> core/contrib split problems are not symptomatic of data type problems,
> and if address/address the core/contrib split issue, the data type might
> be just fine.
>

Splitting out jsonb to an extension is going to be moderately painful.
The json and jsonb functions share some code that's not exposed (and
probably shouldn't be). It's not likely to be less painful than
implementing the hstore GIN/GIST ops for jsonb, I suspect the reverse.

cheers

andrew


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 16:56:25
Message-ID: CAHyXU0wBJJ9hg=76bFn1JbHP0806gOGaFu9NFw4QnhtWyeiCMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 10:43 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
>> On Wed, Mar 5, 2014 at 10:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> >>>> *All* non-sql standard types ought to be in extensions in an ideal world.
>> >
>> > While there's certainly much to be said for the idea that jsonb should be
>> > an extension, I don't think we have the technology to package it as a
>> > *separate* extension; it'd have to be included in the hstore extension.
>>
>> I disagree with that. The shared C bits can live inside the core
>> system and the SQL level hooks and extension specific behaviors could
>> live in an extension. AFAICT moving jsonb to extension is mostly a
>> function of migrating the hard coded SQL hooks out to an extension
>> (I'm probably oversimplifying though).
>
> Yeah, from what I gather you're suggesting, that's more-or-less "move it
> all to core", except that all of the actual interface bits end up in an
> extension that has to be installed to use what would have to already be
> there. I don't see that as any kind of improvement.

If you don't then you simply have not been paying attention to the
endless backwards compatibility problems we've faced which are highly
ameliorated in an extension heavy world. Also, you're ignoring the
fact that having an endlessly accreting set of symbols in the public
namespace is not free. Internal C libraries don't have to be
supported and don't have any signficant user facing costs by simply
being there.

merlin


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:01:18
Message-ID: 20140305170118.GJ28321@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:53:31AM -0500, Andrew Dunstan wrote:
> >I think we also have to break out how much of the feeling that JSONB is
> >not ready is because of problems with the core/contrib split, and how
> >much of it is because of the type itself. I am suggesting that
> >core/contrib split problems are not symptomatic of data type problems,
> >and if address/address the core/contrib split issue, the data type might
> >be just fine.
> >
>
>
> Splitting out jsonb to an extension is going to be moderately
> painful. The json and jsonb functions share some code that's not
> exposed (and probably shouldn't be). It's not likely to be less
> painful than implementing the hstore GIN/GIST ops for jsonb, I
> suspect the reverse.

OK, that's good information. So we have JSONB which ties to a core
type, JSON, _and_ to a contrib module, hstore. No wonder it is so
complex.

I am warming up to the idea of moving hstore internals into core,
sharing that with JSONB, and having contrib/hstore just call the core
functions when defining its data type.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:01:55
Message-ID: 21AAE130-A3C1-4430-BA00-27427EFD168A@justatheory.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mar 5, 2014, at 8:49 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

> I think that was my estimate, but Peter did offer to do it. He certainly asserted that the effort required would not be great. I'm all for taking up his offer.

+1 to this. Can you and Peter collaborate somehow to get it knocked out?

> Incidentally, this would probably have been done quite weeks ago if people had not objected to my doing any more on the feature. Of course missing the GIN/GIST ops was not part of the design. Quite the contrary.

That was my understanding, as well.

Best,

David


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:19:16
Message-ID: 15394.1394039956@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Wed, Mar 5, 2014 at 10:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> While there's certainly much to be said for the idea that jsonb should be
>> an extension, I don't think we have the technology to package it as a
>> *separate* extension; it'd have to be included in the hstore extension.

> I disagree with that. The shared C bits can live inside the core
> system and the SQL level hooks and extension specific behaviors could
> live in an extension.

That approach abandons every bit of value in an extension, no?
You certainly don't get to fix bugs outside a core-system release cycle.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:26:13
Message-ID: CA+Tgmoae_j0=EKHjTZ=8B=zrW2uuacRy_jyDe-7mw7TqwvXeqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:50 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Wed, Mar 5, 2014 at 11:34:10AM -0500, Stephen Frost wrote:
>> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>> > Just out of curiosity, exactly what features are missing from jsonb
>> > today that are available with hstore? How long would it take to
>> > copy-and-paste all that code, if someone were to decide to do the
>> > work instead of argue about it?
>>
>> Somewhere upthread, Peter seemed to estimate it at a day, if I
>> understood correctly. If that's accurate, I'm certainly behind getting
>> it done and in and moving on. I'm sure no one particularly likes a
>> bunch of copy/pasteing of code, but if it would get us to the point of
>> having a really working jsonb that everyone is happy with, I'm all for
>> it.
>>
>> It's not clear how much different it would be if we waited til 9.5
>> either- do we anticipate a lot of code changes beyond the copy/paste for
>> these?
>
> What _would_ be interesting is to move all the hstore code into core,
> and have hstore contrib just call the hstore core parts. That way, you
> have one copy of the code, it is shared with JSONB, but hstore remains
> as an extension that you can change or remove later.

That seems like an approach possibly worth investigating. It's not
too different from what we did when we moved text search into core.
The basic idea seems to be that we want jsonb in core, and we expect
it to replace hstore, but we can't get just get rid of hstore because
it has too many users.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:29:49
Message-ID: 20140305172949.GA15259@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 12:26:13PM -0500, Robert Haas wrote:
> >> It's not clear how much different it would be if we waited til 9.5
> >> either- do we anticipate a lot of code changes beyond the copy/paste for
> >> these?
> >
> > What _would_ be interesting is to move all the hstore code into core,
> > and have hstore contrib just call the hstore core parts. That way, you
> > have one copy of the code, it is shared with JSONB, but hstore remains
> > as an extension that you can change or remove later.
>
> That seems like an approach possibly worth investigating. It's not
> too different from what we did when we moved text search into core.
> The basic idea seems to be that we want jsonb in core, and we expect
> it to replace hstore, but we can't get just get rid of hstore because
> it has too many users.

Yes. It eliminates the problem of code duplication, but keeps hstore in
contrib for flexibility and compatibility.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:30:27
Message-ID: 20140305173026.GW12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Mar 5, 2014 at 11:50 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > What _would_ be interesting is to move all the hstore code into core,
> > and have hstore contrib just call the hstore core parts. That way, you
> > have one copy of the code, it is shared with JSONB, but hstore remains
> > as an extension that you can change or remove later.
>
> That seems like an approach possibly worth investigating. It's not
> too different from what we did when we moved text search into core.
> The basic idea seems to be that we want jsonb in core, and we expect
> it to replace hstore, but we can't get just get rid of hstore because
> it has too many users.

This might be valuable for hstore, specifically, because we can't easily
move it into core. I'm fine with that- the disagreement I have is with
the more general idea that everything not-defined-by-committee should be
in shim extensions which just provide basically the catalog entries for
types which are otherwise all in core.

Thanks,

Stephen


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:36:24
Message-ID: CAHyXU0wa7eSuERcyq=54Ypxjnpym13syhgq3Rs0YRtkMuhWCiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:19 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Wed, Mar 5, 2014 at 10:24 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> While there's certainly much to be said for the idea that jsonb should be
>>> an extension, I don't think we have the technology to package it as a
>>> *separate* extension; it'd have to be included in the hstore extension.
>
>> I disagree with that. The shared C bits can live inside the core
>> system and the SQL level hooks and extension specific behaviors could
>> live in an extension.
>
> That approach abandons every bit of value in an extension, no?
> You certainly don't get to fix bugs outside a core-system release cycle.

That's core vs non core debate. Just about everyone (including me)
wants json and hstore to live in core -- meaning packaged, shipped,
supported, and documented with the postgresql source code releases.
Only an elite set of broadly useful and popular extensions get that
honor of which json is most certainly one.

Moreover, you give up nothing except the debate/approval issues to get
your code in core. If you want to release off cycle, you can
certainly do that and enterprising users can simply install the
extension manually (or perhaps via pgxn) instead of via contrib.

BTW,This is yet another thing that becomes impossible if you don't
extension (on top of legacy/backwards compatibility issues and general
bloat which is IMNSHO already a pretty severe situation).

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:38:57
Message-ID: 53176131.6020205@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 03/05/2014 12:01 PM, Bruce Momjian wrote:
> On Wed, Mar 5, 2014 at 11:53:31AM -0500, Andrew Dunstan wrote:
>>> I think we also have to break out how much of the feeling that JSONB is
>>> not ready is because of problems with the core/contrib split, and how
>>> much of it is because of the type itself. I am suggesting that
>>> core/contrib split problems are not symptomatic of data type problems,
>>> and if address/address the core/contrib split issue, the data type might
>>> be just fine.
>>>
>>
>> Splitting out jsonb to an extension is going to be moderately
>> painful. The json and jsonb functions share some code that's not
>> exposed (and probably shouldn't be). It's not likely to be less
>> painful than implementing the hstore GIN/GIST ops for jsonb, I
>> suspect the reverse.
> OK, that's good information. So we have JSONB which ties to a core
> type, JSON, _and_ to a contrib module, hstore. No wonder it is so
> complex.

Well, "ties to" is a loose term. It's hstore in these patches that
depends on jsonb - necessarily since we can't have core code depend on
an extension.

> I am warming up to the idea of moving hstore internals into core,
> sharing that with JSONB, and having contrib/hstore just call the core
> functions when defining its data type.
>

Right, at least the parts they need in common. That's how I'd handle the
GIN/GIST ops, for example.

cheers

andrew


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 17:44:06
Message-ID: 20140305174406.GX12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
> On Wed, Mar 5, 2014 at 10:43 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Yeah, from what I gather you're suggesting, that's more-or-less "move it
> > all to core", except that all of the actual interface bits end up in an
> > extension that has to be installed to use what would have to already be
> > there. I don't see that as any kind of improvement.
>
> If you don't then you simply have not been paying attention to the
> endless backwards compatibility problems we've faced which are highly
> ameliorated in an extension heavy world.

We have backwards compatibility "problems" because we don't want to
*break* things for people. Moving things into extensions doesn't
magically fix that- if you break something in a backwards-incompatible
way then you're going to cause a lot of grief for people. Doing that to
everyone who uses hstore, just because it's an extension, doesn't make
it acceptable. On this thread we're already argueing about exactly that
issue and how to avoid breaking things for those users were we to move
hstore into core.

> Also, you're ignoring the
> fact that having an endlessly accreting set of symbols in the public
> namespace is not free. Internal C libraries don't have to be
> supported and don't have any signficant user facing costs by simply
> being there.

I *really* hate how extensions end up getting dumped into the "public"
schema and I'm not a big fan for having huge search_paths either. As I
mentioned earlier- I'm also not advocating that everything be put into
core. I don't follow what you mean by "Internal C libraries don't have
to be supported" because, clearly, anything released would have to be
supported and if the extension is calling into a C interface then we'd
have to support that interface for that extension *and anyone else who
uses it*. We don't get to say "oh, this C function can only be used by
extensions we bless." We already worry less about breaking backwards
compatibility for C-level functions across PG major versions, but
that's true for both in-core hooks and extensions.

Thanks,

Stephen


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 18:05:37
Message-ID: 53176771.1040508@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/05/2014 09:26 AM, Robert Haas wrote:
>> > What _would_ be interesting is to move all the hstore code into core,
>> > and have hstore contrib just call the hstore core parts. That way, you
>> > have one copy of the code, it is shared with JSONB, but hstore remains
>> > as an extension that you can change or remove later.
> That seems like an approach possibly worth investigating. It's not
> too different from what we did when we moved text search into core.
> The basic idea seems to be that we want jsonb in core, and we expect
> it to replace hstore, but we can't get just get rid of hstore because
> it has too many users

Yes, please! This was the original approach that we talked about and
everyone agreed to, and what Andrew has been trying to implement.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 18:59:37
Message-ID: CAM3SWZSQK0LoLg7uVYwhSx54PBcFd7dkeRLfDp5+JSy78EFd2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 8:30 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Just out of curiosity, exactly what features are missing from jsonb
>> today that are available with hstore? How long would it take to
>> copy-and-paste all that code, if someone were to decide to do the
>> work instead of argue about it?
>
> I believe the main thing is the opclasses.

Yes, that's right. A large volume of code currently proposed for
hstore2 is much less valuable than those operators sufficient to
implement the hstore2 opclasses. If you assume that hstore will become
a legacy extension that we won't add anything to (including everything
proposed in any patch posted to this thread), and jsonb will go in
core (which is of course more or less just hstore2 with a few json
extras), the amount of code redundantly shared between core and an
unchanged hstore turns out to not be that bad. I hope to have a
precise answer to just how bad soon.

--
Peter Geoghegan


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 19:05:28
Message-ID: 20140305190528.GB15259@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 10:59:37AM -0800, Peter Geoghegan wrote:
> On Wed, Mar 5, 2014 at 8:30 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> Just out of curiosity, exactly what features are missing from jsonb
> >> today that are available with hstore? How long would it take to
> >> copy-and-paste all that code, if someone were to decide to do the
> >> work instead of argue about it?
> >
> > I believe the main thing is the opclasses.
>
> Yes, that's right. A large volume of code currently proposed for
> hstore2 is much less valuable than those operators sufficient to
> implement the hstore2 opclasses. If you assume that hstore will become
> a legacy extension that we won't add anything to (including everything
> proposed in any patch posted to this thread), and jsonb will go in
> core (which is of course more or less just hstore2 with a few json
> extras), the amount of code redundantly shared between core and an
> unchanged hstore turns out to not be that bad. I hope to have a
> precise answer to just how bad soon.

Can you clarify what hstore2 is? It that the name of a type? Is that
hierarchical hstore with the same hstore name?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 19:10:52
Message-ID: 531776BC.9010702@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/05/2014 11:05 AM, Bruce Momjian wrote:
> Can you clarify what hstore2 is? It that the name of a type? Is that
> hierarchical hstore with the same hstore name?

hstore2 == nested heirarchical hstore. It's just a shorthand; there
won't be any actual type called "hstore2", by design. Unlike the json
users, the hstore users are going to get an automatic upgrade whether
they want it or not. Mind you, I can't see a reason NOT to want it ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 19:32:12
Message-ID: 20140305193212.GC15259@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 10:59:37AM -0800, Peter Geoghegan wrote:
> On Wed, Mar 5, 2014 at 8:30 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> Just out of curiosity, exactly what features are missing from jsonb
> >> today that are available with hstore? How long would it take to
> >> copy-and-paste all that code, if someone were to decide to do the
> >> work instead of argue about it?
> >
> > I believe the main thing is the opclasses.
>
> Yes, that's right. A large volume of code currently proposed for
> hstore2 is much less valuable than those operators sufficient to
> implement the hstore2 opclasses. If you assume that hstore will become
> a legacy extension that we won't add anything to (including everything
> proposed in any patch posted to this thread), and jsonb will go in
> core (which is of course more or less just hstore2 with a few json
> extras), the amount of code redundantly shared between core and an
> unchanged hstore turns out to not be that bad. I hope to have a
> precise answer to just how bad soon.

So, now knowing that hstore2 is just hierarchical hstore using the same
hstore type name, you are saying that we are keeping the
non-hierarchical code in contrib, and the rest goes into core --- that
makes sense, and from a code maintenance perspective, I like that the
non-hierarchical hstore code is not going in core.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 19:59:55
Message-ID: CAHyXU0zP=gGWg9Cv5BS2FC_Wk7rL-+0NwSjLgy2Ve5CEZmA7mQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:44 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
>> On Wed, Mar 5, 2014 at 10:43 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > Yeah, from what I gather you're suggesting, that's more-or-less "move it
>> > all to core", except that all of the actual interface bits end up in an
>> > extension that has to be installed to use what would have to already be
>> > there. I don't see that as any kind of improvement.
>>
>> If you don't then you simply have not been paying attention to the
>> endless backwards compatibility problems we've faced which are highly
>> ameliorated in an extension heavy world.
>
> We have backwards compatibility "problems" because we don't want to
> *break* things for people. Moving things into extensions doesn't
> magically fix that- if you break something in a backwards-incompatible
> way then you're going to cause a lot of grief for people.

It doesn't magically fix it, but at least provides a way forward. If
the function you want to modify is in an extension 'foo', you get to
put your new stuff in 'foo2' extension. That way your users do not
have to adjust all the code you would have broken. Perhaps for
in-core extensions you offer the old one in contrib for a while until
a reasonable amount of time passes then move it out to pgxn. This is
a vastly better system than the choices we have now, which is A. break
code or B. do nothing.

>> Also, you're ignoring the
>> fact that having an endlessly accreting set of symbols in the public
>> namespace is not free. Internal C libraries don't have to be
>> supported and don't have any signficant user facing costs by simply
>> being there.
>
> I *really* hate how extensions end up getting dumped into the "public"
> schema and I'm not a big fan for having huge search_paths either.

At least with extensions you have control over this.

> mentioned earlier- I'm also not advocating that everything be put into
> core. I don't follow what you mean by "Internal C libraries don't have
> to be supported" because,

I mean, we are free to change them or delete them. They do not come
with the legacy that user facing API comes. They also do not bloat
the public namespace.

merlin


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 20:45:18
Message-ID: 20140305204518.GV4759@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure escribió:
> On Wed, Mar 5, 2014 at 11:44 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> > We have backwards compatibility "problems" because we don't want to
> > *break* things for people. Moving things into extensions doesn't
> > magically fix that- if you break something in a backwards-incompatible
> > way then you're going to cause a lot of grief for people.
>
> It doesn't magically fix it, but at least provides a way forward. If
> the function you want to modify is in an extension 'foo', you get to
> put your new stuff in 'foo2' extension. That way your users do not
> have to adjust all the code you would have broken. Perhaps for
> in-core extensions you offer the old one in contrib for a while until
> a reasonable amount of time passes then move it out to pgxn.

Uhm. Would it work to define a new version of foo, say 2.0, but keep
the old 1.2 version the default? That way, if you want to keep the old
foo you do nothing (after both fresh install and pg_upgrade), and if you
want to upgrade to the new code, it's just an ALTER EXTENSION UPDATE
away.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 20:46:43
Message-ID: 20140305204643.GC12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
> On Wed, Mar 5, 2014 at 11:44 AM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > We have backwards compatibility "problems" because we don't want to
> > *break* things for people. Moving things into extensions doesn't
> > magically fix that- if you break something in a backwards-incompatible
> > way then you're going to cause a lot of grief for people.
>
> It doesn't magically fix it, but at least provides a way forward. If
> the function you want to modify is in an extension 'foo', you get to
> put your new stuff in 'foo2' extension. That way your users do not
> have to adjust all the code you would have broken. Perhaps for
> in-core extensions you offer the old one in contrib for a while until
> a reasonable amount of time passes then move it out to pgxn. This is
> a vastly better system than the choices we have now, which is A. break
> code or B. do nothing.

I don't see why we can't do exactly what you're suggesting in core.
This whole thread is about doing exactly that, in fact, which is why
we're talking about 'jsonb' instead of just 'json'. I agree that we
don't push too hard to remove things from core, but it's not like we've
had a whole ton of success kicking things out of -contrib either.

> > I *really* hate how extensions end up getting dumped into the "public"
> > schema and I'm not a big fan for having huge search_paths either.
>
> At least with extensions you have control over this.

Yeah, but I much prefer how things end up in pg_catalog rather than
public or individual schemas.

> > mentioned earlier- I'm also not advocating that everything be put into
> > core. I don't follow what you mean by "Internal C libraries don't have
> > to be supported" because,
>
> I mean, we are free to change them or delete them. They do not come
> with the legacy that user facing API comes. They also do not bloat
> the public namespace.

But we actually *aren't* free to change or delete them- which is what I
was getting at. Certainly, in back-branches we regularly worry about
breaking things for users of C functions, and there is some
consideration for them even in major version changes.

Thanks,

Stephen


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 22:14:41
Message-ID: CAHyXU0xof6czcm78H-8HZ5X_6fycC8B3P3vJSVBX=h0zTV09Gg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 2:45 PM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Merlin Moncure escribió:
>> It doesn't magically fix it, but at least provides a way forward. If
>> the function you want to modify is in an extension 'foo', you get to
>> put your new stuff in 'foo2' extension. That way your users do not
>> have to adjust all the code you would have broken. Perhaps for
>> in-core extensions you offer the old one in contrib for a while until
>> a reasonable amount of time passes then move it out to pgxn.
>
> Uhm. Would it work to define a new version of foo, say 2.0, but keep
> the old 1.2 version the default? That way, if you want to keep the old
> foo you do nothing (after both fresh install and pg_upgrade), and if you
> want to upgrade to the new code, it's just an ALTER EXTENSION UPDATE
> away.

Certainly. The important point though is that neither option is
available if the old stuff is locked into the public namespace.
Consider various warts like the array ('array_upper' et al) API or geo
types. We're stuck with them. Even with jsonb: it may be the hot new
thing *today* but 5 years down the line there's json2 that does all
kinds of wonderful things we haven't thought about -- what if it
displaces current usages? The very same people who are arguing that
jsonb should not be in an extension are the ones arguing json is
legacy and to be superseded. These two points of view IMO are
directly in conflict: if json would have been an extension than the
path to deprecation is clear. Now the json functions are in the
public namespace. Forever (or at least for a very long time).

On Wed, Mar 5, 2014 at 2:46 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> I don't see why we can't do exactly what you're suggesting in core.

Because you can't (if you're defining core to mean 'not an
extension'). Functions can't be removed or changed because of legacy
application support. In an extension world, they can -- albeit not
'magically', but at least it can be done.

merlin


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 22:24:32
Message-ID: 20140305222432.GG12995@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
> On Wed, Mar 5, 2014 at 2:46 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I don't see why we can't do exactly what you're suggesting in core.
>
> Because you can't (if you're defining core to mean 'not an
> extension'). Functions can't be removed or changed because of legacy
> application support. In an extension world, they can -- albeit not
> 'magically', but at least it can be done.

That simply isn't accurate on either level- if there is concern about
application support, that can apply equally to core and contrib, and we
certainly *can* remove and/or redefine functions in core with sufficient
cause. It's just not something we do lightly for things living in
either core or contrib.

For an example, consider the FDW API, particularly what we did between
9.1 and 9.2.

Thanks,

Stephen


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-05 22:46:51
Message-ID: CAHyXU0x-do1uvJ0yFa=5fyB_k5HC1BdEsKS4yFS=TPZtbQghGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 4:24 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Merlin Moncure (mmoncure(at)gmail(dot)com) wrote:
>> On Wed, Mar 5, 2014 at 2:46 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > I don't see why we can't do exactly what you're suggesting in core.
>>
>> Because you can't (if you're defining core to mean 'not an
>> extension'). Functions can't be removed or changed because of legacy
>> application support. In an extension world, they can -- albeit not
>> 'magically', but at least it can be done.
>
> That simply isn't accurate on either level- if there is concern about
> application support, that can apply equally to core and contrib, and we
> certainly *can* remove and/or redefine functions in core with sufficient
> cause. It's just not something we do lightly for things living in
> either core or contrib.
>
> For an example, consider the FDW API, particularly what we did between
> 9.1 and 9.2.

Well, we'll have to agree to disagree I suppose. Getting back on
topic, the question is 'what about jsonb/hstore2'? At this point my
interests are practical. I promised (heh) to bone up the docs. I'm on
vacation this weekend so it's looking like around sometime late next
week for that. In particular, it'd be helpful to get some kind of
read on the final disposition of hstore2.

merlin


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonb and nested hstore
Date: 2014-03-06 05:07:00
Message-ID: CAM3SWZSewg++hhk9f9H4trVeddvxeB5fmaJ1acEh9U-1w-7FHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 5, 2014 at 11:32 AM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> So, now knowing that hstore2 is just hierarchical hstore using the same
> hstore type name, you are saying that we are keeping the
> non-hierarchical code in contrib, and the rest goes into core --- that
> makes sense, and from a code maintenance perspective, I like that the
> non-hierarchical hstore code is not going in core.

Yeah.

It's hard to justify having a user-facing hstore2 on the grounds of
backwards compatibility, and giving those stuck on hstore the benefit
of all of these new capabilities. That's because we *cannot* really
preserve compatibility, AFAICT. Many of the lines of the patch
submitted are due to changes in the output format of hstore, and the
need to update the hstore tests' expected results to reflect these
changes. For example:

*************** select slice(hstore 'aa=>1, b=>2, c=>3',
*** 759,779 ****
(1 row)

select slice(hstore 'aa=>1, b=>2, c=>3', ARRAY['c','b']);
! slice
! --------------------
! "b"=>"2", "c"=>"3"