Calling json_* functions with JSONB data

Lists: pgsql-hackers
From: Peter van Hardenberg <pvh(at)pvh(dot)ca>
To: "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Calling json_* functions with JSONB data
Date: 2016-05-23 16:55:16
Message-ID: CABTbUpgQ3b-JTNUaLPLGL3aw4V9LtwFnbLuHhA9-BTVyGcggNQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi there,

I noticed it was very easy to accidentally call the json_* form of JSON
manipulation functions with jsonb data as input. This is pretty
sub-optimal, since it involves rendering the jsonb then reparsing it and
calling the json_* form of the function.

Fortunately, this seems quite easy to resolve by taking advantage of our
ability to add json_*(jsonb) form of the functions.

I talked this over with Andrew who had no objections and suggested I float
it on the list before writing a patch. Looks pretty straightforward, just a
few new data rows in pg_proc.h.

Anyone have any concerns or suggestions?

-p

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Peter van Hardenberg <pvh(at)pvh(dot)ca>
Cc: "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 17:14:55
Message-ID: CAKFQuwZGOzxbc2pQh1WaaD9HQuDs0hHVUQwtiXoT=YmZWWZcRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, May 23, 2016 at 12:55 PM, Peter van Hardenberg <pvh(at)pvh(dot)ca> wrote:

> Hi there,
>
> I noticed it was very easy to accidentally call the json_* form of JSON
> manipulation functions with jsonb data as input. This is pretty
> sub-optimal, since it involves rendering the jsonb then reparsing it and
> calling the json_* form of the function.
>
> Fortunately, this seems quite easy to resolve by taking advantage of our
> ability to add json_*(jsonb) form of the functions.
>
> I talked this over with Andrew who had no objections and suggested I float
> it on the list before writing a patch. Looks pretty straightforward, just a
> few new data rows in pg_proc.h.
>
> Anyone have any concerns or suggestions?
>
>
Please provide an example of what you are talking about.

SELECT json_array_length('[1,2]'::jsonb)
ERROR: function json_array_length(jsonb) does not exist

-- The function name is "jsonb_array_length"; and there is no implicit cast
between the two.

David J.


From: Ryan Pedela <rpedela(at)datalanche(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Peter van Hardenberg <pvh(at)pvh(dot)ca>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 17:23:05
Message-ID: CACu89FSEGzH0+s=MpVvS70Tkbj2p2Q9Abiibc7-3mCM2M60Jkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, May 23, 2016 at 11:14 AM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Mon, May 23, 2016 at 12:55 PM, Peter van Hardenberg <pvh(at)pvh(dot)ca> wrote:
>
>> Hi there,
>>
>> I noticed it was very easy to accidentally call the json_* form of JSON
>> manipulation functions with jsonb data as input. This is pretty
>> sub-optimal, since it involves rendering the jsonb then reparsing it and
>> calling the json_* form of the function.
>>
>> Fortunately, this seems quite easy to resolve by taking advantage of our
>> ability to add json_*(jsonb) form of the functions.
>>
>> I talked this over with Andrew who had no objections and suggested I
>> float it on the list before writing a patch. Looks pretty straightforward,
>> just a few new data rows in pg_proc.h.
>>
>> Anyone have any concerns or suggestions?
>>
>>
> Please provide an example of what you are talking about.
>
> SELECT json_array_length('[1,2]'::jsonb)
> ERROR: function json_array_length(jsonb) does not exist
>
> -- The function name is "jsonb_array_length"; and there is no implicit
> cast between the two.
>

He is saying that he accidentally calls json_array_length() instead of
jsonb_array_length()
and that it is an annoying usability problem. It happens to me too and I
agree it would be better if you could just call json_array_length()
regardless if the type is JSON or JSONB. If there is some significant
functionality difference from the user's perspective then having separate
"json_" and "jsonb_" functions makes sense, but in your example there is
not.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter van Hardenberg <pvh(at)pvh(dot)ca>
Cc: "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 17:23:44
Message-ID: 733.1464024224@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter van Hardenberg <pvh(at)pvh(dot)ca> writes:
> I talked this over with Andrew who had no objections and suggested I float
> it on the list before writing a patch. Looks pretty straightforward, just a
> few new data rows in pg_proc.h.

I think you might find that you need to add new C function entry points to
keep opr_sanity.sql from complaining. That's still pretty easy though.

regards, tom lane


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Peter van Hardenberg <pvh(at)pvh(dot)ca>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 17:47:36
Message-ID: 5af0f0c1-7ffa-7324-d83d-d1529124daca@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2016-05-23 18:55, Peter van Hardenberg wrote:
> I talked this over with Andrew who had no objections and suggested I float
> it on the list before writing a patch. Looks pretty straightforward, just a
> few new data rows in pg_proc.h.
>
> Anyone have any concerns or suggestions?

What about cases like json_whatever($1) which previously worked but
will now be ambiguous? (Or will they somehow not be ambiguous?)

.m


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Peter van Hardenberg <pvh(at)pvh(dot)ca>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 18:29:57
Message-ID: 16463.1464028197@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Marko Tiikkaja <marko(at)joh(dot)to> writes:
> On 2016-05-23 18:55, Peter van Hardenberg wrote:
>> Anyone have any concerns or suggestions?

> What about cases like json_whatever($1) which previously worked but
> will now be ambiguous? (Or will they somehow not be ambiguous?)

Good point, that would have to be looked into.

regards, tom lane


From: Peter van Hardenberg <pvh(at)pvh(dot)ca>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 20:15:31
Message-ID: CABTbUpik8-x8Oq-GOk3NffY0pcRG3LfpKC9Hr1KYNDYfWzLv5g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Great question, Marko. If you can point me towards an example I'll take a
look, but I'll proceed with the current understanding and suggestions and
see what people have to say.

On Mon, May 23, 2016 at 10:47 AM, Marko Tiikkaja <marko(at)joh(dot)to> wrote:

> On 2016-05-23 18:55, Peter van Hardenberg wrote:
>
>> I talked this over with Andrew who had no objections and suggested I float
>> it on the list before writing a patch. Looks pretty straightforward, just
>> a
>> few new data rows in pg_proc.h.
>>
>> Anyone have any concerns or suggestions?
>>
>
> What about cases like json_whatever($1) which previously worked but will
> now be ambiguous? (Or will they somehow not be ambiguous?)
>
>
> .m
>

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter van Hardenberg <pvh(at)pvh(dot)ca>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 20:37:28
Message-ID: 22179.1464035848@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter van Hardenberg <pvh(at)pvh(dot)ca> writes:
> Great question, Marko. If you can point me towards an example I'll take a
> look, but I'll proceed with the current understanding and suggestions and
> see what people have to say.

I believe Marko's just complaining about the case for unknown-type
arguments, for example:

regression=# select json_array_length('[1,2,3]');
json_array_length
-------------------
3
(1 row)

The parser has no trouble resolving this because there is only one
json_array_length(); but if there were two, it would fail to make a
determination of which one you meant.

AFAICS the only way to fix that would be to introduce some preference
between the two types. For example, we could move both 'json' and 'jsonb'
into their own typcategory ('J' is unused...) and then mark 'jsonb' as
the preferred type in that category. This would require a fair amount of
experimentation to determine if it upsets any cases that work conveniently
today; but right offhand I don't see any fatal problems with such an idea.

regards, tom lane


From: Peter van Hardenberg <pvh(at)pvh(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 21:20:28
Message-ID: CABTbUpiZcbgYJU9xX39MH_1w6k6KX2f4ojX_+GkaxVWaYCM=Vw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'll look into it, thanks for the explanation.

On Mon, May 23, 2016 at 1:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Peter van Hardenberg <pvh(at)pvh(dot)ca> writes:
> > Great question, Marko. If you can point me towards an example I'll take a
> > look, but I'll proceed with the current understanding and suggestions and
> > see what people have to say.
>
> I believe Marko's just complaining about the case for unknown-type
> arguments, for example:
>
> regression=# select json_array_length('[1,2,3]');
> json_array_length
> -------------------
> 3
> (1 row)
>
> The parser has no trouble resolving this because there is only one
> json_array_length(); but if there were two, it would fail to make a
> determination of which one you meant.
>
> AFAICS the only way to fix that would be to introduce some preference
> between the two types. For example, we could move both 'json' and 'jsonb'
> into their own typcategory ('J' is unused...) and then mark 'jsonb' as
> the preferred type in that category. This would require a fair amount of
> experimentation to determine if it upsets any cases that work conveniently
> today; but right offhand I don't see any fatal problems with such an idea.
>
> regards, tom lane
>

--
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter van Hardenberg <pvh(at)pvh(dot)ca>, Marko Tiikkaja <marko(at)joh(dot)to>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 21:21:55
Message-ID: CAKFQuwbTk8xwWJ5SFTGoRnxwEyh4Y4TXvy6BSuk_WN+4ov_w4Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, May 23, 2016 at 4:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Peter van Hardenberg <pvh(at)pvh(dot)ca> writes:
> > Great question, Marko. If you can point me towards an example I'll take a
> > look, but I'll proceed with the current understanding and suggestions and
> > see what people have to say.
>
> I believe Marko's just complaining about the case for unknown-type
> arguments, for example:
>
> regression=# select json_array_length('[1,2,3]');
> json_array_length
> -------------------
> 3
> (1 row)
>
> The parser has no trouble resolving this because there is only one
> json_array_length(); but if there were two, it would fail to make a
> determination of which one you meant.
>
> AFAICS the only way to fix that would be to introduce some preference
> between the two types. For example, we could move both 'json' and 'jsonb'
> into their own typcategory ('J' is unused...) and then mark 'jsonb' as
> the preferred type in that category. This would require a fair amount of
> experimentation to determine if it upsets any cases that work conveniently
> today; but right offhand I don't see any fatal problems with such an idea.
>
> regards, tom lane
>



I
​ guess the relevant point in the documentation is the parenthetical
sentence:


(The processing functions consider the last value as the operative one.)

http://www.postgresql.org/docs/9.5/static/datatype-json.html

​Which normalizes the behaviors of jsonb and json as they pass through one
of these functions. Though only the multi-key is noted which means
white-space (immaterial) and key-order (potentially material) behaviors
differ; though the later coming through the function unscathed is not
something that user's should be relying upon. Specifically I'm thinking of
the behavior that "json_each(...)" would exhibit.

David J.


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Peter van Hardenberg <pvh(at)pvh(dot)ca>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 21:38:16
Message-ID: 4eaed4e8-6e23-ef85-ffd8-849f3b39bd3c@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 5/23/16 11:55 AM, Peter van Hardenberg wrote:
> Fortunately, this seems quite easy to resolve by taking advantage of our
> ability to add json_*(jsonb) form of the functions.

Another issue no one has mentioned is functions that return JSON/JSONB.
IMO those should NOT be overloaded, because that will make it very easy
to accidentally change from one type to the other without meaning to.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Peter van Hardenberg <pvh(at)pvh(dot)ca>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-23 21:45:08
Message-ID: CAKFQuwZazz+8YLueMV1p4+RzkaAmGyXyfV2xDjbdC8+hU6eEOQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, May 23, 2016 at 5:38 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:

> On 5/23/16 11:55 AM, Peter van Hardenberg wrote:
>
>> Fortunately, this seems quite easy to resolve by taking advantage of our
>> ability to add json_*(jsonb) form of the functions.
>>
>
> Another issue no one has mentioned is functions that return JSON/JSONB.
> IMO those should NOT be overloaded, because that will make it very easy to
> accidentally change from one type to the other without meaning to.

​Actually, by definition they cannot be overloaded. A function's signature
is derived from its input types only.

http://www.postgresql.org/docs/devel/static/sql-createfunction.html

​"""
​The name of the new function must not match any existing function with the
same input argument types in the same schema. However, functions of
different argument types can share a name (this is called overloading).
"""

Admittedly the absence of "output" is not emphasized but overloading in
(most?) languages (small sample size for personal knowledge) is subject to
the same restriction.

David J.


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Peter van Hardenberg <pvh(at)pvh(dot)ca>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Calling json_* functions with JSONB data
Date: 2016-05-24 14:09:41
Message-ID: 1f8a6917-2421-34df-402c-ee6ad4a6ed63@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 5/23/16 4:45 PM, David G. Johnston wrote:
> On Mon, May 23, 2016 at 5:38 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com
> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>>wrote:
>
> On 5/23/16 11:55 AM, Peter van Hardenberg wrote:
>
> Fortunately, this seems quite easy to resolve by taking
> advantage of our
> ability to add json_*(jsonb) form of the functions.
>
>
> Another issue no one has mentioned is functions that return
> JSON/JSONB. IMO those should NOT be overloaded, because that will
> make it very easy to accidentally change from one type to the other
> without meaning to.
>
>
> ​Actually, by definition they cannot be overloaded. A function's
> signature is derived from its input types only.

My point is that while it would be possible to create something like
json_strip_nulls(jsonb) to go with json_strip_nulls(json), that would be
a bad idea, because it makes it easily to accidentally turn your jsonb
into plain json.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461