Re: json generation enhancements

Lists: pgsql-hackers
From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: json generation enhancements
Date: 2013-01-11 16:00:20
Message-ID: 50F03714.9050601@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I have not had anyone follow up on this, so I have added docs and will
add this to the commitfest.

Recap:

This adds the following:

json_agg(anyrecord) -> json
to_json(any) -> json
hstore_to_json(hstore) -> json (also used as a cast)
hstore_to_json_loose(hstore) -> json

Also, in json generation, if any non-builtin type has a cast to json,
that function is used instead of the type's output function.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json generation enhancements
Date: 2013-01-11 16:03:56
Message-ID: 50F037EC.3000107@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 01/11/2013 11:00 AM, Andrew Dunstan wrote:
>
> I have not had anyone follow up on this, so I have added docs and will
> add this to the commitfest.
>
> Recap:
>
> This adds the following:
>
> json_agg(anyrecord) -> json
> to_json(any) -> json
> hstore_to_json(hstore) -> json (also used as a cast)
> hstore_to_json_loose(hstore) -> json
>
> Also, in json generation, if any non-builtin type has a cast to json,
> that function is used instead of the type's output function.
>
>

This time with a patch.

cheers

andrew

Attachment Content-Type Size
json_enhancements_part1-v3.patch text/x-patch 30.5 KB

From: Steve Singer <steve(at)ssinger(dot)info>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json generation enhancements
Date: 2013-02-24 06:09:20
Message-ID: BLU0-SMTP414B2E9CC000E9C5BD5B2ADCF20@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13-01-11 11:03 AM, Andrew Dunstan wrote:
>
> On 01/11/2013 11:00 AM, Andrew Dunstan wrote:
>>
>> I have not had anyone follow up on this, so I have added docs and
>> will add this to the commitfest.
>>
>> Recap:
>>
>> This adds the following:
>>
>> json_agg(anyrecord) -> json
>> to_json(any) -> json
>> hstore_to_json(hstore) -> json (also used as a cast)
>> hstore_to_json_loose(hstore) -> json
>>
>> Also, in json generation, if any non-builtin type has a cast to json,
>> that function is used instead of the type's output function.
>>
>>
>
> This time with a patch.
>

Here is a review of this patch.,

Overview
---------------------
This patch adds a set of functions to convert types to json. Specifically

* An aggregate that take the elements and builds up a json array.
* Conversions from an hstore to json

For non-builtin types the text conversion is used unless a cast has
specifically been defined from the type to json.

There was some discussion last year on this patch that raised three issues

a) Robert was concerned that if someone added a cast from a non-core
type like citext to json that it would change the behaviour of this
function. No one else offered an opinion on this at the time. I don't
see this as a problem, if I add a cast between citext (or any other
non-core datatype) to json I would expect it to effect how that datatype
is generated as a json object in any functions that generate json. I
don't see why this would be surprising behaviour. If I add a cast
between my datatype and json to generate a json representation that
differs from the textout representation then I would expect that
representation to be used in the json_agg function as well.

b) There was some talk in the json bikeshedding thread that json_agg()
should be renamed to collect_json() but more people preferred json_agg().

c) Should an aggregate of an empty result produce NULL or an empty json
element. Most people who commented on the thread seemed to feel that
NULL is preferred because it is consistent with other aggregates

I feel that the functionality of this patch is fine.

Testing
-------------------

When I try running the regression tests with this patch I get:
creating template1 database in
/usr/local/src/postgresql/src/test/regress/./tmp_check/data/base/1 ...
FATAL: could not create unique index "pg_proc_oid_index"
DETAIL: Key (oid)=(3171) is duplicated.
child process exited with exit code 1

oid 3170, 3171 and 3172 appears to be used by lo_tell64 and lo_lseek64

If I fix those up the regression tests pass.

I tried using the new functions in a few different ways and everything
worked as expected.

Code Review
-----------
in contrib/hstore/hstore_io.c
+ /* not an int - try a double */
+ double doubleres =
strtod(src->data,&endptr);
+ if (*endptr == '\0')
+ is_number = true;
+ else if (false)
+ {
+ /* shut the compiler up
about unused variables */
+ longres = (long) doubleres;
+ longres = longres / 2;

I dislike that we have to do this to avoid compiler warnings. I am also
worried the some other compiler might decide that the else if (false) is
worthy of a warning. Does anyone have cleaner way of getting rid of
the warning we get if we don't store the strtol/strtod result?

Should we do something like:

(void) ( strtol(src->data,&endptr,10)+1);

Other than that I don't see any issues.

Steve

> cheers
>
> andrew
>
>
>
>


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Steve Singer <steve(at)ssinger(dot)info>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json generation enhancements
Date: 2013-02-24 07:15:03
Message-ID: 5129BDF7.5020507@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/24/2013 02:09 PM, Steve Singer wrote:
> Here is a review of this patch.,
>
> Overview
> ---------------------
> This patch adds a set of functions to convert types to json. Specifically
>
> * An aggregate that take the elements and builds up a json array.
> * Conversions from an hstore to json
>
> For non-builtin types the text conversion is used unless a cast has
> specifically been defined from the type to json.
>
> There was some discussion last year on this patch that raised three
> issues
>
> a) Robert was concerned that if someone added a cast from a non-core
> type like citext to json that it would change the behaviour of this
> function. No one else offered an opinion on this at the time. I don't
> see this as a problem, if I add a cast between citext (or any other
> non-core datatype) to json I would expect it to effect how that
> datatype is generated as a json object in any functions that generate
> json. I don't see why this would be surprising behaviour. If I add
> a cast between my datatype and json to generate a json representation
> that differs from the textout representation then I would expect that
> representation to be used in the json_agg function as well.
I'm not thrilled about that, because we're likely to want to add more
JSON-specific casts to built-in or extension types in the future. If
doing so changes behaviour, causing something that used to work to
continue to work but produce a different result, that'll result in
considerable arguments about backward compatibility.

I'd be happier to require explicit casts to text or require the user to
explicitly CREATE CAST where no JSON-aware cast is already defined.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Steve Singer <steve(at)ssinger(dot)info>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json generation enhancements
Date: 2013-02-24 13:58:58
Message-ID: 512A1CA2.50009@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/24/2013 02:15 AM, Craig Ringer wrote:
> On 02/24/2013 02:09 PM, Steve Singer wrote:
>> Here is a review of this patch.,
>>
>> Overview
>> ---------------------
>> This patch adds a set of functions to convert types to json. Specifically
>>
>> * An aggregate that take the elements and builds up a json array.
>> * Conversions from an hstore to json
>>
>> For non-builtin types the text conversion is used unless a cast has
>> specifically been defined from the type to json.
>>
>> There was some discussion last year on this patch that raised three
>> issues
>>
>> a) Robert was concerned that if someone added a cast from a non-core
>> type like citext to json that it would change the behaviour of this
>> function. No one else offered an opinion on this at the time. I don't
>> see this as a problem, if I add a cast between citext (or any other
>> non-core datatype) to json I would expect it to effect how that
>> datatype is generated as a json object in any functions that generate
>> json. I don't see why this would be surprising behaviour. If I add
>> a cast between my datatype and json to generate a json representation
>> that differs from the textout representation then I would expect that
>> representation to be used in the json_agg function as well.
> I'm not thrilled about that, because we're likely to want to add more
> JSON-specific casts to built-in or extension types in the future. If
> doing so changes behaviour, causing something that used to work to
> continue to work but produce a different result, that'll result in
> considerable arguments about backward compatibility.
>
> I'd be happier to require explicit casts to text or require the user to
> explicitly CREATE CAST where no JSON-aware cast is already defined.
>

Adding a cast to json for a builtin type will have no effect unless you
also change this code. We can relax that but my view was that we should
know how to generate JSON from builtin types and just do it. But if we
wanted to allow someone to create a cast from, say, xml to json, and
have it take effect in json_agg then it might make sense to honor casts
for all types.

Your requirement of an explicit cast to text or json would result in a
class of type that could not be represented as json at all. I'm very
strongly opposed to that. If you proposed instead to prefer a cast to
text and only fall back on the type's default output format if one
doesn't exist I could live with that, although I strongly suspect it
will be mostly pointless.

cheers

andrew


From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Steve Singer <steve(at)ssinger(dot)info>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json generation enhancements
Date: 2013-02-24 23:33:53
Message-ID: 512AA361.1030904@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/24/2013 09:58 PM, Andrew Dunstan wrote:
>
> Adding a cast to json for a builtin type will have no effect unless
> you also change this code. We can relax that but my view was that we
> should know how to generate JSON from builtin types and just do it.
If json generation from built-in types is complete with full coverage of
all useful types when this first ships, then I'm happy. I was only
concerned about the BC argument "you can't add a proper json-aware cast
from interval to json because we've already been doing it via a cast to
text and changing the result would break existing code".

That's all I'm worried about, and if that won't be an issue then I'm
perfectly happy.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Steve Singer <steve(at)ssinger(dot)info>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json generation enhancements
Date: 2013-02-25 00:09:53
Message-ID: 512AABD1.6020304@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/24/2013 06:33 PM, Craig Ringer wrote:
> On 02/24/2013 09:58 PM, Andrew Dunstan wrote:
>> Adding a cast to json for a builtin type will have no effect unless
>> you also change this code. We can relax that but my view was that we
>> should know how to generate JSON from builtin types and just do it.
> If json generation from built-in types is complete with full coverage of
> all useful types when this first ships, then I'm happy. I was only
> concerned about the BC argument "you can't add a proper json-aware cast
> from interval to json because we've already been doing it via a cast to
> text and changing the result would break existing code".
>
> That's all I'm worried about, and if that won't be an issue then I'm
> perfectly happy.
>

The code as written does not use casts except for non-builtin types.

For a builtin type you could change the way the json generators work by
using a function or cast so that the json generators saw what was
already json (which is just passed through). The trouble with this is
that if you're handling some large record and you want to do your own
conversion to json on one field it gets very messy. That's why I added
provision for honoring casts for non-builtin types - the most obvious
candidate for which is hstore, but I can imagine other types wanting to
be able to provide JSON representations.

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Steve Singer <steve(at)ssinger(dot)info>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json generation enhancements
Date: 2013-02-25 22:37:03
Message-ID: 512BE78F.2000209@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 02/24/2013 01:09 AM, Steve Singer wrote:
> On 13-01-11 11:03 AM, Andrew Dunstan wrote:
>>
>> On 01/11/2013 11:00 AM, Andrew Dunstan wrote:
>>>
>>> I have not had anyone follow up on this, so I have added docs and
>>> will add this to the commitfest.
>>>
>>> Recap:
>>>
>>> This adds the following:
>>>
>>> json_agg(anyrecord) -> json
>>> to_json(any) -> json
>>> hstore_to_json(hstore) -> json (also used as a cast)
>>> hstore_to_json_loose(hstore) -> json
>>>
>>> Also, in json generation, if any non-builtin type has a cast to
>>> json, that function is used instead of the type's output function.
>>>
>>>
>>
>> This time with a patch.
>>
>
> Here is a review of this patch.,
>
> Overview
> ---------------------
> This patch adds a set of functions to convert types to json. Specifically
>
> * An aggregate that take the elements and builds up a json array.
> * Conversions from an hstore to json
>
> For non-builtin types the text conversion is used unless a cast has
> specifically been defined from the type to json.
>
> There was some discussion last year on this patch that raised three
> issues
>
> a) Robert was concerned that if someone added a cast from a non-core
> type like citext to json that it would change the behaviour of this
> function. No one else offered an opinion on this at the time. I don't
> see this as a problem, if I add a cast between citext (or any other
> non-core datatype) to json I would expect it to effect how that
> datatype is generated as a json object in any functions that generate
> json. I don't see why this would be surprising behaviour. If I add
> a cast between my datatype and json to generate a json representation
> that differs from the textout representation then I would expect that
> representation to be used in the json_agg function as well.
>
> b) There was some talk in the json bikeshedding thread that json_agg()
> should be renamed to collect_json() but more people preferred json_agg().
>
> c) Should an aggregate of an empty result produce NULL or an empty
> json element. Most people who commented on the thread seemed to feel
> that NULL is preferred because it is consistent with other aggregates
>
> I feel that the functionality of this patch is fine.
>
> Testing
> -------------------
>
> When I try running the regression tests with this patch I get:
> creating template1 database in
> /usr/local/src/postgresql/src/test/regress/./tmp_check/data/base/1 ...
> FATAL: could not create unique index "pg_proc_oid_index"
> DETAIL: Key (oid)=(3171) is duplicated.
> child process exited with exit code 1
>
> oid 3170, 3171 and 3172 appears to be used by lo_tell64 and lo_lseek64
>
> If I fix those up the regression tests pass.
>
> I tried using the new functions in a few different ways and everything
> worked as expected.
>
> Code Review
> -----------
> in contrib/hstore/hstore_io.c
> + /* not an int - try a double */
> + double doubleres =
> strtod(src->data,&endptr);
> + if (*endptr == '\0')
> + is_number = true;
> + else if (false)
> + {
> + /* shut the compiler
> up about unused variables */
> + longres = (long)
> doubleres;
> + longres = longres / 2;
>
>
> I dislike that we have to do this to avoid compiler warnings. I am
> also worried the some other compiler might decide that the else if
> (false) is worthy of a warning. Does anyone have cleaner way of
> getting rid of the warning we get if we don't store the strtol/strtod
> result?
>
> Should we do something like:
>
> (void) ( strtol(src->data,&endptr,10)+1);
>
> Other than that I don't see any issues.
>
>

Thanks for the review. Revised patch attached with fresh OIDs and
numeric detection code cleaned up along the lines you suggest.

cheers

andrew

Attachment Content-Type Size
json_enhancements_part1-v4.patch text/x-patch 36.4 KB

From: Steve Singer <steve(at)ssinger(dot)info>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: json generation enhancements
Date: 2013-03-01 00:53:48
Message-ID: BLU0-SMTP1974E8ED2BA323C369001EDCFF0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 13-02-25 05:37 PM, Andrew Dunstan wrote:
>
> On 02/24/2013 01:09 AM, Steve Singer wrote:
>> On 13-01-11 11:03 AM, Andrew Dunstan wrote:
>>>
>>> On 01/11/2013 11:00 AM, Andrew Dunstan wrote:
>>>>
>>>> I have not had anyone follow up on this, so I have added docs and
>>>> will add this to the commitfest.
>>>>
>>>> Recap:
>>>>
>>>> This adds the following:
>>>>
>>>> json_agg(anyrecord) -> json
>>>> to_json(any) -> json
>>>> hstore_to_json(hstore) -> json (also used as a cast)
>>>> hstore_to_json_loose(hstore) -> json
>>>>
>>>> Also, in json generation, if any non-builtin type has a cast to
>>>> json, that function is used instead of the type's output function.
>>>>
>>>>
>>>
>>> This time with a patch.
>>>
>>
>> Here is a review of this patch.,
>>
>> Overview
>> ---------------------
>> This patch adds a set of functions to convert types to json.
>> Specifically
>>
>> * An aggregate that take the elements and builds up a json array.
>> * Conversions from an hstore to json
>>
>> For non-builtin types the text conversion is used unless a cast has
>> specifically been defined from the type to json.
>>
>> There was some discussion last year on this patch that raised three
>> issues
>>
>> a) Robert was concerned that if someone added a cast from a non-core
>> type like citext to json that it would change the behaviour of this
>> function. No one else offered an opinion on this at the time. I
>> don't see this as a problem, if I add a cast between citext (or any
>> other non-core datatype) to json I would expect it to effect how that
>> datatype is generated as a json object in any functions that generate
>> json. I don't see why this would be surprising behaviour. If I add
>> a cast between my datatype and json to generate a json representation
>> that differs from the textout representation then I would expect that
>> representation to be used in the json_agg function as well.
>>
>> b) There was some talk in the json bikeshedding thread that
>> json_agg() should be renamed to collect_json() but more people
>> preferred json_agg().
>>
>> c) Should an aggregate of an empty result produce NULL or an empty
>> json element. Most people who commented on the thread seemed to feel
>> that NULL is preferred because it is consistent with other aggregates
>>
>> I feel that the functionality of this patch is fine.
>>
>> Testing
>> -------------------
>>
>> When I try running the regression tests with this patch I get:
>> creating template1 database in
>> /usr/local/src/postgresql/src/test/regress/./tmp_check/data/base/1
>> ... FATAL: could not create unique index "pg_proc_oid_index"
>> DETAIL: Key (oid)=(3171) is duplicated.
>> child process exited with exit code 1
>>
>> oid 3170, 3171 and 3172 appears to be used by lo_tell64 and lo_lseek64
>>
>> If I fix those up the regression tests pass.
>>
>> I tried using the new functions in a few different ways and
>> everything worked as expected.
>>
>> Code Review
>> -----------
>> in contrib/hstore/hstore_io.c
>> + /* not an int - try a double */
>> + double doubleres =
>> strtod(src->data,&endptr);
>> + if (*endptr == '\0')
>> + is_number = true;
>> + else if (false)
>> + {
>> + /* shut the compiler
>> up about unused variables */
>> + longres = (long)
>> doubleres;
>> + longres = longres / 2;
>>
>>
>> I dislike that we have to do this to avoid compiler warnings. I am
>> also worried the some other compiler might decide that the else if
>> (false) is worthy of a warning. Does anyone have cleaner way of
>> getting rid of the warning we get if we don't store the strtol/strtod
>> result?
>>
>> Should we do something like:
>>
>> (void) ( strtol(src->data,&endptr,10)+1);
>>
>> Other than that I don't see any issues.
>>
>>
>
> Thanks for the review. Revised patch attached with fresh OIDs and
> numeric detection code cleaned up along the lines you suggest.
>

The opr_test unit test still fails.
I think you forgot to include your update to pg_aggregate.h. See the
attached diff.

Other than that it looks fine, Craig is satisfied with the casting
behaviour and no one else has objected to it.

I think your good to commit this

Steve

> cheers
>
> andrew
>
>

Attachment Content-Type Size
pg_aggregate.h.diff text/x-patch 760 bytes