Proposal: json_populate_record and nested json objects

Lists: pgsql-hackers
From: chris travers <chris(at)2ndquadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposal: json_populate_record and nested json objects
Date: 2013-09-15 02:27:35
Message-ID: 962759450.189991.1379212055460.open-xchange@email.1and1.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all;

Currently json_populate_record and json_populate_recordset cannot work with
nested json objects. This creates two fundamental problems when trying to use
JSON as an interface format.

The first problem is you can't easily embed a json data type in an json object
and have it populate a record. This means that storing extended attributes in
the database is somewhat problematic if you accept the whole row in as a json
object.

The second problem is that nested data structures and json don't go together
well. You can't have a composite type which has as an attribute an array of
another composite type and populate this from a json object. This makes json
largely an alternative to hstore for interfaces in its current shape.

I would propose handling the json_populate_record and friends as such:

1. Don't throw errors initially as a pre-check if the json object is nested.
2. If one comes to a nested fragment, check the attribute type it is going into
first.
2.1 If it is a json type, put the nested fragment there.
2.2 If it is a composite type (i.e. anything in pg_class), push it through
another json_populate_record run
2.3 If it is neither, then see if a json::[type] cast exists, if so call it.
2.4 Otherwise raise an exception

I have a few questions before I go on to look at creating a patch.

1. Are there any problems anyone spots with this approach?

2. Is anyone working on something like this?

3. Would it be preferable to build something like this first as an extension
(perhaps with different function names) or first as a patch?

Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: chris travers <chris(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: json_populate_record and nested json objects
Date: 2013-09-15 17:42:04
Message-ID: 5235F16C.60506@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 09/14/2013 10:27 PM, chris travers wrote:
> Hi all;
> Currently json_populate_record and json_populate_recordset cannot work
> with nested json objects. This creates two fundamental problems when
> trying to use JSON as an interface format.
> The first problem is you can't easily embed a json data type in an
> json object and have it populate a record. This means that storing
> extended attributes in the database is somewhat problematic if you
> accept the whole row in as a json object.
> The second problem is that nested data structures and json don't go
> together well. You can't have a composite type which has as an
> attribute an array of another composite type and populate this from a
> json object. This makes json largely an alternative to hstore for
> interfaces in its current shape.
> I would propose handling the json_populate_record and friends as such:
> 1. Don't throw errors initially as a pre-check if the json object is
> nested.
> 2. If one comes to a nested fragment, check the attribute type it is
> going into first.
> 2.1 If it is a json type, put the nested fragment there.
> 2.2 If it is a composite type (i.e. anything in pg_class), push it
> through another json_populate_record run
> 2.3 If it is neither, then see if a json::[type] cast exists, if
> so call it.
> 2.4 Otherwise raise an exception
> I have a few questions before I go on to look at creating a patch.
> 1. Are there any problems anyone spots with this approach?
> 2. Is anyone working on something like this?
> 3. Would it be preferable to build something like this first as an
> extension (perhaps with different function names) or first as a patch?
>

Well, you could fairly easily build it as an extension as a POC. The
main point of the API this is built on was to allow for extensions.

The logic changes might be a bit tricky. I'll be interested to see what
you come up with.

If we're going to do this we should make these handle arrays as well as
objects.

cheers

andrew


From: Chris Travers <chris(at)2ndquadrant(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: json_populate_record and nested json objects
Date: 2013-09-16 00:13:49
Message-ID: 1720654706.211305.1379290429511.open-xchange@email.1and1.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 15 September 2013 at 18:42 Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
>
>
> On 09/14/2013 10:27 PM, chris travers wrote:
>
> Well, you could fairly easily build it as an extension as a POC. The
> main point of the API this is built on was to allow for extensions.
>
> The logic changes might be a bit tricky. I'll be interested to see what
> you come up with.
>
> If we're going to do this we should make these handle arrays as well as
> objects.

Yes, arrays are necessary for my use case. I probably should have been explicit
that I needed that too.

Best Wishes,
Chris Travers
>
> cheers
>
> andrew
>
>
Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: chris travers <chris(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: json_populate_record and nested json objects
Date: 2013-09-16 13:43:19
Message-ID: CAHyXU0ymNo0TVTE2QZ3MaD70CLjUeuJeaBYjLuvU=Yc9o5Sp3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Sep 14, 2013 at 9:27 PM, chris travers <chris(at)2ndquadrant(dot)com> wrote:
> Hi all;
>
> Currently json_populate_record and json_populate_recordset cannot work with
> nested json objects. This creates two fundamental problems when trying to
> use JSON as an interface format.
>
> The first problem is you can't easily embed a json data type in an json
> object and have it populate a record. This means that storing extended
> attributes in the database is somewhat problematic if you accept the whole
> row in as a json object.
>
> The second problem is that nested data structures and json don't go together
> well. You can't have a composite type which has as an attribute an array
> of another composite type and populate this from a json object. This makes
> json largely an alternative to hstore for interfaces in its current shape.
>
> I would propose handling the json_populate_record and friends as such:
>
> 1. Don't throw errors initially as a pre-check if the json object is nested.
> 2. If one comes to a nested fragment, check the attribute type it is going
> into first.
> 2.1 If it is a json type, put the nested fragment there.
> 2.2 If it is a composite type (i.e. anything in pg_class), push it
> through another json_populate_record run
> 2.3 If it is neither, then see if a json::[type] cast exists, if so call
> it.
> 2.4 Otherwise raise an exception
>
> I have a few questions before I go on to look at creating a patch.
>
> 1. Are there any problems anyone spots with this approach?
>
> 2. Is anyone working on something like this?
>
> 3. Would it be preferable to build something like this first as an
> extension (perhaps with different function names) or first as a patch?

Huge +1 on on this. Couple random thoughts:

*) Hard to see how you would structure this as an extension as you're
adjusting the behaviors of existing functions, unless you wanted to
introduce new function names for testing purposes?

*) Would like to at least consider being able to use casting syntax as
a replacement for "populate_record" and (the missing) "populate_array"
for most usages.

merlin


From: Chris Travers <chris(at)2ndquadrant(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: json_populate_record and nested json objects
Date: 2013-09-16 13:57:19
Message-ID: 1361412433.252897.1379339839458.open-xchange@email.1and1.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 16 September 2013 at 14:43 Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

>
> Huge +1 on on this. Couple random thoughts:
>
> *) Hard to see how you would structure this as an extension as you're
> adjusting the behaviors of existing functions, unless you wanted to
> introduce new function names for testing purposes?

Yeah, and reading the source, it looks like some parts of the JSON parsing code
will have to be rewritten because the nested object errors are thrown quite
deeply in the parsing stage. It looks to me as if this will require some
significant copying as a POC into a new file with different publicly exposed
function names.
>
> *) Would like to at least consider being able to use casting syntax as
> a replacement for "populate_record" and (the missing) "populate_array"
> for most usages.

Yes. I am trying to figure out how best to do this at present. Initially I
think I would be happy to settle for casts wrapping functions which themselves
just wrap the call to populate_record.

What I will probably do for my POC is expose the following methods:

1. json_populate_type()
2. json_populate_array()

Then we can talk about whether to merge the changes into the core,

This will be an interesting way to get into PostgreSQL hacking.

Best Wishes,
Chris Travers

>
> merlin
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Chris Travers <chris(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: json_populate_record and nested json objects
Date: 2013-09-16 14:18:04
Message-ID: CAHyXU0zE5YMqnYzScKvJEQewD2E_hu9V8ZirO87wwBqqu22yag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Sep 16, 2013 at 8:57 AM, Chris Travers <chris(at)2ndquadrant(dot)com> wrote:
>> On 16 September 2013 at 14:43 Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>
>>
>> Huge +1 on on this. Couple random thoughts:
>>
>> *) Hard to see how you would structure this as an extension as you're
>> adjusting the behaviors of existing functions, unless you wanted to
>> introduce new function names for testing purposes?
>
> Yeah, and reading the source, it looks like some parts of the JSON parsing
> code will have to be rewritten because the nested object errors are thrown
> quite deeply in the parsing stage. It looks to me as if this will require
> some significant copying as a POC into a new file with different publicly
> exposed function names.

ISTM that's not worth it then.

>> *) Would like to at least consider being able to use casting syntax as
>> a replacement for "populate_record" and (the missing) "populate_array"
>> for most usages.
>
> Yes. I am trying to figure out how best to do this at present. Initially I
> think I would be happy to settle for casts wrapping functions which
> themselves just wrap the call to populate_record.

right.

> What I will probably do for my POC is expose the following methods:
>
> 1. json_populate_type()

hm if you're going to name it that way, prefer json_populate_value().
(or maybe _scalar() or _datum()). but we have to_json(), so how about
from_json()?

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Chris Travers <chris(at)2ndquadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: json_populate_record and nested json objects
Date: 2013-09-16 21:03:00
Message-ID: 52377204.5050903@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 09/16/2013 09:57 AM, Chris Travers wrote:
>
> > On 16 September 2013 at 14:43 Merlin Moncure <mmoncure(at)gmail(dot)com>
> wrote:
>
> >
> > Huge +1 on on this. Couple random thoughts:
> >
> > *) Hard to see how you would structure this as an extension as you're
> > adjusting the behaviors of existing functions, unless you wanted to
> > introduce new function names for testing purposes?
> Yeah, and reading the source, it looks like some parts of the JSON
> parsing code will have to be rewritten because the nested object
> errors are thrown quite deeply in the parsing stage. It looks to me
> as if this will require some significant copying as a POC into a new
> file with different publicly exposed function names.

I don't believe any of the parsing code should require changing at all.
The event handlers that the parser calls would need to be changed. If
you're at PostgresOpen you should be attending my talk which includes an
example of how to use this API.

cheers

andrew