Re: Duplicate JSON Object Keys

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Duplicate JSON Object Keys
Date: 2013-03-13 16:51:42
Message-ID: 5140AE9E.8010406@archidevsys.co.nz
Lists: pgsql-hackers

On 14/03/13 02:02, Andrew Dunstan wrote:
>
> On 03/13/2013 08:17 AM, Robert Haas wrote:
>> On Fri, Mar 8, 2013 at 4:42 PM, Andrew Dunstan <andrew(at)dunslane(dot)net>
>> wrote:
>>>> So my order of preference for the options would be:
>>>>
>>>> 1. Have the JSON type collapse objects so the last instance of a
>>>> key wins
>>>> and is actually stored
>>>>
>>>> 2. Throw an error when a JSON type has duplicate keys
>>>>
>>>> 3. Have the accessors find the last instance of a key and return that
>>>> value
>>>>
>>>> 4. Let things remain as they are now
>>>>
>>>> On second thought, I don't like 4 at all. It means that the JSON type
>>>> thinks a value is valid while the accessor does not. They contradict
>>>> one another.
>>> You can forget 1. We are not going to have the parser collapse
>>> anything.
>>> Either the JSON it gets is valid or it's not. But the parser isn't
>>> going to
>>> try to MAKE it valid.
>> Why not? Because it's the wrong thing to do, or because it would be
>> slower?
>>
>> What I think is tricky here is that there's more than one way to
>> conceptualize what the JSON data type really is. Is it a key-value
>> store of sorts, or just a way to store text values that meet certain
>> minimalist syntactic criteria? I had imagined it as the latter, in
>> which case normalization isn't sensible. But if you think of it the
>> first way, then normalization is not only sensible, but almost
>> obligatory. For example, we don't feel bad about this:
>>
>> rhaas=# select '1e1'::numeric;
>> numeric
>> ---------
>> 10
>> (1 row)
>>
>> I think Andrew and I had envisioned this as basically a text data type
>> that enforces some syntax checking on its input, hence the current
>> design. But I'm not sure that's the ONLY sensible design.
>>
>
>
> I think we've moved on from this point, because a) other
> implementations allow duplicate keys, b) it's trivially easy to make
> Postgres generate such json, and c) there is some dispute about
> exactly what the spec mandates.
>
> I'll be posting a revised patch shortly that doesn't error out but
> simply uses the value for the lexically later key.
>
> cheers
>
> andrew
>
How about adding a new function with '_strict' appended to the existing
name, taking an extra parameter 'coalesce' (or other names, if
considered more appropriate)?

That way the slower, more stringent functionality can be added where
required, and the existing function need not be changed.

If coalesce = true,
then: the last duplicate is used
else: an error is raised when the new key is a duplicate.
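
For illustration, a minimal sketch of what that might look like (the
name json_strict and its signature are purely hypothetical, just to
show the intended behaviour; the second argument is the proposed
'coalesce' flag):

gavin=# select json_strict('{"a": 1, "a": 2}', true);
 json_strict
-------------
 {"a": 2}
(1 row)

gavin=# select json_strict('{"a": 1, "a": 2}', false);
ERROR:  duplicate object key "a"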

Cheers,
Gavin
