Re: JSON and unicode surrogate pairs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: JSON and unicode surrogate pairs
Date: 2013-06-10 15:43:28
Message-ID: 23366.1370879008@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Or we could abandon the conversion altogether, but that doesn't seem
> very friendly either. I suspect the biggest case for people to use these
> sequences is where the database is UTF8 but the client encoding is not.

Well, if that's actually the biggest use-case, then maybe we should just
say we're *not* in the business of converting those escapes. That would
make things nice and consistent regardless of the DB encoding, and it
would avoid the problem of being able to input a value and then not
being able to output it again.

It's legal, is it not, to just write the equivalent Unicode character in
the JSON string and not use the escapes? If so I would think that that
would be the most common usage. If someone's writing an escape, they
probably had a reason for doing it that way, and might not appreciate
our overriding their decision.
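
To illustrate the point about the two spellings, here is a minimal sketch
using Python's standard json module. It is not taken from the PostgreSQL
sources; the variable names and the choice of U+1F600 as the sample
character are assumptions made for the example. Both spellings decode to
the same string, so a parser that converts escapes can no longer tell
which one the author wrote.

```python
import json

# The same character (U+1F600) spelled two ways in JSON text.
escaped = r'"\ud83d\ude00"'   # surrogate-pair escape sequence
literal = '"\U0001F600"'      # the character written directly

decoded_escaped = json.loads(escaped)
decoded_literal = json.loads(literal)

# Once decoded, the two inputs are indistinguishable.
same_value = decoded_escaped == decoded_literal

# On output the serializer picks one spelling for both inputs:
# json.dumps escapes non-ASCII by default, ensure_ascii=False does not.
out_escaped = json.dumps(decoded_literal)
out_literal = json.dumps(decoded_escaped, ensure_ascii=False)
```

Storing the text as written, without converting the escapes, avoids
making that choice on the author's behalf.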

regards, tom lane
