
Re: [rfc] unicode escapes for extended strings

From:	Sam Mason <sam(at)samason(dot)me(dot)uk>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [rfc] unicode escapes for extended strings
Date:	2009-04-16 18:43:09
Message-ID:	20090416184309.GQ12225@frubble.xen.chris-lamb.co.uk
Lists:	pgsql-hackers

On Thu, Apr 16, 2009 at 08:48:58PM +0300, Marko Kreen wrote:
> Seems I'm bad at communicating in English,

I hope you're not saying this because of my misunderstandings!

> so here is C variant of
> my proposal to bring \u escaping into extended strings. Reasons:
>
> - More people are familiar with \u escaping, as it's standard
> in Java/C#/Python, and probably other languages.
> - U& strings will not work when standard_conforming_strings is off.
>
> Syntax:
>
> \uXXXX - 16-bit value
> \UXXXXXXXX - 32-bit value
>
> Additionally, both \u and \U can be used to specify UTF-16 surrogate
> pairs to encode characters with value > 0xFFFF. This is exactly the
> behaviour of Java/C#/Python (except that Java does not have \U).

Are you sure that this handling of surrogates is correct? The best
answer I've managed to find on the Unicode consortium's site is:

http://unicode.org/faq/utf_bom.html#utf16-7

It says:

  They are invalid in interchange, but may be freely used internal to an
  implementation.

I think this means they consider the handling of them you noted above,
in other languages, to be an error.
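
As a rough illustration of the distinction (a sketch only, not part of the
proposal; the function names combine_surrogates and valid_in_interchange are
hypothetical), the following Python shows how a high/low surrogate pair maps
to a single code point above 0xFFFF, and how a lone surrogate is rejected
when it has to be encoded for interchange as UTF-8:

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into one code point (hypothetical helper)."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError(f"not a high surrogate: {high:#06x}")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError(f"not a low surrogate: {low:#06x}")
    # Each surrogate carries 10 bits of the value above 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def valid_in_interchange(code_point: int) -> bool:
    """Return True if the code point can be encoded as UTF-8 (hypothetical helper)."""
    try:
        # Python's strict UTF-8 encoder refuses code points in 0xD800..0xDFFF.
        chr(code_point).encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


# \uD83D\uDE00 written as a pair denotes the single code point U+1F600.
combined = combine_surrogates(0xD83D, 0xDE00)
pair_ok = valid_in_interchange(combined)
lone_ok = valid_in_interchange(0xD83D)
```

Under this reading, accepting the pair and storing the combined code point is
harmless, while storing either half on its own is what the FAQ calls invalid
in interchange.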

--
Sam http://samason.me.uk/

In response to

[rfc] unicode escapes for extended strings at 2009-04-16 17:48:58 from Marko Kreen

Responses

Re: [rfc] unicode escapes for extended strings at 2009-04-16 19:04:37 from Andrew Dunstan
Re: [rfc] unicode escapes for extended strings at 2009-04-16 19:32:16 from Marko Kreen
