Re: [GENERAL] Re: [GENERAL] Different encoding for string values and identifier strings? Or (select 'tést' as tést) returns different values for string and identifier...

From: "Francisco Figueiredo Jr(dot)" <francisco(at)npgsql(dot)org>
To: Andreas Kretschmer <akretschmer(at)spamfence(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Re: [GENERAL] Different encoding for string values and identifier strings? Or (select 'tést' as tést) returns different values for string and identifier...
Date: 2011-03-17 22:26:38
Message-ID: AANLkTin6OCp3v5xjapioPUqsy+b0bzzb+0hkp=bdU4gS@mail.gmail.com
Lists: pgsql-general

Any ideas?

Is it possible that PostgreSQL uses a different encoding for
identifiers when they aren't wrapped in double quotes?

On Tue, Mar 15, 2011 at 23:37, Francisco Figueiredo Jr.
<francisco(at)npgsql(dot)org> wrote:
> Now, I'm using my dev machine.
>
> With the tests I'm doing, I can see the following:
>
> If I use:
>
> select 'seléct' as "seléct";
>
> the column name is returned correctly, as expected.
>
> If I do:
>
> select 'seléct' as seléct;
>
>
> this is the sequence of bytes I receive from PostgreSQL for the column name:
>
> byte1 - 115 UTF-8 for s
> byte2 - 101 UTF-8 for e
> byte3 - 108 UTF-8 for l
> byte4 - 227
> byte5 - 169
> byte6 - 99 UTF-8 for c
> byte7 - 116 UTF-8 for t
>
>
> The problem lies in byte4.
> According to [1], the first byte of a sequence defines how many bytes
> compose the UTF-8 char. The byte 227 has the binary value 1110 0011,
> so the UTF-8 decoder expects a sequence of 3 bytes when there are
> actually only 2! :( This seems to be the root of the problem for me.
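
The lead-byte rule described above can be checked with a minimal sketch. The helper name utf8_sequence_length is hypothetical (it is not part of PostgreSQL or Npgsql); it only applies the standard UTF-8 bit patterns to the byte values quoted in this message.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return how many bytes a UTF-8 sequence should have, judged by its lead byte.

    Hypothetical helper for illustration only.
    """
    if (lead_byte & 0b1000_0000) == 0b0000_0000:
        return 1  # 0xxxxxxx: single-byte (ASCII) character
    if (lead_byte & 0b1110_0000) == 0b1100_0000:
        return 2  # 110xxxxx: lead byte of a 2-byte sequence
    if (lead_byte & 0b1111_0000) == 0b1110_0000:
        return 3  # 1110xxxx: lead byte of a 3-byte sequence
    if (lead_byte & 0b1111_1000) == 0b1111_0000:
        return 4  # 11110xxx: lead byte of a 4-byte sequence
    raise ValueError(f"{lead_byte} is not a valid UTF-8 lead byte")


# Byte values taken from the message: 227 is the byte4 received for the
# identifier, 195 the byte4 received for the string value, 115 is "s".
for value in (227, 195, 115):
    print(value, format(value, "08b"), utf8_sequence_length(value))
```

With these inputs, 227 is classified as the start of a 3-byte sequence and 195 as the start of a 2-byte sequence, which matches the reasoning above.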
>
>
> For the selected value, the correct bytes are returned:
>
> byte1 - 115 UTF-8 for s
> byte2 - 101 UTF-8 for e
> byte3 - 108 UTF-8 for l
> byte4 - 195
> byte5 - 169
> byte6 - 99 UTF-8 for c
> byte7 - 116 UTF-8 for t
>
>
> Here 195 is 1100 0011, which announces a two-byte sequence, and the
> decoder can decode 195 169 to U+00E9, which is the char "é".
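
Both byte sequences can be fed to a UTF-8 decoder to reproduce the difference. This is a sketch using only the byte values listed in this message; try_decode is a hypothetical helper. The last lines record an observation, not a diagnosis confirmed anywhere in this thread: the two lead bytes differ only in bit 0x20, which is what byte-wise lowercasing under a Latin-1 interpretation would produce.

```python
# Bytes received for the unquoted identifier and for the string value.
identifier_bytes = bytes([115, 101, 108, 227, 169, 99, 116])
value_bytes = bytes([115, 101, 108, 195, 169, 99, 116])


def try_decode(raw: bytes) -> str:
    """Decode as UTF-8, or describe where decoding fails (hypothetical helper)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"invalid UTF-8 at byte offset {exc.start}: {exc.reason}"


print(try_decode(value_bytes))       # the string value decodes cleanly
print(try_decode(identifier_bytes))  # the identifier does not

# Observation only (an assumption, not a confirmed cause): 195 and 227
# differ by exactly 0x20, and in Latin-1 byte 195 is an uppercase letter
# whose lowercase form is byte 227.
print(hex(195 ^ 227))
print(bytes([195]).decode("latin-1").lower().encode("latin-1")[0])
```

If that observation is relevant, it would fit the fact that only unquoted identifiers are affected, since those are the ones PostgreSQL folds to lower case.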
>
> Could this be related to my machine? I'm using Mac OS X 10.6.6,
> and I compiled PostgreSQL 9.0.1 from source.
>
> Thanks in advance.
>
>
>
>
> [1] - http://en.wikipedia.org/wiki/UTF-8
>
>
>
>
> On Tue, Mar 15, 2011 at 15:52, Francisco Figueiredo Jr.
> <francisco(at)npgsql(dot)org> wrote:
>> Hmmmmmmmm,
>>
>> What would change the encoding of the identifiers?
>>
>> Because on my dev machine, which unfortunately isn't with me right now,
>> I can't get the identifier returned correctly :(
>>
>> I remember that it returns:
>>
>>  test=*# select 'tést' as tést;
>>   tst
>>  ------
>>   tést
>>
>> Is there any config I can change at runtime in order to have it
>> returned correctly?
>>
>> Thanks in advance.
>>
>>
>> On Tue, Mar 15, 2011 at 15:45, Andreas Kretschmer
>> <akretschmer(at)spamfence(dot)net> wrote:
>>> Francisco Figueiredo Jr. <francisco(at)npgsql(dot)org> wrote:
>>>
>>>>
>>>> What happens if you remove the double quotes in the column name identifier?
>>>
>>> the same:
>>>
>>> test=*# select 'tést' as tést;
>>>  tést
>>> ------
>>>  tést
>>> (1 Zeile)
>>>
>>>
>>>
>>> Andreas
>>> --
>>> Really, I'm not out to destroy Microsoft. That will just be a completely
>>> unintentional side effect.                              (Linus Torvalds)
>>> "If I was god, I would recompile penguin with --enable-fly."   (unknown)
>>> Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>>
>>
>> --
>> Regards,
>>
>> Francisco Figueiredo Jr.
>> Npgsql Lead Developer
>> http://www.npgsql.org
>> http://fxjr.blogspot.com
>> http://twitter.com/franciscojunior
>>
>
>
>
> --
> Regards,
>
> Francisco Figueiredo Jr.
> Npgsql Lead Developer
> http://www.npgsql.org
> http://fxjr.blogspot.com
> http://twitter.com/franciscojunior
>

--
Regards,

Francisco Figueiredo Jr.
Npgsql Lead Developer
http://www.npgsql.org
http://fxjr.blogspot.com
http://twitter.com/franciscojunior
