Re: nvarchar notation accepted?

Lists: pgsql-hackers
From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: nvarchar notation accepted?
Date: 2010-05-13 22:31:55
Message-ID: AANLkTilm8PWD3RAiRiAn548h1Zv7Lb-HcB7n-Sbw0eJS@mail.gmail.com

Hi,

I'm migrating an MS SQL Server database to Postgres and was trying some
queries from the application to check that everything works right...
While looking at those queries I found some that use a notation
for nvarchar (e.g. campo = N'sometext').
I was expecting those to fail, but they actually work. Is that fine? I
know we can use E'' strings, but N'' ones are documented nowhere, so
can I rely on them, or do I have to change those strings?

"""
create table t1_nvarchar(col1 text);
insert into t1_nvarchar values (N'texto');
"""

--
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL


From: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-05-14 03:13:09
Message-ID: 20100514121309.A450.52131E4D@oss.ntt.co.jp


Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:

> I'm migrating an MS SQL Server database to Postgres and was trying some
> queries from the application to check that everything works right...
> While looking at those queries I found some that use a notation
> for nvarchar (e.g. campo = N'sometext').

Do you have documentation for the N'...' literal in SQL Server?
Does it mean a Unicode literal? How does it differ from the U& literal?
http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html

PostgreSQL doesn't have nvarchar types (UTF-16 in MSSQL), and only
has multi-byte characters. So I think you can remove the N and just
use "SET client_encoding = UTF8" in those cases.
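
As a minimal sketch of the two pieces mentioned above, assuming a
PostgreSQL session: the U& sample value is the one used on the linked
lexical-structure page, the column aliases are made up, and the U& form
is only accepted when standard_conforming_strings is on.

```sql
-- Tell the server which encoding the client sends and expects.
SET client_encoding = 'UTF8';

-- U& literals are only accepted when this setting is on.
SET standard_conforming_strings = on;

-- Unicode escape literal: \0061 and \+000061 both denote the letter 'a'.
SELECT U&'d\0061t\+000061' AS unicode_literal;

-- The same kind of string written as a plain literal, with no N prefix.
SELECT 'texto' AS plain_literal;
```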

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-05-14 03:52:18
Message-ID: 13387.1273809138@sss.pgh.pa.us

Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> I'm migrating an MS SQL Server database to Postgres and was trying some
>> queries from the application to check that everything works right...
>> While looking at those queries I found some that use a notation
>> for nvarchar (e.g. campo = N'sometext').

> Do you have documentation for the N'...' literal in SQL Server?
> Does it mean a Unicode literal? How does it differ from the U& literal?
> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html

> PostgreSQL doesn't have nvarchar types (UTF-16 in MSSQL), and only
> has multi-byte characters. So I think you can remove the N and just
> use "SET client_encoding = UTF8" in those cases.

Actually, the lexer translates N'foo' to NCHAR 'foo' and then the
grammar treats that just like CHAR 'foo'. In short, the N doesn't do
anything very useful, and it certainly doesn't have any effect on
encoding behavior. I think this is something Tom Lockhart put in ten or
so years back, and never got as far as making it actually do anything
helpful.
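
A minimal sketch of how this could be observed, assuming a psql session
on a server that provides pg_typeof() (8.4 or later); the literal 'foo'
and the column aliases are illustrative only:

```sql
-- Type of a bare quoted literal before any context resolves it.
SELECT pg_typeof('foo') AS plain_literal;

-- Type of the N-prefixed literal, which the lexer rewrites to NCHAR 'foo'.
SELECT pg_typeof(N'foo') AS n_literal;

-- The explicit spellings the N form is described as equivalent to.
SELECT pg_typeof(NCHAR 'foo') AS nchar_literal,
       pg_typeof(CHAR 'foo')  AS char_literal;
```

If the description above holds, n_literal, nchar_literal and
char_literal should all report the same type.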

regards, tom lane


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-05-14 03:56:51
Message-ID: AANLkTim_PgDNGUnZl37YrMMzm1UflL9rK5vWc78agOqA@mail.gmail.com

On Thu, May 13, 2010 at 10:13 PM, Takahiro Itagaki
<itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
>
> Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>
>> I'm migrating an MS SQL Server database to Postgres and was trying some
>> queries from the application to check that everything works right...
>> While looking at those queries I found some that use a notation
>> for nvarchar (e.g. campo = N'sometext').
>
> Do you have documentation for the N'...' literal in SQL Server?
> Does it mean a Unicode literal? How does it differ from the U& literal?
> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html
>

No, the only thing I found is about NVARCHAR:
http://msdn.microsoft.com/en-us/library/ms186939.aspx but it has no
examples of the N'' notation, although you can find examples of its
use here: http://msdn.microsoft.com/en-us/library/dd776381.aspx#BasicSyntax

> PostgreSQL doesn't have nvarchar types (UTF-16 in MSSQL), and only
> has multi-byte characters. So I think you can remove the N and just
> use "SET client_encoding = UTF8" in those cases.
>

I don't want to remove it! I'm trying to understand whether this is a
bug that will be removed. If not, I can safely tell my client not to
look for those queries, so there is less work to do for the migration.

--
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-05-14 03:58:29
Message-ID: AANLkTikho3oMY2qesFGCnif2S4Cq331Yz1YioBGHXSgC@mail.gmail.com

On Thu, May 13, 2010 at 10:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Actually, the lexer translates N'foo' to NCHAR 'foo' and then the
> grammar treats that just like CHAR 'foo'.  In short, the N doesn't do
> anything very useful, and it certainly doesn't have any effect on
> encoding behavior.  I think this is something Tom Lockhart put in ten or
> so years back, and never got as far as making it actually do anything
> helpful.
>

So the N'' syntax is fine, and I don't need to hunt these literals down as a migration step?

--
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-05-14 04:00:36
Message-ID: 13535.1273809636@sss.pgh.pa.us

Jaime Casanova <jaime(at)2ndquadrant(dot)com> writes:
> On Thu, May 13, 2010 at 10:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Actually, the lexer translates N'foo' to NCHAR 'foo' and then the
>> grammar treats that just like CHAR 'foo'. In short, the N doesn't do
>> anything very useful, and it certainly doesn't have any effect on
>> encoding behavior. I think this is something Tom Lockhart put in ten or
>> so years back, and never got as far as making it actually do anything
>> helpful.

> So the N'' syntax is fine, and I don't need to hunt these literals down as a migration step?

As long as the implied cast to char(n) doesn't cause you problems, it's
fine.
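
One place where that cast could matter is trailing whitespace, since
the character type treats trailing spaces as insignificant in
comparisons. A hedged sketch for checking a few cases before relying on
N'' literals, assuming a PostgreSQL session; the sample strings and
column aliases are made up:

```sql
-- Comparison between two N literals that differ only in trailing spaces.
SELECT N'abc  ' = N'abc' AS n_equal;

-- The same comparison with both values cast to text, where trailing spaces count.
SELECT 'abc  '::text = 'abc'::text AS text_equal;

-- Length as seen through each type.
SELECT length(N'abc  ')      AS n_length,
       length('abc  '::text) AS text_length;
```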

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-05-14 04:16:16
Message-ID: 1273810576.1066.0.camel@vanquo.pezone.net

On Thu, 2010-05-13 at 23:52 -0400, Tom Lane wrote:
> Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> >> I'm migrating an MS SQL Server database to Postgres and was trying some
> >> queries from the application to check that everything works right...
> >> While looking at those queries I found some that use a notation
> >> for nvarchar (e.g. campo = N'sometext').
>
> > Do you have documentation for the N'...' literal in SQL Server?
> > Does it mean a Unicode literal? How does it differ from the U& literal?
> > http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html
>
> > PostgreSQL doesn't have nvarchar types (UTF-16 in MSSQL), and only
> > has multi-byte characters. So I think you can remove the N and just
> > use "SET client_encoding = UTF8" in those cases.
>
> Actually, the lexer translates N'foo' to NCHAR 'foo' and then the
> grammar treats that just like CHAR 'foo'. In short, the N doesn't do
> anything very useful, and it certainly doesn't have any effect on
> encoding behavior. I think this is something Tom Lockhart put in ten or
> so years back, and never got as far as making it actually do anything
> helpful.

This should maybe be changed to just ignore the N and treat N'' like
''.


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-05-14 10:06:54
Message-ID: C4DCBE98-5362-49C1-867B-864C4EB2D330@phlo.org

On May 14, 2010, at 5:56 , Jaime Casanova wrote:
> On Thu, May 13, 2010 at 10:13 PM, Takahiro Itagaki
> <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
>>
>> Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>
>>> I'm migrating an MS SQL Server database to Postgres and was trying some
>>> queries from the application to check that everything works right...
>>> While looking at those queries I found some that use a notation
>>> for nvarchar (e.g. campo = N'sometext').
>>
>> Do you have documentation for the N'...' literal in SQL Server?
>> Does it mean a Unicode literal? How does it differ from the U& literal?
>> http://developer.postgresql.org/pgdocs/postgres/sql-syntax-lexical.html
>>
>
> No, the only thing I found is about NVARCHAR:
> http://msdn.microsoft.com/en-us/library/ms186939.aspx but it has no
> examples of the N'' notation, although you can find examples of its
> use here: http://msdn.microsoft.com/en-us/library/dd776381.aspx#BasicSyntax

Unless you use the N-prefixed versions of CHAR, VARCHAR and string literals, MS SQL Server refuses to process characters other than those in the database's character set. It will replace all those characters with '?'.

Note that this is not an encoding issue - it will even do so with protocol versions (everything >= 7.0 I think) that use UTF16 on-wire, where those characters can be transmitted just fine.
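
A rough T-SQL sketch of the behavior described above, assuming a SQL
Server database whose default collation uses a code page without
Cyrillic characters; the table variable, column names and sample text
are hypothetical:

```sql
-- Table variable with one non-Unicode and one Unicode column.
DECLARE @demo TABLE (plain_col VARCHAR(20), national_col NVARCHAR(20));

-- The first literal has no N prefix, so it goes through the database code page.
INSERT INTO @demo VALUES ('текст', N'текст');

-- Compare what was actually stored in each column.
SELECT plain_col, national_col FROM @demo;
```

If the description holds, plain_col should come back as question marks
while national_col keeps the original text.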

best regards,
Florian Pflug


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-06-07 02:13:02
Message-ID: AANLkTilaCHIQbz3fcDN-51uqK81d1NjkDmPH7re6Qiz3@mail.gmail.com

On Thu, May 13, 2010 at 11:00 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jaime Casanova <jaime(at)2ndquadrant(dot)com> writes:
>> On Thu, May 13, 2010 at 10:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Actually, the lexer translates N'foo' to NCHAR 'foo' and then the
>>> grammar treats that just like CHAR 'foo'.  In short, the N doesn't do
>>> anything very useful, and it certainly doesn't have any effect on
>>> encoding behavior.  I think this is something Tom Lockhart put in ten or
>>> so years back, and never got as far as making it actually do anything
>>> helpful.
>
>> So the N'' syntax is fine, and I don't need to hunt these literals down as a migration step?
>
> As long as the implied cast to char(n) doesn't cause you problems, it's
> fine.
>

Is this something we want to document? Maybe something like:
"""
For historical reasons N'' syntax is also accepted as a string literal.
"""

Or we could even mention that it is useful for SQL Server migrations?

--
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-06-07 07:23:58
Message-ID: 1275895438.1849.1.camel@fsopti579.F-Secure.com

On Sun, 2010-06-06 at 21:13 -0500, Jaime Casanova wrote:
> On Thu, May 13, 2010 at 11:00 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Jaime Casanova <jaime(at)2ndquadrant(dot)com> writes:
> >> On Thu, May 13, 2010 at 10:52 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> Actually, the lexer translates N'foo' to NCHAR 'foo' and then the
> >>> grammar treats that just like CHAR 'foo'. In short, the N doesn't do
> >>> anything very useful, and it certainly doesn't have any effect on
> >>> encoding behavior. I think this is something Tom Lockhart put in ten or
> >>> so years back, and never got as far as making it actually do anything
> >>> helpful.
> >
> >> So the N'' syntax is fine, and I don't need to hunt these literals down as a migration step?
> >
> > As long as the implied cast to char(n) doesn't cause you problems, it's
> > fine.
> >
>
> Is this something we want to document? Maybe something like:
> """
> For historical reasons N'' syntax is also accepted as a string literal.
> """
>
> Or we could even mention that it is useful for SQL Server migrations?

I don't think it's a historical reason, at least not unless all reasons
are to some degree historical. The N'' syntax is in the SQL standard,
and so if our implementation matches that, it should be documented as a
supported feature, and if it doesn't match it, we should fix it, and
perhaps leave it undocumented until we have figured out what we want it
to do. (I have not done that analysis.)


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-06-07 17:56:26
Message-ID: AANLkTimt03kAizi2lfMcWhG-2QWDkiknalRfT679G5MM@mail.gmail.com

On Mon, Jun 7, 2010 at 2:23 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
> The N'' syntax is in the SQL standard,
>

I didn't know that. Do you know which paragraph it is in? I can't find it.

--
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-06-07 18:18:27
Message-ID: 1275934707.11078.0.camel@vanquo.pezone.net

On Mon, 2010-06-07 at 12:56 -0500, Jaime Casanova wrote:
> On Mon, Jun 7, 2010 at 2:23 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> >
> > The N'' syntax is in the SQL standard,
> >
>
> I didn't know that. Do you know which paragraph it is in? I can't find it.

Look for <national character string literal>.


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nvarchar notation accepted?
Date: 2010-07-03 14:38:37
Message-ID: 201007031438.o63EcbN20946@momjian.us

Peter Eisentraut wrote:
> On Mon, 2010-06-07 at 12:56 -0500, Jaime Casanova wrote:
> > On Mon, Jun 7, 2010 at 2:23 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> > >
> > > The N'' syntax is in the SQL standard,
> > >
> >
> > I didn't know that. Do you know which paragraph it is in? I can't find it.
>
> Look for <national character string literal>.

I have moved this from the open items list to the main TODO list.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +