Re: Chinese database name in URL, can I ?

Lists: pgsql-jdbc
From: LiuYan 刘研 <lovetide(at)21cn(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Chinese database name in URL, can I ?
Date: 2007-10-23 15:16:38
Message-ID: 13366356.post@talk.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc


I'm interesting in PostgreSQL these days, I created a database whose name
contains Chinese characters, when I try to connect this database, I got an
error: org.postgresql.util.PSQLException: FATAL: database "??pg???" does not
exist

I'm using PostgreSQL 8.2.5 on Windows XP SP2 Simplified Chinese system, and
using postgresql-8.2-506.jdbc4.jar as the jdbc driver.

Here is my test Java program:
--------------------------------------------------------------------------------
<%(at)page pageEncoding="gbk" contentType="text/html" session="false"
import="java.sql.*,java.util.*"%>
<h1>Chinese DatabaseName Test | 中文数据库名测试</h1>
<%
String jdbcDriverName = "org.postgresql.Driver";
String
jdbcURL="jdbc:postgresql://localhost/中文pg数据库?user=root&password=123456";
String jdbcUserName = "root";
String jdbcPassword = "123456";

Connection con = null;
Statement stmt = null;
ResultSet rs = null;
try {
Class.forName (jdbcDriverName);
con = DriverManager.getConnection (jdbcURL, jdbcUserName, jdbcPassword);
stmt = con.createStatement ();
rs = stmt.executeQuery("select version()");
ResultSetMetaData rsmd = rs.getMetaData ();

while (rs.next())
{
out.println (rs.getString(1));
}
rs.close();
stmt.close();
con.close ();
}catch (Exception e) {
out.println (e);
}
%>
--------------------------------------------------------------------------------

and the database environment:
--------------------------------------------------------------------------------
C:\>psql -U root postgres
Welcome to psql 8.2.5, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

postgres=# \l
数据库列表
名字 | 所有者 | 编码
--------------+--------+-----------
cntestdb | root | EUC_CN
postgres | root | SQL_ASCII
template0 | root | SQL_ASCII
template1 | root | SQL_ASCII
中文pg_utf8 | root | UTF8
中文pg数据库 | root | EUC_CN
(6 rows)

postgres=# \c 中文pg数据库;
You are now connected to database "中文pg数据库".
中文pg数据库=# \dt;
关系列表
模式 | 名字 | 类型 | 所有者
--------+--------------+------+--------
public | serverregion | 表 | root
public | servers | 表 | root
public | 中文表 | 表 | root
(3 rows)

中文pg数据库=# select * from 中文表;
字段1 | 字段2
-------+------------
1 | hello 你好
2 | pg数据库
3 | 用jdbc访问
4 | 中文数据库
5 | 可以吗?
(5 rows)

中文pg数据库=#
--------------------------------------------------------------------------------

I've also tried add '&charSet=GBK' in the jdbcURL, and got the same result.

I know that MySQL can support Chinese database name in jdbc URL, so how can
I use Chinese database name in jdbc URL to connect PostgreSQL?

Thanks!

--
View this message in context: http://www.nabble.com/Chinese-database-name-in-URL%2C-can-I---tf4678108.html#a13366356
Sent from the PostgreSQL - jdbc mailing list archive at Nabble.com.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: LiuYan 刘研 <lovetide(at)21cn(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Chinese database name in URL, can I ?
Date: 2007-10-23 15:59:49
Message-ID: 19162.1193155189@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

=?UTF-8?Q?LiuYan_=E5=88=98=E7=A0=94?= <lovetide(at)21cn(dot)com> writes:
> I've also tried add '&charSet=GBK' in the jdbcURL, and got the same result.

At a guess, the name of the database will have to be in UTF8. I doubt
that JDBC will think it should do any character set conversion on it.

regards, tom lane


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: LiuYan 刘研 <lovetide(at)21cn(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Chinese database name in URL, can I ?
Date: 2007-10-23 23:15:17
Message-ID: 471E8085.8030308@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Tom Lane wrote:
> =?UTF-8?Q?LiuYan_=E5=88=98=E7=A0=94?= <lovetide(at)21cn(dot)com> writes:
>> I've also tried add '&charSet=GBK' in the jdbcURL, and got the same result.
>
> At a guess, the name of the database will have to be in UTF8. I doubt
> that JDBC will think it should do any character set conversion on it.

IIRC the last time I raised the issue of the encoding used for things
like database names & usernames in the startup packet the answer boiled
down to "don't use anything but 7-bit ASCII". So the driver sends those
strings as 7-bit ASCII (i.e. String.getBytes("US-ASCII")). If you've got
a database name or username that can't be represented using only 7-bit
ASCII, you're out of luck.

Has something changed here so that non-7-bit data in the startup packet
will work?

-O


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: LiuYan 刘研 <lovetide(at)21cn(dot)com>, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Chinese database name in URL, can I ?
Date: 2007-10-23 23:33:33
Message-ID: 5901.1193182413@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Oliver Jowett <oliver(at)opencloud(dot)com> writes:
> Tom Lane wrote:
>> At a guess, the name of the database will have to be in UTF8. I doubt
>> that JDBC will think it should do any character set conversion on it.

> Has something changed here so that non-7-bit data in the startup packet
> will work?

The server just thinks those are 8-bit strings. I guess my
recommendation is that you should go ahead and send UTF8, and if the
user created the non-ASCII name that way, it'll work. (In practice I
think that means he'd have to issue the CREATE DATABASE or CREATE USER
command while logged into a database that uses UTF8 encoding.)

This whole business of per-database encodings is a mess, of course.

regards, tom lane


From: LiuYan 刘研 <lovetide(at)21cn(dot)com>
To: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Chinese database name in URL, can I ?
Date: 2007-10-24 14:03:20
Message-ID: 13384678.post@talk.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc


Thanks for your reply Oliver & Tom,

You are right. I downloaded the source code of postgresql-jdbc-8.2-506, and
found that org.postgresql.core.v3.ConnectionFactoryImpl.java using the hard
coded "US-ASCII" to transform the encoding of jdbc parameter names and
values. So, a GBK encoded Chinese character(2 bytes) will be transformed to
a hex value 0x3F, which is a '?' character.

I tried to add a URL parameter "jdbcURLEncoding" in
org.postgresql.core.v3.ConnectionFactoryImpl.java to resolve this problem,
and it seems works ok for me now(Chinese database name/username /password in
the jdbc URL) for this testing purpose. But there are too many "US-ASCII"s
in the source codes (from database name/username/password to MD5Digest,
UnixCrypt, Encoding etc...), so there must be some other things to be
concerned which I don't know where/how to do it.

The attachments are the two file which I modified:
org.postgresql.core.v3.ConnectionFactoryImpl.java
org.postgresql.util.MD5Digest.java

The following is the new test code:
-----------------------------------------
import java.sql.*;

public class PgTest
{
public static void main (String[] args) throws Exception
{
// 中文pg数据库: \u4e2d\u6587pg\u6570\u636e\u5e93
// chineseusername中文用户名: chineseusername\u4e2d\u6587\u7528\u6237\u540d
// chinesepassword中文密码: chinesepassword\u4e2d\u6587\u5bc6\u7801

String jdbcDriverName = "org.postgresql.Driver";
String jdbcURL
="jdbc:postgresql://192.168.1.6/\u4e2d\u6587pg\u6570\u636e\u5e93?user=chineseusername\u4e2d\u6587\u7528\u6237\u540d&password=chinesepassword\u4e2d\u6587\u5bc6\u780d&jdbcURLEncoding=GBK";

Connection con = null;
Statement stmt = null;
ResultSet rs = null;
try {
Class.forName (jdbcDriverName);
con = DriverManager.getConnection (jdbcURL);
stmt = con.createStatement ();
rs = stmt.executeQuery("select version()");
ResultSetMetaData rsmd = rs.getMetaData ();

while (rs.next())
{
System.out.println (rs.getString(1));
}
rs.close();
stmt.close();
con.close ();
}catch (Exception e) {
System.out.println (e);
}

}
}
-----------------------------------------

and the following is the DEBUG output produced by modified postgresql jdbc
driver.
--------------------------------------------
F:\Software\Development\PostgreSQL\psqlJDBC\postgresql-jdbc-8.2-506.src>java
-classpath .;jars\postgresql.jar PgTest
21:35:44.937 (1) PostgreSQL 8.2 JDBC4 with SSL (build 506)
21:35:44.968 (1) Trying to establish a protocol version 3 connection to
192.168.1.6:5432
21:35:45.000 (1) FE=> StartupPacket(user=chineseusername中文用户名,
database=中文pg数据库, client_encoding=UNICODE, DateStyle=ISO)
21:35:45.015 (1) Using GBK charset encoding to encode parameters
21:35:45.031 (1) encodedParams hex dump:
21:35:45.031 (1) user-->75 73 65 72
21:35:45.031 (1) chineseusername中文用户名-->63 68 69 6e 65 73 65 75 73 65 72
6e 61 6d 65 d6 d0 ce c4 d3 c3 bb a7 c3 fb
21:35:45.031 (1) database-->64 61 74 61 62 61 73 65
21:35:45.046 (1) 中文pg数据库-->d6 d0 ce c4 70 67 ca fd be dd bf e2
21:35:45.046 (1) client_encoding-->63 6c 69 65 6e 74 5f 65 6e 63 6f 64 69
6e 67
21:35:45.046 (1) UNICODE-->55 4e 49 43 4f 44 45
21:35:45.046 (1) DateStyle-->44 61 74 65 53 74 79 6c 65
21:35:45.046 (1) ISO-->49 53 4f
21:35:46.031 (1) <=BE AuthenticationOk
21:35:46.046 (1) <=BE ParameterStatus(client_encoding = UNICODE)
21:35:46.046 (1) <=BE ParameterStatus(DateStyle = ISO, YMD)
21:35:46.046 (1) <=BE ParameterStatus(integer_datetimes = off)
21:35:46.062 (1) <=BE ParameterStatus(is_superuser = off)
21:35:46.062 (1) <=BE ParameterStatus(server_encoding = EUC_CN)
21:35:46.062 (1) <=BE ParameterStatus(server_version = 8.2.5)
21:35:46.062 (1) <=BE ParameterStatus(session_authorization =
chineseusername中文用户名)
21:35:46.062 (1) <=BE ParameterStatus(standard_conforming_strings = off)
21:35:46.062 (1) <=BE ParameterStatus(TimeZone = Asia/Hong_Kong)
21:35:46.062 (1) <=BE BackendKeyData(pid=3360,ckey=2007175982)
21:35:46.062 (1) <=BE ReadyForQuery(I)
21:35:46.078 (1) compatible = 8.2
21:35:46.078 (1) loglevel = 2
21:35:46.078 (1) prepare threshold = 5
getConnection returning
driver[className=org.postgresql.Driver,org(dot)postgresql(dot)Driver(at)1bf73fa]
21:35:46.140 (1) simple execute,
handler=org(dot)postgresql(dot)jdbc2(dot)AbstractJdbc2Statement$StatementResultHandler(at)f3d6a5,
maxR
ows=0, fetchSize=0, flags=17
21:35:46.140 (1) FE=> Parse(stmt=null,query="select version()",oids={})
21:35:46.140 (1) FE=> Bind(stmt=null,portal=null)
21:35:46.140 (1) FE=> Describe(portal=null)
21:35:46.156 (1) FE=> Execute(portal=null,limit=0)
21:35:46.156 (1) FE=> Sync
21:35:46.281 (1) <=BE ParseComplete [null]
21:35:46.281 (1) <=BE BindComplete [null]
21:35:46.281 (1) <=BE RowDescription(1)
21:35:46.281 (1) <=BE DataRow
21:35:46.281 (1) <=BE CommandStatus(SELECT)
21:35:46.296 (1) <=BE ReadyForQuery(I)
PostgreSQL 8.2.5 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2
(mingw-special)
21:35:46.312 (1) FE=> Terminate
--------------------------------------------

Oliver Jowett wrote:
>
> Tom Lane wrote:
>> =?UTF-8?Q?LiuYan_=E5=88=98=E7=A0=94?= <lovetide(at)21cn(dot)com> writes:
>>> I've also tried add '&charSet=GBK' in the jdbcURL, and got the same
>>> result.
>>
>> At a guess, the name of the database will have to be in UTF8. I doubt
>> that JDBC will think it should do any character set conversion on it.
>
> IIRC the last time I raised the issue of the encoding used for things
> like database names & usernames in the startup packet the answer boiled
> down to "don't use anything but 7-bit ASCII". So the driver sends those
> strings as 7-bit ASCII (i.e. String.getBytes("US-ASCII")). If you've got
> a database name or username that can't be represented using only 7-bit
> ASCII, you're out of luck.
>
> Has something changed here so that non-7-bit data in the startup packet
> will work?
>
> -O
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>
>

http://www.nabble.com/file/p13384678/postgresql-jdbc-8.2-506-modified%252Btest.zip
postgresql-jdbc-8.2-506-modified%2Btest.zip
--
View this message in context: http://www.nabble.com/Chinese-database-name-in-URL%2C-can-I---tf4678108.html#a13384678
Sent from the PostgreSQL - jdbc mailing list archive at Nabble.com.