Re: UTF8 national character data type support WIP patch and list of open issues.

From: "MauMau" <maumau307(at)gmail(dot)com>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: <robertmhaas(at)gmail(dot)com>, "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <maksymb(at)fast(dot)au(dot)fujitsu(dot)com>, <hlinnakangas(at)vmware(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UTF8 national character data type support WIP patch and list of open issues.
Date: 2013-09-24 12:04:14
Message-ID: 7DCFAE8265254605840917B35EEF09F1@maumau
Lists: pgsql-hackers

From: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
> That assumes that the conversion client encoding -> server encoding ->
> NCHAR encoding is not lossy.

Yes, so Tatsuo-san suggested restricting the server encoding <-> NCHAR encoding
combinations to those with lossless conversion.
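
As a rough illustration of what "lossless" means here, the following Python
sketch checks whether text survives a conversion into an encoding and back.
It is not PostgreSQL code: the helper name round_trips is hypothetical, and
Python's codecs merely stand in for the server's conversion routines, whose
mapping tables may differ in detail.

```python
def round_trips(text: str, encoding: str) -> bool:
    """Return True if text survives conversion to `encoding` and back."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        # The encoding has no representation for some character.
        return False


japanese = "漢字"
korean = "한국어"

# Japanese text fits in EUC_JP, so passing through it loses nothing.
print(round_trips(japanese, "euc_jp"))
# Hangul is not part of EUC_JP, so this path would be lossy.
print(round_trips(korean, "euc_jp"))
# UTF-8 can represent both.
print(round_trips(korean, "utf-8"))
```

Under these assumptions, Korean text sent through an EUC_JP server encoding
on its way to a Unicode NCHAR column would not arrive intact, which is the
concern raised above.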

> I thought one main point of this exercise
> was to avoid these conversions and be able to go straight from client
> encoding into NCHAR.

It's slightly different. Please see the following excerpt:

http://www.postgresql.org/message-id/B1A7485194DE4FDAB8FA781AFB570079@maumau

"4. I guess some users really want to continue to use ShiftJIS or EUC_JP for
database encoding, and use NCHAR for a limited set of columns to store
international text in Unicode:
- to avoid code conversion between the server and the client for performance
- because ShiftJIS and EUC_JP require less storage (2 bytes for
most Kanji) than UTF-8 (3 bytes)
This use case is described in chapter 6 of "Oracle Database Globalization
Support Guide"."
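
The storage difference mentioned in the excerpt can be checked with a small
Python sketch. The sample string is my own choice, and Python's codec names
(shift_jis, euc_jp, utf-8) are used here only as stand-ins for the
corresponding PostgreSQL encodings.

```python
# Two common Kanji; each is in JIS X 0208, so the legacy encodings use 2 bytes.
text = "漢字"

sizes = {enc: len(text.encode(enc)) for enc in ("shift_jis", "euc_jp", "utf-8")}

for enc, size in sizes.items():
    print(f"{enc}: {size} bytes for {len(text)} characters")
```

For this sample, the two Japanese encodings need 4 bytes and UTF-8 needs 6.
Characters outside JIS X 0208 can take more than 2 bytes in EUC_JP, so the
saving applies to "most Kanji", as the excerpt says.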

Regards
MauMau
