encoding of PostgreSQL messages

Lists: pgsql-general
From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: encoding of PostgreSQL messages
Date: 2008-12-23 21:24:14
Message-ID: 20081223212414.GD3894@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi all !

How can I programmatically detect which encoding a
PostgreSQL server I am trying to connect to sends back
messages -- before I connect (so client_encoding and
the pg_settings table are flat out).

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-23 21:45:17
Message-ID: 20081223214517.GF6217@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Karsten Hilbert wrote:
> Hi all !
>
> How can I programmatically detect which encoding a
> PostgreSQL server I am trying to connect to sends back
> messages -- before I connect (so client_encoding and
> the pg_settings table are flat out).

Hmm, isn't client_encoding reported in the startup packet sent by the
server, after auth?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-23 22:17:42
Message-ID: 20081223221742.GF3894@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Tue, Dec 23, 2008 at 06:45:17PM -0300, Alvaro Herrera wrote:

> > How can I programmatically detect which encoding a
> > PostgreSQL server I am trying to connect to sends back
> > messages -- before I connect (so client_encoding and
> > the pg_settings table are flat out).
>
> Hmm, isn't client_encoding reported in the startup packet sent by the
> server, after auth?

That would not quite be enough -- I am talking about
messages reported *during* auth, say

FATAL: password authentication failed for user "postgres"

or

fe_sendauth: no password supplied

both of which, in other locales, may contain non-ASCII characters.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: John DeSoi <desoi(at)pgedit(dot)com>
To: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-24 04:10:28
Message-ID: 642262EA-CE75-48E0-8574-7BEADDAC5766@pgedit.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


On Dec 23, 2008, at 4:24 PM, Karsten Hilbert wrote:

> How can I programmatically detect which encoding a
> PostgreSQL server I am trying to connect to sends back
> messages -- before I connect (so client_encoding and
> the pg_settings table are flat out).

I don't think there is a way because you can't get any information
without authorizing first.

It is also unclear to me how things work in the other direction. When
authenticating, what if the user or database name have non-ascii
characters. Are they interpreted in the encoding of the server since
the client has not established an encoding?

John DeSoi, Ph.D.


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-29 12:07:14
Message-ID: 20081229120714.GB4545@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Karsten Hilbert wrote:
> On Tue, Dec 23, 2008 at 06:45:17PM -0300, Alvaro Herrera wrote:

> > Hmm, isn't client_encoding reported in the startup packet sent by the
> > server, after auth?
>
> That would not quite be enough -- I am talking about
> messages reported *during* auth, say
>
> FATAL: password authentication failed for user "postgres"
>
> or
>
> fe_sendauth: no password supplied
>
> both of which, in other locales, may contain non-ASCII characters.

Those are sent in the server encoding IIRC (which admittedly you don't
have a way to know, at that point.)

And I'm now wondering if we should delay initializing the translation
stuff until after client_encoding has been reported.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-31 09:46:56
Message-ID: 20081231094655.GA3599@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Mon, Dec 29, 2008 at 09:07:14AM -0300, Alvaro Herrera wrote:

> > That would not quite be enough -- I am talking about
> > messages reported *during* auth, say
> >
> > FATAL: password authentication failed for user "postgres"
> >
> > or
> >
> > fe_sendauth: no password supplied
> >
> > both of which, in other locales, may contain non-ASCII characters.
>
> Those are sent in the server encoding IIRC (which admittedly you don't
> have a way to know, at that point.)
>
> And I'm now wondering if we should delay initializing the translation
> stuff until after client_encoding has been reported.

Or else

- just don't pass those messages through gettext so they are
always in 7 bit ASCII English

- do pass them through gettext but append a 7-bit ASCII
filter so things do get passed in pseudo ASCII (this will
work for many singlebyte encodings but more often than
not for multibyte ones)

Both changes would be less intrusive than postponing the
translation.

If I had a choice I would opt for the first.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-31 16:23:06
Message-ID: 27204.1230740586@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net> writes:
> On Mon, Dec 29, 2008 at 09:07:14AM -0300, Alvaro Herrera wrote:
>> And I'm now wondering if we should delay initializing the translation
>> stuff until after client_encoding has been reported.

> Or else

> - just don't pass those messages through gettext so they are
> always in 7 bit ASCII English

What's the difference? The user-visible result would be the same
AFAICS. (One or the other might be less messy internally, but I'm
not sure which offhand.)

regards, tom lane


From: "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-31 16:57:29
Message-ID: 20081231165729.284240@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net> writes:
> > On Mon, Dec 29, 2008 at 09:07:14AM -0300, Alvaro Herrera wrote:
> >> And I'm now wondering if we should delay initializing the translation
> >> stuff until after client_encoding has been reported.
>
> > Or else
>
> > - just don't pass those messages through gettext so they are
> > always in 7 bit ASCII English
>
> What's the difference? The user-visible result would be the same
> AFAICS. (One or the other might be less messy internally, but I'm
> not sure which offhand.)

That was the reason for the suggestion: perhaps less messy and surely lower impact on the existing
code as it would not mean moving code later in the initialization but rather just removing the
gettext wrappers around a few strings. No difference in the result.

The difference to my other suggestion (no translation vs. translation but then replacing
characters > 127 by, say '?' or a space) is:

I could *assume* a given encoding, namely 7 bit ASCII. Or rather I could assume
that I can display the message as "something pretty similar to what the original message said,
perhaps without umlauts and accents but still recognizable in the local language".

Now, surely, I could dig down the layers to where "my application space" receives the message
from PostgreSQL and filter there. It is, however, good to have some knowledge of the encoding
where knowledge can be had.

The concrete problem is this: I connect to PostgreSQL from Python. Let's assume PG is set to German.
If the wrong password is supplied the PG error message string contains an umlaut. This is passed to
libpq, which in turn passes it to the C part of psycopg2 which then turns this into an exception. An
exception, by default in Python, is printed to the console, which may be in any encoding incompatible
with the latin1 the PG message happens to be in. Thus, printing the PG message may or may not fail
due to Unicode de-/encoding errors.

The solution is to find the right layer to take control of the encoding but this is eventually only possible
if the encoding is *known*. Thus the plea for "7-bit-ascii English by default until the encoding *can* be
known". Going to "7-bit-ascii filter of the original by default until the encoding can be known" only
tries to preserve a bit more of the original language. I may be wrong in feasibility.

Thanks for considering,
Karsten
--
Sensationsangebot verlängert: GMX FreeDSL - Telefonanschluss + DSL
für nur 16,37 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K1308T4569a


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Cc: "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-31 18:02:00
Message-ID: 200812312002.00865.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wednesday 31 December 2008 18:57:29 Karsten Hilbert wrote:
> The solution is to find the right layer to take control of the encoding but
> this is eventually only possible if the encoding is *known*. Thus the plea
> for "7-bit-ascii English by default until the encoding *can* be known".
> Going to "7-bit-ascii filter of the original by default until the encoding
> can be known" only tries to preserve a bit more of the original language. I
> may be wrong in feasibility.

The proper fix is probably to include the client encoding in the connection
startup message.


From: "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-31 18:16:14
Message-ID: 20081231181614.284240@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


> On Wednesday 31 December 2008 18:57:29 Karsten Hilbert wrote:
> > The solution is to find the right layer to take control of the encoding
> but
> > this is eventually only possible if the encoding is *known*. Thus the
> plea
> > for "7-bit-ascii English by default until the encoding *can* be known".
> > Going to "7-bit-ascii filter of the original by default until the
> encoding
> > can be known" only tries to preserve a bit more of the original
> language. I
> > may be wrong in feasibility.
>
> The proper fix is probably to include the client encoding in the
> connection startup message.

Absolutely, either the desired client encoding being sent to the server or the current
server encoding being sent to the client.

It would, however, take way longer to trickle down into being supported by client
interfaces such as psycopg2 <evil grin>

A fixed startup encoding would not need support from those parts of the equation.

Karsten
--
Psssst! Schon vom neuen GMX MultiMessenger gehört? Der kann`s mit allen: http://www.gmx.net/de/go/multimessenger


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-general(at)postgresql(dot)org, "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-31 18:23:47
Message-ID: 13211.1230747827@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On Wednesday 31 December 2008 18:57:29 Karsten Hilbert wrote:
>> The solution is to find the right layer to take control of the encoding but
>> this is eventually only possible if the encoding is *known*. Thus the plea
>> for "7-bit-ascii English by default until the encoding *can* be known".
>> Going to "7-bit-ascii filter of the original by default until the encoding
>> can be known" only tries to preserve a bit more of the original language. I
>> may be wrong in feasibility.

> The proper fix is probably to include the client encoding in the connection
> startup message.

What of errors occurring before such an option could be applied?

I think that ultimately it's necessary to accept that there will be some
window during connection startup where sending plain ASCII (English)
messages is the best recourse. I'm not sure what the best way to
implement that is. On reflection though, trying to mark the individual
messages that might need that treatment doesn't seem like a winner:
there's too much possibility for mistakes, or code drift causing a
marking to become wrong; and I'm not even convinced that the very same
source-code message might not fall into both categories at different
times. So having a mode switch somehow occurring inside elog.c once
we are ready to translate seems like the right approach.

regards, tom lane


From: "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-31 18:44:32
Message-ID: 20081231184432.284220@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> > The proper fix is probably to include the client encoding in the
> connection
> > startup message.
>
> What of errors occurring before such an option could be applied?
>
> I think that ultimately it's necessary to accept that there will be some
> window during connection startup where sending plain ASCII (English)
> messages is the best recourse. I'm not sure what the best way to
> implement that is. On reflection though, trying to mark the individual
> messages that might need that treatment doesn't seem like a winner:
> there's too much possibility for mistakes, or code drift causing a
> marking to become wrong; and I'm not even convinced that the very same
> source-code message might not fall into both categories at different
> times. So having a mode switch somehow occurring inside elog.c once
> we are ready to translate seems like the right approach.

Hm, so maybe both Peter and Alvaro are right:

1) Setting the translation wrapper to a NOOP as early as possible.

Thus, the first messages are sent in 7-bit ASCII English.

2) Including the server encoding in the very (?) first message sent to the
client *after* which ...

3) ... proper initialization of the gettext system can happen because now
the client can know what to make of them subsequent messages.

Thereby no strings should need special marking and proper translation or
not is achieved automatically ?

It is surely acceptable to see the very first messages in English and/or
7-bit ASCII - I just want to be able to know their encoding either by definition
or by being told.

Thanks,
Karsten
--
Sensationsangebot verlängert: GMX FreeDSL - Telefonanschluss + DSL
für nur 16,37 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K1308T4569a


From: "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2008-12-31 18:48:54
Message-ID: 20081231184854.284240@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> Hm, so maybe both Peter and Alvaro are right:
>
> 1) Setting the translation wrapper to a NOOP as early as possible.
>
> Thus, the first messages are sent in 7-bit ASCII English.

Despite being *marked* for translation and a translation
to exist in the .po file, that is.

Karsten
--
Sensationsangebot verlängert: GMX FreeDSL - Telefonanschluss + DSL
für nur 16,37 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K1308T4569a


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>
Subject: Re: encoding of PostgreSQL messages
Date: 2009-01-01 18:33:56
Message-ID: 200901012033.57020.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wednesday 31 December 2008 20:23:47 Tom Lane wrote:
> > The proper fix is probably to include the client encoding in the
> > connection startup message.
>
> What of errors occurring before such an option could be applied?

Connection errors are handled by the client, which knows the client encoding.
If the setting of the client encoding would be one of the first things to be
done on the server side, you would only have a handful of possible error
conditions left (e.g., setlocale failed, out of memory). You could choose to
report those in plain ASCII or send a special error code that the client can
resolve. Although I guess no one could fault us if "could not set language"
is reported not translated. ;-)

> I think that ultimately it's necessary to accept that there will be some
> window during connection startup where sending plain ASCII (English)
> messages is the best recourse.

Ultimately yes. But we currently handle the client encoding quite late in the
startup sequence so that many connection startup failure messages that are of
interest to normal users would likely be affected. So moving the client
encoding handling to the earliest possible phase would still be desirable.


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-01-07 19:15:50
Message-ID: 20090107191550.GE3833@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Bruce, et al,

given the thread partially quoted below would this warrant a
TODO item "improve communication of encoding between client
and server regarding early startup messages" ?

A very usable band-aid for 8.4 - short of a proper fix -
would be the minimal-invasive sending of messages in 7-bit
English until server_encoding can be retrieved by the client
by current means.

Thanks,
Karsten

On Thu, Jan 01, 2009 at 08:33:56PM +0200, Peter Eisentraut wrote:
> Subject: Re: [GENERAL] encoding of PostgreSQL messages
> User-Agent: KMail/1.9.9
>
> On Wednesday 31 December 2008 20:23:47 Tom Lane wrote:
> > > The proper fix is probably to include the client encoding in the
> > > connection startup message.
> >
> > What of errors occurring before such an option could be applied?
>
> Connection errors are handled by the client, which knows the client encoding.
> If the setting of the client encoding would be one of the first things to be
> done on the server side, you would only have a handful of possible error
> conditions left (e.g., setlocale failed, out of memory). You could choose to
> report those in plain ASCII or send a special error code that the client can
> resolve. Although I guess no one could fault us if "could not set language"
> is reported not translated. ;-)
>
> > I think that ultimately it's necessary to accept that there will be some
> > window during connection startup where sending plain ASCII (English)
> > messages is the best recourse.
>
> Ultimately yes. But we currently handle the client encoding quite late in the
> startup sequence so that many connection startup failure messages that are of
> interest to normal users would likely be affected. So moving the client
> encoding handling to the earliest possible phase would still be desirable.
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-01-22 01:10:32
Message-ID: 200901220110.n0M1AWY03839@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


Added to TODO:

Improve encoding of connection startup messages sent to the client

Currently some authentication error messages are sent in the server
encoding

* http://archives.postgresql.org/pgsql-general/2008-12/msg00801.php
* http://archives.postgresql.org/pgsql-general/2009-01/msg00005.php

---------------------------------------------------------------------------

Karsten Hilbert wrote:
> Bruce, et al,
>
> given the thread partially quoted below would this warrant a
> TODO item "improve communication of encoding between client
> and server regarding early startup messages" ?
>
> A very usable band-aid for 8.4 - short of a proper fix -
> would be the minimal-invasive sending of messages in 7-bit
> English until server_encoding can be retrieved by the client
> by current means.
>
> Thanks,
> Karsten
>
> On Thu, Jan 01, 2009 at 08:33:56PM +0200, Peter Eisentraut wrote:
> > Subject: Re: [GENERAL] encoding of PostgreSQL messages
> > User-Agent: KMail/1.9.9
> >
> > On Wednesday 31 December 2008 20:23:47 Tom Lane wrote:
> > > > The proper fix is probably to include the client encoding in the
> > > > connection startup message.
> > >
> > > What of errors occurring before such an option could be applied?
> >
> > Connection errors are handled by the client, which knows the client encoding.
> > If the setting of the client encoding would be one of the first things to be
> > done on the server side, you would only have a handful of possible error
> > conditions left (e.g., setlocale failed, out of memory). You could choose to
> > report those in plain ASCII or send a special error code that the client can
> > resolve. Although I guess no one could fault us if "could not set language"
> > is reported not translated. ;-)
> >
> > > I think that ultimately it's necessary to accept that there will be some
> > > window during connection startup where sending plain ASCII (English)
> > > messages is the best recourse.
> >
> > Ultimately yes. But we currently handle the client encoding quite late in the
> > startup sequence so that many connection startup failure messages that are of
> > interest to normal users would likely be affected. So moving the client
> > encoding handling to the earliest possible phase would still be desirable.
> >
> > --
> > Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
>
> --
> GPG key ID E4071346 @ wwwkeys.pgp.net
> E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-01-30 15:11:05
Message-ID: 49831889.2020103@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi,

This topic seems to be related to the bug report
[ODBC] Localized error messages, wrong charset
.

Bruce Momjian wrote:
> Added to TODO:
>
> Improve encoding of connection startup messages sent to the client
>
> Currently some authentication error messages are sent in the server
> encoding

It it true ?
IIRC the backend knows nothing about the server encoding in
authentication phase.

Psqlodbc Unicode driver sends connection startup message which
contains the client_encoding(=UTF8) guc parameter. Attached is
a trial patch so that the psqlodbc Unicode driver can get
properly localized password error messages.

regards,
Hiroshi Inoue

Attachment Content-Type Size
processStartup.diff text/plain 735 bytes

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-07 17:14:37
Message-ID: 200902071714.n17HEbg26162@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


Can someone comment on this?

---------------------------------------------------------------------------

Hiroshi Inoue wrote:
> Hi,
>
> This topic seems to be related to the bug report
> [ODBC] Localized error messages, wrong charset
> .
>
> Bruce Momjian wrote:
> > Added to TODO:
> >
> > Improve encoding of connection startup messages sent to the client
> >
> > Currently some authentication error messages are sent in the server
> > encoding
>
> It it true ?
> IIRC the backend knows nothing about the server encoding in
> authentication phase.
>
> Psqlodbc Unicode driver sends connection startup message which
> contains the client_encoding(=UTF8) guc parameter. Attached is
> a trial patch so that the psqlodbc Unicode driver can get
> properly localized password error messages.
>
> regards,
> Hiroshi Inoue

> Index: postmaster/postmaster.c
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
> retrieving revision 1.570
> diff -c -c -r1.570 postmaster.c
> *** postmaster/postmaster.c 4 Jan 2009 22:19:59 -0000 1.570
> --- postmaster/postmaster.c 30 Jan 2009 14:05:35 -0000
> ***************
> *** 1552,1557 ****
> --- 1552,1560 ----
> pstrdup(nameptr));
> port->guc_options = lappend(port->guc_options,
> pstrdup(valptr));
> + if (stricmp(nameptr, "client_encoding") == 0 &&
> + stricmp(valptr, "UTF8") == 0)
> + bind_textdomain_codeset(PG_TEXTDOMAIN("postgres"), "UTF-8");
> }
> offset = valoffset + strlen(valptr) + 1;
> }

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-08 09:25:56
Message-ID: 498EA524.9020004@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Bruce Momjian wrote:
> Can someone comment on this?

Looks like a horrible hack to me. Recoding stuff to the client encoding
in the server outside the existing recoding mechanism looks pretty evil
to me. Plus, it does not address the problem of what happens to
messages sent before this, it just moves the point of "before" a bit
earlier for some special cases.

I think we have discussed more proper solutions earlier in this thread.
IMO the best approach would be for the client to include the client
encoding in the startup package.

>
> ---------------------------------------------------------------------------
>
> Hiroshi Inoue wrote:
>> Hi,
>>
>> This topic seems to be related to the bug report
>> [ODBC] Localized error messages, wrong charset
>> .
>>
>> Bruce Momjian wrote:
>>> Added to TODO:
>>>
>>> Improve encoding of connection startup messages sent to the client
>>>
>>> Currently some authentication error messages are sent in the server
>>> encoding
>> It it true ?
>> IIRC the backend knows nothing about the server encoding in
>> authentication phase.
>>
>> Psqlodbc Unicode driver sends connection startup message which
>> contains the client_encoding(=UTF8) guc parameter. Attached is
>> a trial patch so that the psqlodbc Unicode driver can get
>> properly localized password error messages.
>>
>> regards,
>> Hiroshi Inoue
>
>> Index: postmaster/postmaster.c
>> ===================================================================
>> RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/postmaster.c,v
>> retrieving revision 1.570
>> diff -c -c -r1.570 postmaster.c
>> *** postmaster/postmaster.c 4 Jan 2009 22:19:59 -0000 1.570
>> --- postmaster/postmaster.c 30 Jan 2009 14:05:35 -0000
>> ***************
>> *** 1552,1557 ****
>> --- 1552,1560 ----
>> pstrdup(nameptr));
>> port->guc_options = lappend(port->guc_options,
>> pstrdup(valptr));
>> + if (stricmp(nameptr, "client_encoding") == 0 &&
>> + stricmp(valptr, "UTF8") == 0)
>> + bind_textdomain_codeset(PG_TEXTDOMAIN("postgres"), "UTF-8");
>> }
>> offset = valoffset + strlen(valptr) + 1;
>> }
>


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-08 12:24:32
Message-ID: 498ECF00.607@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Peter Eisentraut wrote:
> Bruce Momjian wrote:
>> Can someone comment on this?
>
> Looks like a horrible hack to me. Recoding stuff to the client encoding
> in the server outside the existing recoding mechanism looks pretty evil
> to me.

> Plus, it does not address the problem of what happens to
> messages sent before this, it just moves the point of "before" a bit
> earlier for some special cases.
>
> I think we have discussed more proper solutions earlier in this thread.
> IMO the best approach would be for the client to include the client
> encoding in the startup package.

??? My patch exactly references the client_encoding included in
the startup message.

regards,
Hiroshi Inoue


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-08 15:38:16
Message-ID: 19896.1234107496@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Bruce Momjian wrote:
>> Can someone comment on this?

> Looks like a horrible hack to me. Recoding stuff to the client encoding
> in the server outside the existing recoding mechanism looks pretty evil
> to me. Plus, it does not address the problem of what happens to
> messages sent before this, it just moves the point of "before" a bit
> earlier for some special cases.

> I think we have discussed more proper solutions earlier in this thread.
> IMO the best approach would be for the client to include the client
> encoding in the startup package.

Huh? Clients already do that (or at least some are capable of it,
including libpq). The hard problems are (1) there's still a "before",
ie we might fail before scanning the options in the packet, and (2)
the sent encoding might itself be invalid, and you still have to report
that somehow.

I believe the only real "fix" is to guarantee that messages are sent
as untranslated ASCII until we have sent an encoding indicator at
the end of the startup sequence. Which has its own pretty clear
downside: no more translation of authorization failures.

regards, tom lane


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-08 18:05:07
Message-ID: 20090208180507.GA3873@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Sun, Feb 08, 2009 at 10:38:16AM -0500, Tom Lane wrote:

> I believe the only real "fix" is to guarantee that messages are sent
> as untranslated ASCII until we have sent an encoding indicator at
> the end of the startup sequence. Which has its own pretty clear
> downside: no more translation of authorization failures.

I, for one, would be perfectly fine with this. It has
several advantages:

- it does allow to "know" the encoding right from the beginning

- it does not require client library changes (IOW the change does
not need to trickle down: server -> libpq -> psycopg2 -> GNUmed)

- the type of (auth) failure can be deduced from the (now always
English) message string which, again, allows for client-side
translation, if so desired

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-08 23:03:14
Message-ID: 498F64B2.8070401@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> Bruce Momjian wrote:
>>> Can someone comment on this?
>
>> Looks like a horrible hack to me. Recoding stuff to the client encoding
>> in the server outside the existing recoding mechanism looks pretty evil
>> to me. Plus, it does not address the problem of what happens to
>> messages sent before this, it just moves the point of "before" a bit
>> earlier for some special cases.
>
>> I think we have discussed more proper solutions earlier in this thread.
>> IMO the best approach would be for the client to include the client
>> encoding in the startup package.
>
> Huh? Clients already do that (or at least some are capable of it,
> including libpq).

Yes the psqlodbc driver has done it because protocol 3 allowed it
from the first.

> The hard problems are (1) there's still a "before",

Yes but isn't it an improvement that properly localized password error
or no database error etc can be seen? Currently I see unreadable
error messages for those cases via psqlodbc driver. The attatched
patch in my previous posting is an example to solve the problem.

regards,
Hirosh Inoue


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-11 00:15:44
Message-ID: 499218B0.1010107@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> Bruce Momjian wrote:
>>> Can someone comment on this?
>
>> I think we have discussed more proper solutions earlier in this thread.
>> IMO the best approach would be for the client to include the client
>> encoding in the startup package.
>
> Huh? Clients already do that (or at least some are capable of it,

> including libpq).

What is a recommended way to do it using libpq, via "options" parameter
or "PGCLIENTENCOIDNG" environment value?

> The hard problems are (1) there's still a "before",
> ie we might fail before scanning the options in the packet, and (2)
> the sent encoding might itself be invalid, and you still have to report
> that somehow.
>
> I believe the only real "fix" is to guarantee that messages are sent
> as untranslated ASCII until we have sent an encoding indicator at
> the end of the startup sequence. Which has its own pretty clear
> downside: no more translation of authorization failures.

I'm afraid I'm misunderstanding your point.
I'm thinking of the following steps in the backend code.

1.Set LC_MESSAGES to "C" until the client_encoding is
determined.
2.When a client_encoding is specifed in the startup
message, bind the corrsponding codeset to the
textdomain and set LC_MESSAGES to the specified one
in the startup message or restore the LC_MESSAGES
overridden by step 1 before authorization step.
Then we can see properly localized authorization
failure messages.

3.Reset LC_MESSAGES to the current one in Initialize
ClientEncoding() and unbind the codeset if necessary
in SetDatabaseEncoding().

Comments?

regards,
Hiroshi Inoue


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-11 01:04:17
Message-ID: 27818.1234314257@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
> I'm thinking of the following steps in the backend code.

> 1.Set LC_MESSAGES to "C" until the client_encoding is
> determined.
> 2.When a client_encoding is specifed in the startup
> message, bind the corrsponding codeset to the
> textdomain and set LC_MESSAGES to the specified one
> in the startup message or restore the LC_MESSAGES
> overridden by step 1 before authorization step.
> Then we can see properly localized authorization
> failure messages.

> 3.Reset LC_MESSAGES to the current one in Initialize
> ClientEncoding() and unbind the codeset if necessary
> in SetDatabaseEncoding().

Reflecting on the bigger picture ... I would imagine that the vast
majority of existing applications depend on client_encoding settings
that come from postgresql.conf, ALTER USER SET, ALTER DATABASE SET, or
just the default (== database encoding). I don't think a solution that
penalizes those cases and makes only the case of setting it via
PGCLIENTENCODING work nicely is going to make very many people happy.

Mind you, I don't really know how to do better, but I do see that the
case of client_encoding being specified in the startup message is
not going to help enough people to be particularly useful.

regards, tom lane


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-11 05:20:47
Message-ID: 4992602F.6070507@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane wrote:
> Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
>> I'm thinking of the following steps in the backend code.
>
>> 1.Set LC_MESSAGES to "C" until the client_encoding is
>> determined.
>> 2.When a client_encoding is specifed in the startup
>> message, bind the corrsponding codeset to the
>> textdomain and set LC_MESSAGES to the specified one
>> in the startup message or restore the LC_MESSAGES
>> overridden by step 1 before authorization step.
>> Then we can see properly localized authorization
>> failure messages.
>
>> 3.Reset LC_MESSAGES to the current one in Initialize
>> ClientEncoding() and unbind the codeset if necessary
>> in SetDatabaseEncoding().
>
> Reflecting on the bigger picture ... I would imagine that the vast
> majority of existing applications depend on client_encoding settings
> that come from postgresql.conf, ALTER USER SET, ALTER DATABASE SET, or
> just the default (== database encoding). I don't think a solution that
> penalizes those cases

Removing step 1 resolves the penalty. In the first place step 1
comes from your or Karsten's suggestion.

> and makes only the case of setting it via
> PGCLIENTENCODING work nicely is going to make very many people happy.
> Mind you, I don't really know how to do better, but I do see that the
> case of client_encoding being specified in the startup message is
> not going to help enough people to be particularly useful.

Maybe not enough currently because collaboration between the backend
and clients is needed to solve this problem ovbiously. The backend
should provide clients the way to specify the client_encoding on the
fly which can be applied to authorization failure messages. Then
clients which are eager to solve this problem would use the way.
Using the information in the startup message is almost unique way
to achieve it.

regards,
Hiroshi Inoue


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-11 07:05:56
Message-ID: 499278D4.7040802@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane wrote:
> Reflecting on the bigger picture ... I would imagine that the vast
> majority of existing applications depend on client_encoding settings
> that come from postgresql.conf, ALTER USER SET, ALTER DATABASE SET, or
> just the default (== database encoding). I don't think a solution that
> penalizes those cases and makes only the case of setting it via
> PGCLIENTENCODING work nicely is going to make very many people happy.

I don't have any survey data available, but I think this assessment is
semantically wrong. Usefully, the client encoding can come only from
the client, or be defaulted (and even that is semantically wrong). I
see the other cases as workarounds.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-11 07:07:12
Message-ID: 49927920.7020304@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane wrote:
> I believe the only real "fix" is to guarantee that messages are sent
> as untranslated ASCII until we have sent an encoding indicator at
> the end of the startup sequence. Which has its own pretty clear
> downside: no more translation of authorization failures.

We should process the client encoding sent in the startup package before
authentication.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-11 16:00:31
Message-ID: 10045.1234368031@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Tom Lane wrote:
>> Reflecting on the bigger picture ... I would imagine that the vast
>> majority of existing applications depend on client_encoding settings
>> that come from postgresql.conf, ALTER USER SET, ALTER DATABASE SET, or
>> just the default (== database encoding). I don't think a solution that
>> penalizes those cases and makes only the case of setting it via
>> PGCLIENTENCODING work nicely is going to make very many people happy.

> I don't have any survey data available, but I think this assessment is
> semantically wrong. Usefully, the client encoding can come only from
> the client, or be defaulted (and even that is semantically wrong).

In an ideal world, perhaps so, but do you deny my point that that's not
reality?

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>, Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-12 06:10:07
Message-ID: 200902120810.09272.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wednesday 11 February 2009 18:00:31 Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > Tom Lane wrote:
> >> Reflecting on the bigger picture ... I would imagine that the vast
> >> majority of existing applications depend on client_encoding settings
> >> that come from postgresql.conf, ALTER USER SET, ALTER DATABASE SET, or
> >> just the default (== database encoding). I don't think a solution that
> >> penalizes those cases and makes only the case of setting it via
> >> PGCLIENTENCODING work nicely is going to make very many people happy.
> >
> > I don't have any survey data available, but I think this assessment is
> > semantically wrong. Usefully, the client encoding can come only from
> > the client, or be defaulted (and even that is semantically wrong).
>
> In an ideal world, perhaps so, but do you deny my point that that's not
> reality?

I have never seen a setup where the client encoding did not come from the
default or the client (and the person who set it up knew what they were
doing). I don't think the other cases are worth optimizing.


From: "Paolo Saudin" <paolo(at)ecometer(dot)it>
To: <pgsql-general(at)postgresql(dot)org>
Subject: How to check if 2 series of data are equal
Date: 2009-02-12 08:06:41
Message-ID: 002c01c98ce8$d976e980$8c64bc80$@it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi,

I have 14 tables filled with meteorological data, one record per parameter per hour. The id field holds the parameter type (1=temperature, 2=humidity ...) My problem is that for short periods (maybe one week, one month) there are two stations with the same data, I mean the temperature of table1 is equal to the humidity of table3. I need to discover those cases.

I could pick one record in the first station and then compare it with the ones in the other tables for all the parameters at that particular date. If two records are equals (it probably happens) I must then check the next one in the timeserie. If the second record is equal too, then probably the two series may be equals and I must raise an alert from my application. Is there a better and faster way to perform such a check ?

-- tables
CREATE TABLE table1
(
fulldate timestamp,
id smallint NOT NULL,
meanvalue real
) WITH (OIDS=FALSE);
--.....................
--.....................
CREATE TABLE table14
(
fulldate timestamp,
id smallint NOT NULL,
meanvalue real
) WITH (OIDS=FALSE);
--
-- inserts
insert into table1(select
('2009-01-01'::timestamp + interval '1 hour' * s.a)::timestamp,
1::smallint, round(cast(random() as numeric), 1)::real
from generate_series(0,1000) as s(a)
);
--
insert into table2(select
('2009-01-01'::timestamp + interval '1 hour' * s.a)::timestamp,
1::smallint, round(cast(random() as numeric), 1)::real
from generate_series(0,1000) as s(a)
);
--
-- same data as table 1 -- MUST BE FOUND BY THE CKECK ROUTINE
insert into table3(select fulldate, id, meanvalue from table1);

Thank in advance,
Paolo Saudin


From: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-12 13:28:38
Message-ID: 49942406.1010608@tpf.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Tom Lane wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> Tom Lane wrote:
>>> Reflecting on the bigger picture ... I would imagine that the vast
>>> majority of existing applications depend on client_encoding settings
>>> that come from postgresql.conf, ALTER USER SET, ALTER DATABASE SET, or
>>> just the default (== database encoding). I don't think a solution that
>>> penalizes those cases and makes only

not only but also the JDBC driver or the ODBC driver sends the
startup packet including the client_encoding. Libpq can be changed
to allow *client_encoding=xxxxx* definition in conninfo.

> the case of setting it via
>>> PGCLIENTENCODING work nicely is going to make very many people happy.

Not a few libraries/applications issue SET client_encoding to ...
immediately after the connection was establised. What's wrong with
urging such clients to eliminate the SET commands and use the nice
feature of FE/BE procotol?

regards,
Hiroshi Inoue


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-12 13:57:28
Message-ID: 20090212135727.GB3786@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, Feb 12, 2009 at 10:28:38PM +0900, Hiroshi Inoue wrote:

> >>> Reflecting on the bigger picture ... I would imagine that the vast
> >>> majority of existing applications depend on client_encoding settings
> >>> that come from postgresql.conf, ALTER USER SET, ALTER DATABASE SET, or
> >>> just the default (== database encoding). I don't think a solution that
> >>> penalizes those cases and makes only

While I agree that it is desirable to find a solution which
does not break (or at least "allows to make not break") such
applications I tend to think that those applications are not
based on sound engineering with respect to this detail. A
clean solution - which allows for the unclean solution to
coexist - is very desirable.

One might introduce a GUC variable "pre_connect_encoding".
If not set it would default to 7-bit ascii as proposed but
can be set to either some explicit encoding or else to, say,
"deduce_from_environment" which would reinstate the current
behaviour. That would allow those who don't want to/cannot
fix client code to keep things working as before.

> not only but also the JDBC driver or the ODBC driver sends the
> startup packet including the client_encoding. Libpq can be changed
> to allow *client_encoding=xxxxx* definition in conninfo.
>
> > the case of setting it via
> >>> PGCLIENTENCODING work nicely is going to make very many people happy.
>
> Not a few libraries/applications issue SET client_encoding to ...
> immediately after the connection was establised. What's wrong with
> urging such clients to eliminate the SET commands and use the nice
> feature of FE/BE procotol?

Sounds sane to me as a developer using a Python PostgreSQL
adapter. It can be argued that it is *still* useful to have
a well known initial encoding (7-bit ascii) -- this would be
used for returning an error if the startup packet included
an invalid encoding ...

Also it would allow current application code to assume that
encoding until the encoding can be set by current means -
this would allow current application code to behave
correctly without having to wait for database adapter code
to support setting the encoding in the startup packet.

Or put another way: I could fix GNUmed properly NOW rather
than having to wait for psycopg2 to support setting the
client encoding in the startup packet.

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-02-12 14:01:23
Message-ID: 20090212140123.GC3786@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, Feb 11, 2009 at 02:20:47PM +0900, Hiroshi Inoue wrote:

> Tom Lane wrote:
> > Hiroshi Inoue <inoue(at)tpf(dot)co(dot)jp> writes:
> >> I'm thinking of the following steps in the backend code.
> >
> >> 1.Set LC_MESSAGES to "C" until the client_encoding is
> >> determined.

I have tried that but it didn't work out for some reason.

> Removing step 1 resolves the penalty. In the first place step 1
> comes from your or Karsten's suggestion.

Not quite. My suggestion was to not *translate* strings (and
assume 7-bit ascii) until the client encoding is known.

> Maybe not enough currently because collaboration between the backend
> and clients is needed to solve this problem ovbiously. The backend
> should provide clients the way to specify the client_encoding on the
> fly which can be applied to authorization failure messages. Then
> clients which are eager to solve this problem would use the way.
> Using the information in the startup message is almost unique way
> to achieve it.
Sounds good to me as far as I can see.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: Adrian Klaver <aklaver(at)comcast(dot)net>
To: pgsql-general(at)postgresql(dot)org
Cc: "Paolo Saudin" <paolo(at)ecometer(dot)it>
Subject: Re: How to check if 2 series of data are equal
Date: 2009-02-12 16:28:29
Message-ID: 200902120828.30292.aklaver@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thursday 12 February 2009 12:06:41 am Paolo Saudin wrote:
> Hi,
>
> I have 14 tables filled with meteorological data, one record per parameter
> per hour. The id field holds the parameter type (1=temperature, 2=humidity
> ...) My problem is that for short periods (maybe one week, one month) there
> are two stations with the same data, I mean the temperature of table1 is
> equal to the humidity of table3. I need to discover those cases.

Before I can start to answer this I need some clarification. How can temperature
and humidity be the same data?

>
> I could pick one record in the first station and then compare it with the
> ones in the other tables for all the parameters at that particular date. If
> two records are equals (it probably happens) I must then check the next one
> in the timeserie. If the second record is equal too, then probably the two
> series may be equals and I must raise an alert from my application. Is
> there a better and faster way to perform such a check ?
>
> -- tables
> CREATE TABLE table1
> (
> fulldate timestamp,
> id smallint NOT NULL,
> meanvalue real
> ) WITH (OIDS=FALSE);
> --.....................
> --.....................
> CREATE TABLE table14
> (
> fulldate timestamp,
> id smallint NOT NULL,
> meanvalue real
> ) WITH (OIDS=FALSE);
> --
> -- inserts
> insert into table1(select
> ('2009-01-01'::timestamp + interval '1 hour' * s.a)::timestamp,
> 1::smallint, round(cast(random() as numeric), 1)::real
> from generate_series(0,1000) as s(a)
> );
> --
> insert into table2(select
> ('2009-01-01'::timestamp + interval '1 hour' * s.a)::timestamp,
> 1::smallint, round(cast(random() as numeric), 1)::real
> from generate_series(0,1000) as s(a)
> );
> --
> -- same data as table 1 -- MUST BE FOUND BY THE CKECK ROUTINE
> insert into table3(select fulldate, id, meanvalue from table1);
>
>
> Thank in advance,
> Paolo Saudin

--
Adrian Klaver
aklaver(at)comcast(dot)net


From: "Paolo Saudin" <paolo(at)ecometer(dot)it>
To: "'Adrian Klaver'" <aklaver(at)comcast(dot)net>, <pgsql-general(at)postgresql(dot)org>
Subject: R: How to check if 2 series of data are equal
Date: 2009-02-12 17:44:37
Message-ID: 007101c98d39$9381f630$ba85e290$@it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

>-----Messaggio originale-----
>Da: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] Per conto di Adrian Klaver
>Inviato: giovedì 12 febbraio 2009 17.28
>A: pgsql-general(at)postgresql(dot)org
>Cc: Paolo Saudin
>Oggetto: Re: [GENERAL] How to check if 2 series of data are equal

>On Thursday 12 February 2009 12:06:41 am Paolo Saudin wrote:
>> Hi,
>>
>> I have 14 tables filled with meteorological data, one record per parameter
>> per hour. The id field holds the parameter type (1=temperature, 2=humidity
>> ...) My problem is that for short periods (maybe one week, one month) there
>> are two stations with the same data, I mean the temperature of table1 is
>> equal to the humidity of table3. I need to discover those cases.

>Before I can start to answer this I need some clarification. How can temperature
>and humidity be the same data?

Can be the same data ( and it is ) because of errors in the remote stations configurations.
The Stations and parameters IDs were mixed up resulting in same data in different tables ...

>>
>> I could pick one record in the first station and then compare it with the
>> ones in the other tables for all the parameters at that particular date. If
>> two records are equals (it probably happens) I must then check the next one
>> in the timeserie. If the second record is equal too, then probably the two
>> series may be equals and I must raise an alert from my application. Is
>> there a better and faster way to perform such a check ?
>>
>> -- tables
>> CREATE TABLE table1
>> (
>> fulldate timestamp,
>> id smallint NOT NULL,
>> meanvalue real
>> ) WITH (OIDS=FALSE);
>> --.....................
>> --.....................
>> CREATE TABLE table14
>> (
>> fulldate timestamp,
>> id smallint NOT NULL,
>> meanvalue real
>> ) WITH (OIDS=FALSE);
>> --
>> -- inserts
>> insert into table1(select
>> ('2009-01-01'::timestamp + interval '1 hour' * s.a)::timestamp,
>> 1::smallint, round(cast(random() as numeric), 1)::real
>> from generate_series(0,1000) as s(a)
>> );
>> --
>> insert into table2(select
>> ('2009-01-01'::timestamp + interval '1 hour' * s.a)::timestamp,
>> 1::smallint, round(cast(random() as numeric), 1)::real
>> from generate_series(0,1000) as s(a)
>> );
>> --
>> -- same data as table 1 -- MUST BE FOUND BY THE CKECK ROUTINE
>> insert into table3(select fulldate, id, meanvalue from table1);
>>
>>
>> Thank in advance,
>> Paolo Saudin

>--
>Adrian Klaver
>aklaver(at)comcast(dot)net

>--
>Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>To make changes to your subscription:
>http://www.postgresql.org/mailpref/pgsql-general


From: Adrian Klaver <aklaver(at)comcast(dot)net>
To: Paolo Saudin <paolo(at)ecometer(dot)it>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: R: How to check if 2 series of data are equal
Date: 2009-02-12 17:56:32
Message-ID: 1544341303.115741234461392851.JavaMail.root@sz0030a.emeryville.ca.mail.comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


----- "Paolo Saudin" <paolo(at)ecometer(dot)it> wrote:

> >-----Messaggio originale-----
> >Da: pgsql-general-owner(at)postgresql(dot)org
> [mailto:pgsql-general-owner(at)postgresql(dot)org] Per conto di Adrian
> Klaver
> >Inviato: giovedì 12 febbraio 2009 17.28
> >A: pgsql-general(at)postgresql(dot)org
> >Cc: Paolo Saudin
> >Oggetto: Re: [GENERAL] How to check if 2 series of data are equal
>
> >On Thursday 12 February 2009 12:06:41 am Paolo Saudin wrote:
> >> Hi,
> >>
> >> I have 14 tables filled with meteorological data, one record per
> parameter
> >> per hour. The id field holds the parameter type (1=temperature,
> 2=humidity
> >> ...) My problem is that for short periods (maybe one week, one
> month) there
> >> are two stations with the same data, I mean the temperature of
> table1 is
> >> equal to the humidity of table3. I need to discover those cases.
>
> >Before I can start to answer this I need some clarification. How can
> temperature
> >and humidity be the same data?
>
> Can be the same data ( and it is ) because of errors in the remote
> stations configurations.
> The Stations and parameters IDs were mixed up resulting in same data
> in different tables ...
>

I am afraid I more confused now. From the table schema the value is a real number only and has no units. As I understand the units designation lies in the id. If the ids are mixed up I can't see how it is possible to differentiate between a value of 25 that maybe degrees C or % relative humidity for instance. You are going to have to step me through this.

Adrian Klaver
aklaver(at)comcast(dot)net


From: Sam Mason <sam(at)samason(dot)me(dot)uk>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: How to check if 2 series of data are equal
Date: 2009-02-12 18:17:02
Message-ID: 20090212181702.GD32672@frubble.xen.chris-lamb.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, Feb 12, 2009 at 09:06:41AM +0100, Paolo Saudin wrote:
> I have 14 tables filled with meteorological data, one record
> per parameter per hour. The id field holds the parameter type
> (1=temperature, 2=humidity ...) My problem is that for short periods
> (maybe one week, one month) there are two stations with the same data,
> I mean the temperature of table1 is equal to the humidity of table3. I
> need to discover those cases.

I'm assuming that it's one table per sensor station below.

Would something like this work:

SELECT station, fulldate,
COUNT(CASE WHEN sensor = 1 THEN 1 END) AS num_sensor1_readings,
COUNT(CASE WHEN sensor = 2 THEN 1 END) AS num_sensor2_readings,
COUNT(CASE WHEN sensor = 3 THEN 1 END) AS num_sensor3_readings
FROM (
SELECT 1 AS station, fulldate, id AS sensor FROM table1 UNION ALL
SELECT 2, fulldate, id FROM table2 UNION ALL
SELECT 3, fulldate, id FROM table 3) x
GROUP BY station, fulldate
ORDER BY station, fulldate;

That way you'll get a list of all the duplicate values. If the
"fulldate" column is the actual time it was received and isn't
truncated off to the nearest hour, you will probably want to use
date_trunc('hour',fulldate) in the outer select.

Hope that helps!

--
Sam http://samason.me.uk/


From: "Paolo Saudin" <paolo(at)ecometer(dot)it>
To: "'Adrian Klaver'" <aklaver(at)comcast(dot)net>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: R: R: How to check if 2 series of data are equal
Date: 2009-02-12 19:37:37
Message-ID: 007501c98d49$5c9b9940$15d2cbc0$@it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

>-----Messaggio originale-----
>Da: pgsql-general-owner(at)postgresql(dot)org [mailto:pgsql-general-owner(at)postgresql(dot)org] Per conto di Adrian Klaver
>Inviato: giovedì 12 febbraio 2009 18.57
>A: Paolo Saudin
>Cc: pgsql-general(at)postgresql(dot)org
>Oggetto: Re: R: [GENERAL] How to check if 2 series of data are equal

>----- "Paolo Saudin" <paolo(at)ecometer(dot)it> wrote:

> >-----Messaggio originale-----
> >Da: pgsql-general-owner(at)postgresql(dot)org
> [mailto:pgsql-general-owner(at)postgresql(dot)org] Per conto di Adrian
> Klaver
> >Inviato: giovedì 12 febbraio 2009 17.28
> >A: pgsql-general(at)postgresql(dot)org
> >Cc: Paolo Saudin
> >Oggetto: Re: [GENERAL] How to check if 2 series of data are equal
>
> >On Thursday 12 February 2009 12:06:41 am Paolo Saudin wrote:
> >> Hi,
> >>
> >> I have 14 tables filled with meteorological data, one record per
> parameter
> >> per hour. The id field holds the parameter type (1=temperature,
> 2=humidity
> >> ...) My problem is that for short periods (maybe one week, one
> month) there
> >> are two stations with the same data, I mean the temperature of
> table1 is
> >> equal to the humidity of table3. I need to discover those cases.
>
> >Before I can start to answer this I need some clarification. How can
> temperature
> >and humidity be the same data?
>
> Can be the same data ( and it is ) because of errors in the remote
> stations configurations.
> The Stations and parameters IDs were mixed up resulting in same data
> in different tables ...
>

>I am afraid I more confused now. From the table schema the value is a real number only and has no units. As I understand the units >designation lies in the id. If the ids are mixed up I can't see how it is possible to differentiate between a value of 25 that maybe >degrees C or % relative humidity for instance. You are going to have to step me through this.

Yes, the parameter is defined by the id and stored in another table with the name, units and other properties. I need to find out a sequence of meanvalues (without taking care of ids) which exists in another table

Here is some sample data, I need to found out if some sequence of data in table1 is equal to data in table2, table3 ... tableN.

Table1
fulldate, id, meanvalue
2009-01-01 00:00:00, 1, 12.3 -- temperature
2009-01-01 01:00:00, 1, 12.5
2009-01-01 02:00:00, 1, 12.6
2009-01-01 03:00:00, 1, 12.7
2009-01-01 04:00:00, 1, 12.8
2009-01-01 05:00:00, 1, 12.2

Table1
fulldate, id, meanvalue
2009-01-01 00:00:00, 2, 80.3 -- humidity
2009-01-01 01:00:00, 2, 81.6
2009-01-01 02:00:00, 2, 82.1
2009-01-01 03:00:00, 2, 79.8
2009-01-01 04:00:00, 2, 77.2
2009-01-01 05:00:00, 2, 77.1
------------------------------------------------------------------

Table2
fulldate, id, meanvalue
2009-01-01 00:00:00, 1, 12.3 -- temperature
2009-01-01 01:00:00, 1, 11.8
2009-01-01 02:00:00, 1, 82.1 ! WRONG DATA - SAME AS Table2 id->2
2009-01-01 03:00:00, 1, 79.8 ! WRONG DATA - SAME AS Table2 id->2
2009-01-01 04:00:00, 1, 77.2 ! WRONG DATA - SAME AS Table2 id->2
2009-01-01 05:00:00, 1, 13.1

Table2
fulldate, id, meanvalue
2009-01-01 00:00:00, 2, 78.9 -- humidity
2009-01-01 01:00:00, 2, 76.4
2009-01-01 02:00:00, 2, 74.7
2009-01-01 03:00:00, 2, 73.1
2009-01-01 04:00:00, 2, 71.6
2009-01-01 05:00:00, 1, 70.8

Hope this might help,
Paolo Saudin

>Adrian Klaver
>aklaver(at)comcast(dot)net


From: Adrian Klaver <aklaver(at)comcast(dot)net>
To: "Paolo Saudin" <paolo(at)ecometer(dot)it>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: R: R: How to check if 2 series of data are equal
Date: 2009-02-12 22:21:39
Message-ID: 200902121421.40011.aklaver@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thursday 12 February 2009 11:37:37 am Paolo Saudin wrote:

> >
> > Can be the same data ( and it is ) because of errors in the remote
> > stations configurations.
> > The Stations and parameters IDs were mixed up resulting in same data
> > in different tables ...
> >
> >
> >I am afraid I more confused now. From the table schema the value is a real
> > number only and has no units. As I understand the units >designation lies
> > in the id. If the ids are mixed up I can't see how it is possible to
> > differentiate between a value of 25 that maybe >degrees C or % relative
> > humidity for instance. You are going to have to step me through this.
>
> Yes, the parameter is defined by the id and stored in another table with
> the name, units and other properties. I need to find out a sequence of
> meanvalues (without taking care of ids) which exists in another table
>
> Here is some sample data, I need to found out if some sequence of data in
> table1 is equal to data in table2, table3 ... tableN.
>
> Table1
> fulldate, id, meanvalue
> 2009-01-01 00:00:00, 1, 12.3 -- temperature
> 2009-01-01 01:00:00, 1, 12.5
> 2009-01-01 02:00:00, 1, 12.6
> 2009-01-01 03:00:00, 1, 12.7
> 2009-01-01 04:00:00, 1, 12.8
> 2009-01-01 05:00:00, 1, 12.2
>
> Table1
> fulldate, id, meanvalue
> 2009-01-01 00:00:00, 2, 80.3 -- humidity
> 2009-01-01 01:00:00, 2, 81.6
> 2009-01-01 02:00:00, 2, 82.1
> 2009-01-01 03:00:00, 2, 79.8
> 2009-01-01 04:00:00, 2, 77.2
> 2009-01-01 05:00:00, 2, 77.1
> ------------------------------------------------------------------
>
> Table2
> fulldate, id, meanvalue
> 2009-01-01 00:00:00, 1, 12.3 -- temperature
> 2009-01-01 01:00:00, 1, 11.8
> 2009-01-01 02:00:00, 1, 82.1 ! WRONG DATA - SAME AS Table2 id->2
> 2009-01-01 03:00:00, 1, 79.8 ! WRONG DATA - SAME AS Table2 id->2
> 2009-01-01 04:00:00, 1, 77.2 ! WRONG DATA - SAME AS Table2 id->2
> 2009-01-01 05:00:00, 1, 13.1

I am going to assume you mean Table1 above.

>
> Table2
> fulldate, id, meanvalue
> 2009-01-01 00:00:00, 2, 78.9 -- humidity
> 2009-01-01 01:00:00, 2, 76.4
> 2009-01-01 02:00:00, 2, 74.7
> 2009-01-01 03:00:00, 2, 73.1
> 2009-01-01 04:00:00, 2, 71.6
> 2009-01-01 05:00:00, 1, 70.8
>
> Hope this might help,
> Paolo Saudin

I modified Sams query-

SELECT fulldate,sensor
FROM (SELECT fulldate,sensor,count(sensor)
FROM (SELECT 1 AS station, fulldate, meanvalue AS sensor FROM table1
UNION
SELECT 2, fulldate, meanvalue FROM table2 ORDER BY
fulldate,sensor) AS x
GROUP BY fulldate,sensor) AS y
WHERE y.count>1;

and got-

fulldate | sensor
---------------------+--------
2009-01-01 00:00:00 | 12.3
2009-01-01 02:00:00 | 82.1
2009-01-01 03:00:00 | 79.8
2009-01-01 04:00:00 | 77.2

Though I think you might want to deal with the remote sensor problem first. I
would be hesitant to trust any of the data. Just a thought.

--
Adrian Klaver
aklaver(at)comcast(dot)net


From: Octavio Alvarez <alvarezp(at)alvarezp(dot)ods(dot)org>
To: Paolo Saudin <paolo(at)ecometer(dot)it>
Cc: Adrian Klaver <aklaver(at)comcast(dot)net>, pgsql-general(at)postgresql(dot)org
Subject: Re: R: R: How to check if 2 series of data are equal
Date: 2009-02-13 04:46:20
Message-ID: 1234500380.23861.4.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Thu, 2009-02-12 at 14:21 -0800, Adrian Klaver wrote:
> On Thursday 12 February 2009 11:37:37 am Paolo Saudin wrote:
>

>
> SELECT fulldate,sensor
> FROM (SELECT fulldate,sensor,count(sensor)
> FROM (SELECT 1 AS station, fulldate, meanvalue AS sensor FROM table1
> UNION
> SELECT 2, fulldate, meanvalue FROM table2 ORDER BY
> fulldate,sensor) AS x
> GROUP BY fulldate,sensor) AS y
> WHERE y.count>1;
>
>
> and got-
>
> fulldate | sensor
> ---------------------+--------
> 2009-01-01 00:00:00 | 12.3
> 2009-01-01 02:00:00 | 82.1
> 2009-01-01 03:00:00 | 79.8
> 2009-01-01 04:00:00 | 77.2

You might want to do a UNION with all your tables to add a field to
identify the table, and use min() and max() to show the offending
tables.


From: "Paolo Saudin" <paolo(at)ecometer(dot)it>
To: "'Adrian Klaver'" <aklaver(at)comcast(dot)net>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: R: R: R: How to check if 2 series of data are equal
Date: 2009-02-13 07:18:38
Message-ID: 000f01c98dab$4acbe410$e063ac30$@it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

-----Messaggio originale-----
Da: Adrian Klaver [mailto:aklaver(at)comcast(dot)net]
Inviato: giovedì 12 febbraio 2009 23.22
A: Paolo Saudin
Cc: pgsql-general(at)postgresql(dot)org
Oggetto: Re: R: R: [GENERAL] How to check if 2 series of data are equal

On Thursday 12 February 2009 11:37:37 am Paolo Saudin wrote:

> >
> > Can be the same data ( and it is ) because of errors in the remote
> > stations configurations.
> > The Stations and parameters IDs were mixed up resulting in same data
> > in different tables ...
> >
> >
> >I am afraid I more confused now. From the table schema the value is a real
> > number only and has no units. As I understand the units >designation lies
> > in the id. If the ids are mixed up I can't see how it is possible to
> > differentiate between a value of 25 that maybe >degrees C or % relative
> > humidity for instance. You are going to have to step me through this.
>
> Yes, the parameter is defined by the id and stored in another table with
> the name, units and other properties. I need to find out a sequence of
> meanvalues (without taking care of ids) which exists in another table
>
> Here is some sample data, I need to found out if some sequence of data in
> table1 is equal to data in table2, table3 ... tableN.
>
> Table1
> fulldate, id, meanvalue
> 2009-01-01 00:00:00, 1, 12.3 -- temperature
> 2009-01-01 01:00:00, 1, 12.5
> 2009-01-01 02:00:00, 1, 12.6
> 2009-01-01 03:00:00, 1, 12.7
> 2009-01-01 04:00:00, 1, 12.8
> 2009-01-01 05:00:00, 1, 12.2
>
> Table1
> fulldate, id, meanvalue
> 2009-01-01 00:00:00, 2, 80.3 -- humidity
> 2009-01-01 01:00:00, 2, 81.6
> 2009-01-01 02:00:00, 2, 82.1
> 2009-01-01 03:00:00, 2, 79.8
> 2009-01-01 04:00:00, 2, 77.2
> 2009-01-01 05:00:00, 2, 77.1
> ------------------------------------------------------------------
>
> Table2
> fulldate, id, meanvalue
> 2009-01-01 00:00:00, 1, 12.3 -- temperature
> 2009-01-01 01:00:00, 1, 11.8
> 2009-01-01 02:00:00, 1, 82.1 ! WRONG DATA - SAME AS Table2 id->2
> 2009-01-01 03:00:00, 1, 79.8 ! WRONG DATA - SAME AS Table2 id->2
> 2009-01-01 04:00:00, 1, 77.2 ! WRONG DATA - SAME AS Table2 id->2
> 2009-01-01 05:00:00, 1, 13.1

I am going to assume you mean Table1 above.

>
> Table2
> fulldate, id, meanvalue
> 2009-01-01 00:00:00, 2, 78.9 -- humidity
> 2009-01-01 01:00:00, 2, 76.4
> 2009-01-01 02:00:00, 2, 74.7
> 2009-01-01 03:00:00, 2, 73.1
> 2009-01-01 04:00:00, 2, 71.6
> 2009-01-01 05:00:00, 1, 70.8
>
> Hope this might help,
> Paolo Saudin

>I modified Sams query-
>
>
>SELECT fulldate,sensor
> FROM (SELECT fulldate,sensor,count(sensor)
> FROM (SELECT 1 AS station, fulldate, meanvalue AS sensor FROM table1
>UNION
> SELECT 2, fulldate, meanvalue FROM table2 ORDER BY
>fulldate,sensor) AS x
>GROUP BY fulldate,sensor) AS y
>WHERE y.count>1;
>
>
>and got-
>
> fulldate | sensor
>---------------------+--------
> 2009-01-01 00:00:00 | 12.3
> 2009-01-01 02:00:00 | 82.1
> 2009-01-01 03:00:00 | 79.8
> 2009-01-01 04:00:00 | 77.2

Thank you very much to you all, this one works perfectly !!

>Though I think you might want to deal with the remote sensor problem first. I
>would be hesitant to trust any of the data. Just a thought.

The problem has been fixed and does not happens any more. Unfortunately there are 14 years (1992-2006) in which data could be corrupted for short periods. Now I must found them out ...

Thanks once more

Paolo Saudin

>--
>Adrian Klaver
>aklaver(at)comcast(dot)net


From: "Hiroshi Saito" <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
To: "Hiroshi Inoue" <inoue(at)tpf(dot)co(dot)jp>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: encoding of PostgreSQL messages
Date: 2009-03-02 15:35:37
Message-ID: 044CE65303C345CB9176F6EFB343F062@HIRO57887DE653
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi.

Sorry very late reaction.
I desire problem solution.Therefore, one evidence....

I tried jdbc program.
http://winpg.jp/~saito/pg_work/LC_MESSAGE_CHECK/connect_problem/jdbctestx.java
C:\home\HIROSHI>java jdbctestx
org.postgresql.util.PSQLException: FATAL: ???[??"postgres"??????????????
at org.postgresql.core.v3.ConnectionFactoryImpl.readStartupMessages(Conn
ectionFactoryImpl.java:444)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(Conne
ctionFactoryImpl.java:99)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactor
y.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Conn
ection.java:124)
at org.postgresql.jdbc2.Jdbc2Connection.<init>(Jdbc2Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:386)
at org.postgresql.Driver.connect(Driver.java:260)
at java.sql.DriverManager.getConnection(Unknown Source)
at java.sql.DriverManager.getConnection(Unknown Source)
at jdbctestx.main(jdbctestx.java:21)

Then, server log is this.
http://winpg.jp/~saito/pg_work/LC_MESSAGE_CHECK/connect_problem/server.log

Regards,
Hiroshi Saito

----- Original Message -----
From: "Hiroshi Inoue" <inoue(at)tpf(dot)co(dot)jp>

> Tom Lane wrote:
>> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>>> Bruce Momjian wrote:
>>>> Can someone comment on this?
>>
>>> Looks like a horrible hack to me. Recoding stuff to the client encoding
>>> in the server outside the existing recoding mechanism looks pretty evil
>>> to me. Plus, it does not address the problem of what happens to
>>> messages sent before this, it just moves the point of "before" a bit
>>> earlier for some special cases.
>>
>>> I think we have discussed more proper solutions earlier in this thread.
>>> IMO the best approach would be for the client to include the client
>>> encoding in the startup package.
>>
>> Huh? Clients already do that (or at least some are capable of it,
>> including libpq).
>
> Yes the psqlodbc driver has done it because protocol 3 allowed it
> from the first.
>
>> The hard problems are (1) there's still a "before",
>
> Yes but isn't it an improvement that properly localized password error
> or no database error etc can be seen? Currently I see unreadable
> error messages for those cases via psqlodbc driver. The attatched
> patch in my previous posting is an example to solve the problem.
>
> regards,
> Hirosh Inoue
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: encoding of PostgreSQL messages
Date: 2009-03-04 19:59:57
Message-ID: 20090304195957.GL3770@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Tue, Mar 03, 2009 at 12:35:37AM +0900, Hiroshi Saito wrote:

> Sorry very late reaction.
> I desire problem solution.
So do I :-)

Ganbatte !

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346


From: "Hiroshi Saito" <z-saito(at)guitar(dot)ocn(dot)ne(dot)jp>
To: "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>, <inoue(at)tpf(dot)co(dot)jp>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: encoding of PostgreSQL messages
Date: 2009-03-05 17:17:17
Message-ID: 1FE2D6AE3FA34FCAB1EEC3DD10C2E0E9@HIRO57887DE653
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi. Karsten-san.

Yeah, It was a problem unsolvable by the driver to relay.
although perseverance keeping without giving up!

--
arigatougozaimasu:-)

Regards,
Hiroshi Saito

----- Original Message -----
From: "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>

> On Tue, Mar 03, 2009 at 12:35:37AM +0900, Hiroshi Saito wrote:
>
>> Sorry very late reaction.
>> I desire problem solution.
> So do I :-)
>
> Ganbatte !
>
> Karsten
> --
> GPG key ID E4071346 @ wwwkeys.pgp.net
> E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: (problem with) encoding of (early-in-connect) PostgreSQL messages
Date: 2009-03-10 11:39:31
Message-ID: 20090310113931.GO3885@merkur.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Is there any chance of this being considered for 8.4 ?
Unfortunately I cannot provide patches myself as my
knowledge of C is next to nothing.

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346