Re: PQescapeBytea is not multibyte aware

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: PQescapeBytea is not multibyte aware
Date: 2002-04-05 06:24:53
Message-ID: 20020405152453A.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-patches

PQescapeBytea() is not multibyte aware and will produce bad multibyte
character sequences. Example:

INSERT INTO t1(bytea_col) VALUES('characters produced by
PQescapebytea');
ERROR: Invalid EUC_JP character sequence found (0x8950)

I think 0x89 should be converted to '\\211' since 0x89 of 0x8950 is
considered as "non printable characters".

Any objection?
--
Tatsuo Ishii


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 15:18:58
Message-ID: 24188.1018019938@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> writes:
> PQescapeBytea() is not multibyte aware and will produce bad multibyte
> character sequences. Example:
> I think 0x89 should be converted to '\\211' since 0x89 of 0x8950 is
> considered as "non printable characters".

Hmm, so essentially we'd have to convert all codes >= 0x80 to prevent
them from being mistaken for parts of multibyte sequences? Ugh, but
you're probably right. It looks to me like byteaout does the reverse
already.

regards, tom lane
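
The proposal above amounts to a small change in the escaping loop. Here is a minimal sketch, assuming a hypothetical helper escape_bytea_sketch() (this is not libpq's actual PQescapeBytea). It emits the five-character form \\ooo for every byte outside printable ASCII, which includes all bytes >= 0x80:

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not libpq's PQescapeBytea.
 * Returns a malloc'd, NUL-terminated escaped copy of the input,
 * or NULL if memory cannot be allocated.
 */
static char *escape_bytea_sketch(const unsigned char *in, size_t len)
{
    /* Worst case: every input byte becomes the 5-character form \\ooo. */
    char   *out = malloc(len * 5 + 1);
    char   *p = out;
    size_t  i;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++)
    {
        unsigned char c = in[i];

        if (c < 0x20 || c >= 0x7f)
        {
            /* Non-printable or high-bit byte: two backslashes plus octal. */
            p += sprintf(p, "\\\\%03o", (unsigned int) c);
        }
        else if (c == '\'')
        {
            /* Single quote: backslash-quote. */
            *p++ = '\\';
            *p++ = '\'';
        }
        else if (c == '\\')
        {
            /* Backslash: four backslashes. */
            memcpy(p, "\\\\\\\\", 4);
            p += 4;
        }
        else
            *p++ = (char) c;
    }
    *p = '\0';
    return out;
}
```

With the two bytes 0x89 0x50 from the error message, this sketch produces \\211 followed by P, so no byte with the high bit set reaches the server. The caller frees the result with free().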


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 16:16:01
Message-ID: 3CADCDC1.1020908@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> writes:
>
>>PQescapeBytea() is not multibyte aware and will produce bad multibyte
>>character sequences. Example:
>>I think 0x89 should be converted to '\\211' since 0x89 of 0x8950 is
>>considered as "non printable characters".
>
>
> Hmm, so essentially we'd have to convert all codes >= 0x80 to prevent
> them from being mistaken for parts of multibyte sequences? Ugh, but
> you're probably right. It looks to me like byteaout does the reverse
> already.
>

But the error comes from pg_verifymbstr. Since bytea has no encoding
(it's just an array of bytes after all), why does pg_verifymbstr get
applied at all to bytea data?

pg_verifymbstr is called by textin, bpcharin, and varcharin. Would it
help to rewrite this as:

INSERT INTO t1(bytea_col) VALUES('characters produced by
PQescapebytea'::bytea);
?

Joe
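
To illustrate why a verifier of this kind rejects 0x8950, here is a deliberately simplified sketch. It is not the backend's pg_verifymbstr; the function name and the rule it applies (any byte with the high bit set starts a two-byte character whose second byte must also have the high bit set) are assumptions made only for illustration:

```c
#include <stddef.h>

/*
 * Hypothetical, simplified EUC-style check; not the backend's
 * pg_verifymbstr. Assumed rule: a byte with the high bit set starts a
 * two-byte character, and its second byte must also have the high bit
 * set. Returns 1 if the buffer passes, 0 otherwise.
 */
static int verify_euc_sketch(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        if (s[i] & 0x80)
        {
            /* Lead byte: needs a trailing byte that also has the high bit. */
            if (i + 1 >= len || !(s[i + 1] & 0x80))
                return 0;
            i += 2;
        }
        else
            i++;                /* plain ASCII byte */
    }
    return 1;
}
```

Under that rule, 0x89 is taken as a lead byte and 0x50 ('P') fails as its trail byte, which matches the shape of the reported error. The octal-escaped form consists only of ASCII bytes and passes.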


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 16:32:35
Message-ID: 24708.1018024355@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> But the error comes from pg_verifymbstr. Since bytea has no encoding
> (it's just an array of bytes after all), why does pg_verifymbstr get
> applied at all to bytea data?

Because textin() is used for the initial conversion to an "unknown"
constant --- see make_const() in parse_node.c.

> pg_verifymbstr is called by textin, bpcharin, and varcharin. Would it
> help to rewrite this as:

> INSERT INTO t1(bytea_col) VALUES('characters produced by
> PQescapebytea'::bytea);

Probably that would cause the error to disappear, but it's hardly a
desirable answer.

I wonder whether this says that TEXT is not a good implementation of
type UNKNOWN. That choice was made on the assumption that TEXT would
faithfully preserve the contents of a C string ... but it seems that in
the multibyte world it ain't so. It would not be a huge amount of work
to write a couple more I/O routines and give UNKNOWN its own I/O
behavior.

OTOH, I was surprised to read your message because I had assumed the
damage was being done much further upstream, viz during collection of
the query string by pq_getstr(). Do we need to think twice about that
processing, as well?

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 17:21:42
Message-ID: 3CADDD26.3010404@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
>>INSERT INTO t1(bytea_col) VALUES('characters produced by
>>PQescapebytea'::bytea);
>
>
> Probably that would cause the error to disappear, but it's hardly a
> desirable answer.
>
> I wonder whether this says that TEXT is not a good implementation of
> type UNKNOWN. That choice was made on the assumption that TEXT would
> faithfully preserve the contents of a C string ... but it seems that in
> the multibyte world it ain't so. It would not be a huge amount of work
> to write a couple more I/O routines and give UNKNOWN its own I/O
> behavior.

I could take a look at this. Any guidance other than "faithfully
preserving the contents of a C string"?

>
> OTOH, I was surprised to read your message because I had assumed the
> damage was being done much further upstream, viz during collection of
> the query string by pq_getstr(). Do we need to think twice about that
> processing, as well?

I'll take a look at this as well.

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 18:07:16
Message-ID: 25986.1018030036@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> I could take a look at this. Any guidance other than "faithfully
> preserving the contents of a C string"?

Take textin/textout, remove multibyte awareness? Actually the hard
part is to figure out which of the existing hardwired calls of textin
and textout would need to be replaced by calls to unknownin/unknownout.
I think the assumption UNKNOWN == TEXT has crept into a fair number of
places by now.

regards, tom lane
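
As a rough picture of what "textin/textout without multibyte awareness" could look like, here is a standalone sketch. The struct and function names are hypothetical and the layout is not PostgreSQL's varlena format; the only point is that the bytes of the C string are copied in and out without any encoding check:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-prefixed value; not PostgreSQL's varlena layout. */
typedef struct
{
    size_t  len;
    char   *data;
} RawValueSketch;

/* Input: copy the C string byte for byte, with no encoding verification. */
static RawValueSketch *raw_in_sketch(const char *str)
{
    RawValueSketch *v = malloc(sizeof(RawValueSketch));

    if (v == NULL)
        return NULL;
    v->len = strlen(str);
    v->data = malloc(v->len + 1);
    if (v->data == NULL)
    {
        free(v);
        return NULL;
    }
    memcpy(v->data, str, v->len + 1);
    return v;
}

/* Output: return a fresh NUL-terminated copy of the stored bytes. */
static char *raw_out_sketch(const RawValueSketch *v)
{
    char *s = malloc(v->len + 1);

    if (s == NULL)
        return NULL;
    memcpy(s, v->data, v->len);
    s[v->len] = '\0';
    return s;
}
```

A string containing 0x89 0x50 round-trips unchanged through these two functions, which is the "faithfully preserve the contents of a C string" property discussed above.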


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 18:33:32
Message-ID: 3CADEDFC.508@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
>
> OTOH, I was surprised to read your message because I had assumed the
> damage was being done much further upstream, viz during collection of
> the query string by pq_getstr(). Do we need to think twice about that
> processing, as well?
>

I just looked in pq_getstr() and I see:

#ifdef MULTIBYTE
p = (char *) pg_client_to_server((unsigned char *) s->data, s->len);
if (p != s->data) /* actual conversion has been done? */

and in pg_client_to_server I see:

if (ClientEncoding->encoding == DatabaseEncoding->encoding)
return s;

So I'm guessing that in Tatsuo's case, both client and database encoding
are the same, and therefore the string was passed as-is downstream. I
think you're correct that in a client/database encoding mismatch
scenario, there would be bigger problems. Thoughts on this?

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 18:40:52
Message-ID: 29864.1018032052@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> I think you're correct that in a client/database encoding mismatch
> scenario, there would be bigger problems. Thoughts on this?

This scenario is probably why Tatsuo wants PQescapeBytea to octalize
everything with the high bit set; I'm not sure there's any lesser way
out. Nonetheless, if UNKNOWN conversion introduces additional failures
then it makes sense to fix that.

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 21:53:47
Message-ID: 3CAE1CEB.9030401@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>I think you're correct that in a client/database encoding mismatch
>>scenario, there would be bigger problems. Thoughts on this?
>
>
> This scenario is probably why Tatsuo wants PQescapeBytea to octalize
> everything with the high bit set; I'm not sure there's any lesser way

Yuck! At that point you're no better off than converting to hex (and
worse off than converting to base64) for storage.

SQL99 actually defines BLOB as a binary string literal comprised of an
even number of hexadecimal digits, in single quotes, preceded by "X",
e.g. X'1a43fe'. Should we be looking at implementing the standard
instead of, or in addition to, octalizing? Maybe it is possible to do
this by creating a new datatype, BLOB, which uses new IN/OUT functions,
but otherwise uses the various bytea functions?

> out. Nonetheless, if UNKNOWN conversion introduces additional failures
> then it makes sense to fix that.

I'll follow up on this then.

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 22:10:38
Message-ID: 10498.1018044638@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
>> This scenario is probably why Tatsuo wants PQescapeBytea to octalize
>> everything with the high bit set; I'm not sure there's any lesser way

> Yuck! At that point you're no better off than converting to hex (and
> worse off than converting to base64) for storage.

No; the *storage* is still compact, it's just the I/O representation
that's not.

> SQL99 actually defines BLOB as a binary string literal comprised of an
> even number of hexadecimal digits, in single quotes, preceded by "X",
> e.g. X'1a43fe'. Should we be looking at implementing the standard
> instead of, or in addition to, octalizing?

Perhaps we should cause the system to regard hex-strings as literals of
type bytea. Right now I think they're taken to be integer constants,
which is clearly not per spec.

regards, tom lane
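
For a sense of scale in the I/O representation only, here is a small sketch comparing worst-case text lengths. It assumes the five-character \\ooo form from the first message for every byte, two hex digits per byte, and standard padded base64; the function names are made up for this example:

```c
#include <stddef.h>

/* Worst case for octal escaping: every byte becomes the 5 characters \\ooo. */
static size_t octal_escaped_len(size_t nbytes)
{
    return nbytes * 5;
}

/* Hex: two digits per byte. */
static size_t hex_encoded_len(size_t nbytes)
{
    return nbytes * 2;
}

/* Padded base64: four output characters for every three input bytes. */
static size_t base64_encoded_len(size_t nbytes)
{
    return ((nbytes + 2) / 3) * 4;
}
```

For 3000 bytes of all-high-bit data these give 15000, 6000, and 4000 characters respectively. None of this affects the stored size, which stays at the raw byte count.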


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 22:58:41
Message-ID: 3CAE2C21.1030308@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
>> Yuck! At that point you're no better off than converting to hex
>> (and worse off than converting to base64) for storage.
>
>
> No; the *storage* is still compact, it's just the I/O representation
> that's not.

Yeah, I realized that after I pushed send ;)

But still, doesn't that mean roughly twice as much memory usage for each
copy of the string? And I seem to remember Jan saying that each datum
winds up having 4 copies in memory. It ends up impacting the practical
length limit for a bytea value.

>
>
>> SQL99 actually defines BLOB as a binary string literal comprised
>> of an even number of hexadecimal digits, in single quotes,
>> preceded by "X", e.g. X'1a43fe'. Should we be looking at
>> implementing the standard instead of, or in addition to,
>> octalizing?
>
>
> Perhaps we should cause the system to regard hex-strings as literals
> of type bytea. Right now I think they're taken to be integer
> constants, which is clearly not per spec.

Wow. I didn't realize this was possible:

test=# select X'ffff';
?column?
----------
65535
(1 row)

This does clearly conflict with the spec, but what about backward
compatibility? Do you think many people use this capability?

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-05 23:25:03
Message-ID: 10929.1018049103@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> But still, doesn't that mean roughly twice as much memory usage for each
> copy of the string? And I seem to remember Jan saying that each datum
> winds up having 4 copies in memory. It ends up impacting the practical
> length limit for a bytea value.

Well, once the data actually reaches Datum form it'll be in internal
representation, hence compact. I'm not sure how many copies the parser
will make in the process of casting to UNKNOWN and then to bytea, but
I'm not terribly concerned by the above argument.

> Wow. I didn't realize this was possible:

> test=# select X'ffff';
> ?column?
> ----------
> 65535
> (1 row)

> This does clearly conflict with the spec, but what about backward
> compatibility? Do you think many people use this capability?

No idea. I don't think it's documented anywhere, though...

regards, tom lane
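
The two readings of X'ffff' can be put side by side. A minimal sketch, assuming a hypothetical decoder hex_literal_to_bytes_sketch() for the SQL99-style reading (an even number of hex digits, one byte per pair); the integer reading is shown with the standard strtoul():

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper: decode the digits inside X'...' into bytes.
 * Returns the number of bytes written, or -1 on an odd digit count,
 * a non-hex character, or an output buffer that is too small.
 */
static long hex_literal_to_bytes_sketch(const char *digits,
                                        unsigned char *out, size_t outsize)
{
    size_t n = strlen(digits);
    size_t i;

    if (n % 2 != 0 || n / 2 > outsize)
        return -1;

    for (i = 0; i < n; i += 2)
    {
        int hi = hex_digit_value(digits[i]);
        int lo = hex_digit_value(digits[i + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        out[i / 2] = (unsigned char) (hi * 16 + lo);
    }
    return (long) (n / 2);
}
```

Here strtoul("ffff", NULL, 16) yields 65535, the integer reading shown in the psql output above, while the sketch decodes the same digits into the two bytes 0xff 0xff.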


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-06 00:43:08
Message-ID: 20020406094308W.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-patches

> Hmm, so essentially we'd have to convert all codes >= 0x80 to prevent
> them from being mistaken for parts of multibyte sequences?

Yes.

> Ugh, but
> you're probably right. It looks to me like byteaout does the reverse
> already.

As for the new UNKNOWN data type, that seems like a good thing to
me. However, I think a more aggressive solution would be to keep
encoding info in the text data type itself. This would also open the
way to implementing SQL99's CREATE CHARACTER SET stuff. I have been
thinking about this for a while and want to write an RFC in the future (I
need to rethink my idea to accommodate the SCHEMA support you introduced).

BTW, for the 7.2.x tree we need a solution with lesser impact.
For this purpose, I would like to change PQescapeBytea as I stated in
the previous mail. Objection?
--
Tatsuo Ishii


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-07 16:38:41
Message-ID: 3CB07611.80700@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tatsuo Ishii wrote:
> BTW, for the 7.2.x tree we need a solution with lesser impact.
> For this purpose, I would like to change PQescapeBytea as I stated in
> the previous mail. Objection?
> --
> Tatsuo Ishii

No objection here, but can we wrap the change in #ifdef MULTIBYTE so
there's no effect for people who don't use MULTIBYTE?

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-07 17:10:53
Message-ID: 1787.1018199453@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> No objection here, but can we wrap the change in #ifdef MULTIBYTE so
> there's no effect for people who don't use MULTIBYTE?

That opens up the standard set of issues about "what if your server is
MULTIBYTE but your libpq is not?" It seems risky to me.

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: unknownin/out patch (was [HACKERS] PQescapeBytea is not multibyte aware)
Date: 2002-04-07 23:18:57
Message-ID: 3CB0D3E1.4010508@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>I think you're correct that in a client/database encoding mismatch
>>scenario, there would be bigger problems. Thoughts on this?
>
>
> This scenario is probably why Tatsuo wants PQescapeBytea to octalize
> everything with the high bit set; I'm not sure there's any lesser way
> out. Nonetheless, if UNKNOWN conversion introduces additional failures
> then it makes sense to fix that.
>
> regards, tom lane
>

Here's a patch to add unknownin/unknownout support. I also poked around
looking for places that assume UNKNOWN == TEXT. One of those was the
"SET" type in pg_type.h, which was using textin/textout. This one I took
care of in this patch. The other suspicious place was in
string_to_datum (which is defined in both selfuncs.c and indxpath.c). I
wasn't too sure about those, so I left them be.

Regression tests all pass with the exception of horology, which also
fails on CVS tip. It looks like that is a daylight savings time issue
though.

Also as a side note, I can't get make check to get past initdb if I
configure with --enable-multibyte on CVS tip. Is there a known problem
or am I just being clueless . . .wait, let's qualify that -- am I being
clueless on this one issue? ;-)

Joe

Attachment Content-Type Size
unk.r0.patch text/plain 6.6 KB

From: Joe Conway <mail(at)joeconway(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-08 02:43:20
Message-ID: 3CB103C8.1000909@joeconway.com
Lists: pgsql-hackers pgsql-patches

Joe Conway wrote:
> Here's a patch to add unknownin/unknownout support. I also poked around
> looking for places that assume UNKNOWN == TEXT. One of those was the
> "SET" type in pg_type.h, which was using textin/textout. This one I took
> care of in this patch. The other suspicious place was in
> string_to_datum (which is defined in both selfuncs.c and indxpath.c). I
> wasn't too sure about those, so I left them be.
>

I found three other suspicious spots in the source, where UNKNOWN ==
TEXT is assumed. The first looks like it needs to be changed for sure,
the other two I'm less sure about. Feedback would be most appreciated
(on this and the patch itself).

(line numbers based on CVS from earlier today)
parse_node.c, line 428
parse_coerce.c, line 85
parse_coerce.c, line 403

Joe


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: mail(at)joeconway(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PQescapeBytea is not multibyte aware
Date: 2002-04-08 03:52:02
Message-ID: 20020408125202S.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-patches

> Joe Conway <mail(at)joeconway(dot)com> writes:
> > No objection here, but can we wrap the change in #ifdef MULTIBYTE so
> > there's no effect for people who don't use MULTIBYTE?
>
> That opens up the standard set of issues about "what if your server is
> MULTIBYTE but your libpq is not?" It seems risky to me.

I have committed changes to the current source (without MULTIBYTE
ifdefs). I will change the 7.2-stable tree soon.

I also added some careful handling of memory allocation errors and
changed some questionable code, for example the use of the raw ASCII
value 92 instead of '\\'.
--
Tatsuo Ishii


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is not multibyte aware)
Date: 2002-04-08 04:40:56
Message-ID: 5671.1018240856@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Regression tests all pass with the exception of horology, which also
> fails on CVS tip. It looks like that is a daylight savings time issue
> though.

Yup, ye olde DST-transition-makes-for-funny-day-length issue. This is
mentioned in the docs at
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/regress-evaluation.html#AEN18363
although I see the troublesome tests are now in horology not timestamp.
(Docs fixed...)

> Also as a side note, I can't get make check to get past initdb if I
> configure with --enable-multibyte on CVS tip. Is there a known problem

News to me --- anyone else seeing that?

regards, tom lane


From: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joe Conway" <mail(at)joeconway(dot)com>
Cc: "Tatsuo Ishii" <t-ishii(at)sra(dot)co(dot)jp>, <pgsql-patches(at)postgresql(dot)org>, "Thomas Lockhart" <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is not multibyte aware)
Date: 2002-04-08 04:47:36
Message-ID: GNELIHDDFBOCMGBFGEFOOEADCCAA.chriskl@familyhealth.com.au
Lists: pgsql-hackers pgsql-patches

> Yup, ye olde DST-transition-makes-for-funny-day-length issue. This is
> mentioned in the docs at
> http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/regress-evaluation.html#AEN18363
> although I see the troublesome tests are now in horology not timestamp.
> (Docs fixed...)
>
> > Also as a side note, I can't get make check to get past initdb if I
> > configure with --enable-multibyte on CVS tip. Is there a known problem
>
> News to me --- anyone else seeing that?

I get initdb failures all the time when building CVS. You need to gmake
clean to fix some things. Try doing a gmake clean && gmake check

Chris


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is not multibyte aware)
Date: 2002-04-08 05:25:21
Message-ID: 15592.1018243521@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

>> Also as a side note, I can't get make check to get past initdb if I
>> configure with --enable-multibyte on CVS tip. Is there a known problem

> News to me --- anyone else seeing that?

FWIW, CVS tip with --enable-multibyte builds and passes regression tests
here (modulo the horology thing). I concur with Chris' suggestion that
you may not have done a clean reconfiguration. If you're not using
--enable-depend then a "make clean" is certainly needed.

regards, tom lane


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: mail(at)joeconway(dot)com, pgsql-patches(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is not
Date: 2002-04-08 05:45:13
Message-ID: 20020408144513I.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-patches

> >> Also as a side note, I can't get make check to get past initdb if I
> >> configure with --enable-multibyte on CVS tip. Is there a known problem
>
> > News to me --- anyone else seeing that?
>
> FWIW, CVS tip with --enable-multibyte builds and passes regression tests
> here (modulo the horology thing). I concur with Chris' suggestion that
> you may not have done a clean reconfiguration. If you're not using
> --enable-depend then a "make clean" is certainly needed.

Try a multibyte encoding database. For example,

$ createdb -E EUC_JP test
$ psql -c "SELECT SUBSTRING('1234567890' FROM 3)" test
substring
-----------
3456
(1 row)

Apparently this is wrong: with no FOR clause, the result should be
34567890.
--
Tatsuo Ishii


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-08 06:06:43
Message-ID: 3CB13373.1080809@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> FWIW, CVS tip with --enable-multibyte builds and passes regression tests
> here (modulo the horology thing). I concur with Chris' suggestion that
> you may not have done a clean reconfiguration. If you're not using
> --enable-depend then a "make clean" is certainly needed.
>

--enable-depend did the trick.

Patch now passes all tests but horology with:

./configure --enable-locale --enable-debug --enable-cassert
--enable-multibyte --enable-syslog --enable-nls --enable-depend

Thanks!

Joe


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-patches(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-08 23:46:35
Message-ID: 3CB22BDB.7030000@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tatsuo Ishii wrote:
>
>
> Try a multibyte encoding database. For example,
>
> $ createdb -E EUC_JP test
> $ psql -c 'SELECT SUBSTRING('1234567890' FROM 3)' test
> substring
> -----------
> 3456
> (1 row)
>
> Apparently this is wrong.
> --
> Tatsuo Ishii

This problem exists in CVS tip *without* the unknownin/out patch:

# psql -U postgres testjp
Welcome to psql, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

testjp=# SELECT SUBSTRING('1234567890' FROM 3);
substring
-----------
3456
(1 row)

testjp=# select * from pg_type where typname = 'unknown';
typname       | unknown
typnamespace  | 11
typowner      | 1
typlen        | -1
typprtlen     | -1
typbyval      | f
typtype       | b
typisdefined  | t
typdelim      | ,
typrelid      | 0
typelem       | 0
typinput      | textin
typoutput     | textout
typreceive    | textin
typsend       | textout
typalign      | i
typstorage    | p
typnotnull    | f
typbasetype   | 0
typtypmod     | -1
typndims      | 0
typdefaultbin |
typdefault    |
(1 row)

This is built from source with:
#define CATALOG_VERSION_NO 200204031

./configure --enable-locale --enable-debug --enable-cassert
--enable-multibyte --enable-syslog --enable-nls --enable-depend

Joe


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: mail(at)joeconway(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-patches(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-09 01:23:42
Message-ID: 20020409102342X.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-patches

> Tatsuo Ishii wrote:
> >
> >
> > Try a multibyte encoding database. For example,
> >
> > $ createdb -E EUC_JP test
> > $ psql -c 'SELECT SUBSTRING('1234567890' FROM 3)' test
> > substring
> > -----------
> > 3456
> > (1 row)
> >
> > Apparently this is wrong.
> > --
> > Tatsuo Ishii
>
> This problem exists in CVS tip *without* the unknownin/out patch:

Sure. That has been broken for a while.
--
Tatsuo Ishii


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: mail(at)joeconway(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-patches(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-09 05:08:56
Message-ID: 20020409140856E.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-patches

> > Tatsuo Ishii wrote:
> > >
> > >
> > > Try a multibyte encoding database. For example,
> > >
> > > $ createdb -E EUC_JP test
> > > $ psql -c 'SELECT SUBSTRING('1234567890' FROM 3)' test
> > > substring
> > > -----------
> > > 3456
> > > (1 row)
> > >
> > > Apparently this is wrong.
> > > --
> > > Tatsuo Ishii
> >
> > This problem exists in CVS tip *without* the unknownin/out patch:
>
> Sure. That has been broken for a while.

I guess this actually happened in 1.79 of varlena.c:

---------------------------------------------------------------------------
revision 1.79
date: 2002/03/05 05:33:19; author: momjian; state: Exp; lines: +45 -42
I attach a version of my toast-slicing patch, against current CVS
(current as of a few hours ago.)

This patch:

1. Adds PG_GETARG_xxx_P_SLICE() macros and associated support routines.

2. Adds routines in src/backend/access/tuptoaster.c for fetching only
necessary chunks of a toasted value. (Modelled on latest changes to
assume chunks are returned in order).

3. Amends text_substr and bytea_substr to use new methods. It now
handles multibyte cases -and should still lead to a performance
improvement in the multibyte case where the substring is near the
beginning of the string.

4. Added new command: ALTER TABLE tabname ALTER COLUMN colname SET
STORAGE {PLAIN | EXTERNAL | EXTENDED | MAIN} to parser and documented in
alter-table.sgml. (NB I used ColId as the item type for the storage
mode string, rather than a new production - I hope this makes sense!).
All this does is set attstorage for the specified column.

5. AlterTableAlterColumnStatistics is now AlterTableAlterColumnFlags and
handles both statistics and storage (it uses the subtype code to
distinguish). The previous version of my patch also re-arranged other
code in backend/commands/command.c but I have dropped that from this
patch. (I plan to return to it separately).

6. Documented new macros (and also the PG_GETARG_xxx_P_COPY macros) in
xfunc.sgml. ref/alter_table.sgml also contains documentation for ALTER
COLUMN SET STORAGE.

John Gray
---------------------------------------------------------------------------


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-patches(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-09 05:37:59
Message-ID: 3CB27E37.9010704@joeconway.com
Lists: pgsql-hackers pgsql-patches

Tatsuo Ishii wrote:
>>> Tatsuo Ishii wrote:
>>>
>>>>
>>>> Try a multibyte encoding database. For example,
>>>>
>>>> $ createdb -E EUC_JP test
>>>> $ psql -c 'SELECT SUBSTRING('1234567890' FROM 3)' test
>>>> substring
>>>> -----------
>>>> 3456
>>>> (1 row)
>>>>
>>>> Apparently this is wrong.
>>>> --
>>>> Tatsuo Ishii
>>>
>>> This problem exists in CVS tip *without* the unknownin/out
>>> patch:
>>
>> Sure. That has been broken for a while.
>
>
> I guess this actually happened in 1.79 of varlena.c:
>
Yes, I was just looking at that also. It doesn't consider the case of n
= -1 for MB. See the lines:

#ifdef MULTIBYTE
    eml = pg_database_encoding_max_length ();

    if (eml > 1)
    {
        sm = 0;
        sn = (m + n) * eml + 3;
    }
#endif

When n = -1 this does the wrong thing. And also a few lines later:

#ifdef MULTIBYTE
len = pg_mbstrlen_with_len (VARDATA (string), sn - 3);

I think both places need to test for n = -1. Do you agree?

Joe
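
The arithmetic behind the n = -1 case can be sketched in isolation. This is only an illustration, not code from varlena.c: the function names are hypothetical, eml is assumed to be 3 for EUC_JP, and the guard shown is just one possible shape for the test, not necessarily what the attached patch does. With m = 3, n = -1 and eml = 3 the quoted expression gives 9, so sn - 3 covers only 6 bytes of '1234567890', which is consistent with the truncated "3456" reported above.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the quoted expression: number of bytes to
 * fetch for a substring starting at character m (1-based) with a length of
 * n characters, where eml is the maximum number of bytes per character.
 * n == -1 is the "to end of string" marker and is NOT handled here.
 */
static int
slice_bytes_as_quoted(int m, int n, int eml)
{
    assert(m >= 1 && eml >= 1);
    return (m + n) * eml + 3;   /* n == -1 shrinks the slice instead of extending it */
}

/*
 * One possible guard (an assumption, not the committed fix): report -1,
 * meaning "fetch the whole value", when the caller asked for the rest of
 * the string.
 */
static int
slice_bytes_guarded(int m, int n, int eml)
{
    assert(m >= 1 && eml >= 1);
    if (n == -1)
        return -1;
    return (m + n) * eml + 3;
}
```

Calling slice_bytes_as_quoted(3, -1, 3) yields 9, while slice_bytes_guarded(3, -1, 3) yields -1 and leaves the explicit-length case unchanged.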


From: Joe Conway <mail(at)joeconway(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-patches(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-09 05:57:47
Message-ID: 3CB282DB.4050708@joeconway.com
Lists: pgsql-hackers pgsql-patches

Joe Conway wrote:
> Tatsuo Ishii wrote:
> >>> Tatsuo Ishii wrote:
> >>>
> >>>>
> >>>> Try a multibyte encoding database. For example,
> >>>>
> >>>> $ createdb -E EUC_JP test
> >>>> $ psql -c 'SELECT SUBSTRING('1234567890' FROM 3)' test
> >>>> substring
> >>>> -----------
> >>>> 3456
> >>>> (1 row)
> >>>>
> >>>> Apparently this is wrong.
> >>>> --
> >>>> Tatsuo Ishii
> >>>
> >>> This problem exists in CVS tip *without* the unknownin/out
> >>> patch:
> >>
> >> Sure. That has been broken for a while.
> >
> >
> > I guess this actually happened in 1.79 of varlena.c:
> >
> Yes, I was just looking at that also. It doesn't consider the case of n
> = -1 for MB. See the lines:
>
> #ifdef MULTIBYTE
> eml = pg_database_encoding_max_length ();
>
> if (eml > 1)
> {
> sm = 0;
> sn = (m + n) * eml + 3;
> }
> #endif
>
> When n = -1 this does the wrong thing. And also a few lines later:
>
> #ifdef MULTIBYTE
> len = pg_mbstrlen_with_len (VARDATA (string), sn - 3);
>
> I think both places need to test for n = -1. Do you agree?
>
>
> Joe
>

The attached patch should fix the bug reported by Tatsuo.

# psql -U postgres testjp
Welcome to psql, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit

testjp=# SELECT SUBSTRING('1234567890' FROM 3);
substring
------------
34567890
(1 row)

Joe

Attachment Content-Type Size
mb_substr.patch text/plain 1.1 KB

From: John Gray <jgray(at)azuli(dot)co(dot)uk>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-patches(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-09 09:11:02
Message-ID: 1018343465.3587.56.camel@adzuki
Lists: pgsql-hackers pgsql-patches

On Tue, 2002-04-09 at 06:57, Joe Conway wrote:
[snipped]
> > Yes, I was just looking at that also. It doesn't consider the case of n
> > = -1 for MB. See the lines:
> >
> > #ifdef MULTIBYTE
> > eml = pg_database_encoding_max_length ();
> >
> > if (eml > 1)
> > {
> > sm = 0;
> > sn = (m + n) * eml + 3;
> > }
> > #endif
> >
> > When n = -1 this does the wrong thing. And also a few lines later:
> >
> > #ifdef MULTIBYTE
> > len = pg_mbstrlen_with_len (VARDATA (string), sn - 3);
> >
> > I think both places need to test for n = -1. Do you agree?
> >

Sorry folks! I hadn't thought through the logic of that in the n = -1
and multibyte case. The patch looks OK to me.

John


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joe Conway <mail(at)joeconway(dot)com>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, <pgsql-patches(at)postgresql(dot)org>, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-09 18:13:19
Message-ID: Pine.LNX.4.30.0204091410380.685-100000@peter.localdomain
Lists: pgsql-hackers pgsql-patches

Tom Lane writes:

> FWIW, CVS tip with --enable-multibyte builds and passes regression tests
> here (modulo the horology thing). I concur with Chris' suggestion that
> you may not have done a clean reconfiguration. If you're not using
> --enable-depend then a "make clean" is certainly needed.

Maybe we should turn on dependency tracking by default? This is about the
(enough + 1)th time I'm seeing this.

--
Peter Eisentraut peter_e(at)gmx(dot)net


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-09 18:23:47
Message-ID: 200204091823.g39INlV27392@candle.pha.pa.us
Lists: pgsql-hackers pgsql-patches

Peter Eisentraut wrote:
> Tom Lane writes:
>
> > FWIW, CVS tip with --enable-multibyte builds and passes regression tests
> > here (modulo the horology thing). I concur with Chris' suggestion that
> > you may not have done a clean reconfiguration. If you're not using
> > --enable-depend then a "make clean" is certainly needed.
>
> Maybe we should turn on dependency tracking by default? This is about the
> (enough + 1)th time I'm seeing this.

What is the downside to turning it on? I can't think of one.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: 2002-04-09 19:46:33
Message-ID: 9853.1018381593@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Peter Eisentraut wrote:
>> Maybe we should turn on dependency tracking by default? This is about the
>> (enough + 1)th time I'm seeing this.

> What is the downside to turning it on? I can't think of one.

Well, we'll still see the same kinds of reports from developers using
non-GCC compilers (surely there are some) ... so enable-depend isn't
going to magically make the issue go away.

Personally I tend to rebuild from "make clean" whenever I've done
anything nontrivial, and certainly after a CVS sync; so I have no use
for enable-depend. But as long as I can turn it off, I don't object
to changing the default.

regards, tom lane


From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: mail(at)joeconway(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-patches(at)postgresql(dot)org, lockhart(at)fourpalms(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] unknownin/out patch (was PQescapeBytea is
Date: 2002-04-15 07:50:23
Message-ID: 20020415165023R.t-ishii@sra.co.jp
Lists: pgsql-hackers pgsql-patches

I'm about to commit your patches with a small fix.
--
Tatsuo Ishii

From: Joe Conway <mail(at)joeconway(dot)com>
Subject: Re: [PATCHES] unknownin/out patch (was [HACKERS] PQescapeBytea is
Date: Mon, 08 Apr 2002 22:57:47 -0700
Message-ID: <3CB282DB(dot)4050708(at)joeconway(dot)com>

> Joe Conway wrote:
> > Tatsuo Ishii wrote:
> > >>> Tatsuo Ishii wrote:
> > >>>
> > >>>>
> > >>>> Try a multibyte encoding database. For example,
> > >>>>
> > >>>> $ createdb -E EUC_JP test
> > >>>> $ psql -c 'SELECT SUBSTRING('1234567890' FROM 3)' test
> > >>>> substring
> > >>>> -----------
> > >>>> 3456
> > >>>> (1 row)
> > >>>>
> > >>>> Apparently this is wrong.
> > >>>> --
> > >>>> Tatsuo Ishii
> > >>>
> > >>> This problem exists in CVS tip *without* the unknownin/out
> > >>> patch:
> > >>
> > >> Sure. That has been broken for a while.
> > >
> > >
> > > I guess this actually happened in 1.79 of varlena.c:
> > >
> > Yes, I was just looking at that also. It doesn't consider the case of n
> > = -1 for MB. See the lines:
> >
> > #ifdef MULTIBYTE
> > eml = pg_database_encoding_max_length ();
> >
> > if (eml > 1)
> > {
> > sm = 0;
> > sn = (m + n) * eml + 3;
> > }
> > #endif
> >
> > When n = -1 this does the wrong thing. And also a few lines later:
> >
> > #ifdef MULTIBYTE
> > len = pg_mbstrlen_with_len (VARDATA (string), sn - 3);
> >
> > I think both places need to test for n = -1. Do you agree?
> >
> >
> > Joe
> >
>
> The attached patch should fix the bug reported by Tatsuo.
>
> # psql -U postgres testjp
> Welcome to psql, the PostgreSQL interactive terminal.
>
> Type: \copyright for distribution terms
> \h for help with SQL commands
> \? for help on internal slash commands
> \g or terminate with semicolon to execute query
> \q to quit
>
> testjp=# SELECT SUBSTRING('1234567890' FROM 3);
> substring
> ------------
> 34567890
> (1 row)
>
> Joe


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is not
Date: 2002-04-18 13:30:13
Message-ID: 200204181330.g3IDUDw27192@candle.pha.pa.us
Lists: pgsql-hackers pgsql-patches


Your patch has been added to the PostgreSQL unapplied patches list at:

http://candle.pha.pa.us/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.

---------------------------------------------------------------------------

Joe Conway wrote:
> Tom Lane wrote:
> > Joe Conway <mail(at)joeconway(dot)com> writes:
> >
> >>I think you're correct that in a client/database encoding mismatch
> >>scenario, there would be bigger problems. Thoughts on this?
> >
> >
> > This scenario is probably why Tatsuo wants PQescapeBytea to octalize
> > everything with the high bit set; I'm not sure there's any lesser way
> > out. Nonetheless, if UNKNOWN conversion introduces additional failures
> > then it makes sense to fix that.
> >
> > regards, tom lane
> >
>
> Here's a patch to add unknownin/unknownout support. I also poked around
> looking for places that assume UNKNOWN == TEXT. One of those was the
> "SET" type in pg_type.h, which was using textin/textout. This one I took
> care of in this patch. The other suspicious place was in
> string_to_datum (which is defined in both selfuncs.c and indxpath.c). I
> wasn't too sure about those, so I left them be.
>
> Regression tests all pass with the exception of horology, which also
> fails on CVS tip. It looks like that is a daylight savings time issue
> though.
>
> Also as a side note, I can't get make check to get past initdb if I
> configure with --enable-multibyte on CVS tip. Is there a known problem
> or am I just being clueless . . .wait, let's qualify that -- am I being
> clueless on this one issue? ;-)
>
> Joe

> diff -Ncr pgsql.orig/src/backend/utils/adt/varlena.c pgsql/src/backend/utils/adt/varlena.c
> *** pgsql.orig/src/backend/utils/adt/varlena.c Sun Apr 7 10:21:25 2002
> --- pgsql/src/backend/utils/adt/varlena.c Sun Apr 7 11:44:54 2002
> ***************
> *** 228,233 ****
> --- 228,273 ----
> }
>
>
> + /*
> + * unknownin - converts "..." to internal representation
> + */
> + Datum
> + unknownin(PG_FUNCTION_ARGS)
> + {
> + char *inputStr = PG_GETARG_CSTRING(0);
> + unknown *result;
> + int len;
> +
> + len = strlen(inputStr) + VARHDRSZ;
> +
> + result = (unknown *) palloc(len);
> + VARATT_SIZEP(result) = len;
> +
> + memcpy(VARDATA(result), inputStr, len - VARHDRSZ);
> +
> + PG_RETURN_UNKNOWN_P(result);
> + }
> +
> +
> + /*
> + * unknownout - converts internal representation to "..."
> + */
> + Datum
> + unknownout(PG_FUNCTION_ARGS)
> + {
> + unknown *t = PG_GETARG_UNKNOWN_P(0);
> + int len;
> + char *result;
> +
> + len = VARSIZE(t) - VARHDRSZ;
> + result = (char *) palloc(len + 1);
> + memcpy(result, VARDATA(t), len);
> + result[len] = '\0';
> +
> + PG_RETURN_CSTRING(result);
> + }
> +
> +
> /* ========== PUBLIC ROUTINES ========== */
>
> /*
> diff -Ncr pgsql.orig/src/include/c.h pgsql/src/include/c.h
> *** pgsql.orig/src/include/c.h Sun Apr 7 10:21:29 2002
> --- pgsql/src/include/c.h Sun Apr 7 11:40:59 2002
> ***************
> *** 389,394 ****
> --- 389,395 ----
> */
> typedef struct varlena bytea;
> typedef struct varlena text;
> + typedef struct varlena unknown;
> typedef struct varlena BpChar; /* blank-padded char, ie SQL char(n) */
> typedef struct varlena VarChar; /* var-length char, ie SQL varchar(n) */
>
> diff -Ncr pgsql.orig/src/include/catalog/pg_proc.h pgsql/src/include/catalog/pg_proc.h
> *** pgsql.orig/src/include/catalog/pg_proc.h Sun Apr 7 10:21:29 2002
> --- pgsql/src/include/catalog/pg_proc.h Sun Apr 7 11:56:09 2002
> ***************
> *** 235,240 ****
> --- 235,245 ----
> DATA(insert OID = 108 ( scalargtjoinsel PGNSP PGUID 12 f t t s 3 f 701 "0 26 0" 100 0 0 100 scalargtjoinsel - _null_ ));
> DESCR("join selectivity of > and related operators on scalar datatypes");
>
> + DATA(insert OID = 109 ( unknownin PGNSP PGUID 12 f t t i 1 f 705 "0" 100 0 0 100 unknownin - _null_ ));
> + DESCR("(internal)");
> + DATA(insert OID = 110 ( unknownout PGNSP PGUID 12 f t t i 1 f 23 "0" 100 0 0 100 unknownout - _null_ ));
> + DESCR("(internal)");
> +
> DATA(insert OID = 112 ( text PGNSP PGUID 12 f t t i 1 f 25 "23" 100 0 0 100 int4_text - _null_ ));
> DESCR("convert int4 to text");
> DATA(insert OID = 113 ( text PGNSP PGUID 12 f t t i 1 f 25 "21" 100 0 0 100 int2_text - _null_ ));
> diff -Ncr pgsql.orig/src/include/catalog/pg_type.h pgsql/src/include/catalog/pg_type.h
> *** pgsql.orig/src/include/catalog/pg_type.h Sun Apr 7 10:21:29 2002
> --- pgsql/src/include/catalog/pg_type.h Sun Apr 7 11:57:36 2002
> ***************
> *** 302,308 ****
> DESCR("array of INDEX_MAX_KEYS oids, used in system tables");
> #define OIDVECTOROID 30
>
> ! DATA(insert OID = 32 ( SET PGNSP PGUID -1 -1 f b t \054 0 0 textin textout textin textout i p f 0 -1 0 _null_ _null_ ));
> DESCR("set of tuples");
>
> DATA(insert OID = 71 ( pg_type PGNSP PGUID 4 4 t c t \054 1247 0 int4in int4out int4in int4out i p f 0 -1 0 _null_ _null_ ));
> --- 302,308 ----
> DESCR("array of INDEX_MAX_KEYS oids, used in system tables");
> #define OIDVECTOROID 30
>
> ! DATA(insert OID = 32 ( SET PGNSP PGUID -1 -1 f b t \054 0 0 unknownin unknownout unknownin unknownout i p f 0 -1 0 _null_ _null_ ));
> DESCR("set of tuples");
>
> DATA(insert OID = 71 ( pg_type PGNSP PGUID 4 4 t c t \054 1247 0 int4in int4out int4in int4out i p f 0 -1 0 _null_ _null_ ));
> ***************
> *** 366,372 ****
> DATA(insert OID = 704 ( tinterval PGNSP PGUID 12 47 f b t \054 0 0 tintervalin tintervalout tintervalin tintervalout i p f 0 -1 0 _null_ _null_ ));
> DESCR("(abstime,abstime), time interval");
> #define TINTERVALOID 704
> ! DATA(insert OID = 705 ( unknown PGNSP PGUID -1 -1 f b t \054 0 0 textin textout textin textout i p f 0 -1 0 _null_ _null_ ));
> DESCR("");
> #define UNKNOWNOID 705
>
> --- 366,372 ----
> DATA(insert OID = 704 ( tinterval PGNSP PGUID 12 47 f b t \054 0 0 tintervalin tintervalout tintervalin tintervalout i p f 0 -1 0 _null_ _null_ ));
> DESCR("(abstime,abstime), time interval");
> #define TINTERVALOID 704
> ! DATA(insert OID = 705 ( unknown PGNSP PGUID -1 -1 f b t \054 0 0 unknownin unknownout unknownin unknownout i p f 0 -1 0 _null_ _null_ ));
> DESCR("");
> #define UNKNOWNOID 705
>
> diff -Ncr pgsql.orig/src/include/fmgr.h pgsql/src/include/fmgr.h
> *** pgsql.orig/src/include/fmgr.h Sun Apr 7 10:21:29 2002
> --- pgsql/src/include/fmgr.h Sun Apr 7 12:11:30 2002
> ***************
> *** 185,190 ****
> --- 185,191 ----
> /* DatumGetFoo macros for varlena types will typically look like this: */
> #define DatumGetByteaP(X) ((bytea *) PG_DETOAST_DATUM(X))
> #define DatumGetTextP(X) ((text *) PG_DETOAST_DATUM(X))
> + #define DatumGetUnknownP(X) ((unknown *) PG_DETOAST_DATUM(X))
> #define DatumGetBpCharP(X) ((BpChar *) PG_DETOAST_DATUM(X))
> #define DatumGetVarCharP(X) ((VarChar *) PG_DETOAST_DATUM(X))
> /* And we also offer variants that return an OK-to-write copy */
> ***************
> *** 200,205 ****
> --- 201,207 ----
> /* GETARG macros for varlena types will typically look like this: */
> #define PG_GETARG_BYTEA_P(n) DatumGetByteaP(PG_GETARG_DATUM(n))
> #define PG_GETARG_TEXT_P(n) DatumGetTextP(PG_GETARG_DATUM(n))
> + #define PG_GETARG_UNKNOWN_P(n) DatumGetUnknownP(PG_GETARG_DATUM(n))
> #define PG_GETARG_BPCHAR_P(n) DatumGetBpCharP(PG_GETARG_DATUM(n))
> #define PG_GETARG_VARCHAR_P(n) DatumGetVarCharP(PG_GETARG_DATUM(n))
> /* And we also offer variants that return an OK-to-write copy */
> ***************
> *** 239,244 ****
> --- 241,247 ----
> /* RETURN macros for other pass-by-ref types will typically look like this: */
> #define PG_RETURN_BYTEA_P(x) PG_RETURN_POINTER(x)
> #define PG_RETURN_TEXT_P(x) PG_RETURN_POINTER(x)
> + #define PG_RETURN_UNKNOWN_P(x) PG_RETURN_POINTER(x)
> #define PG_RETURN_BPCHAR_P(x) PG_RETURN_POINTER(x)
> #define PG_RETURN_VARCHAR_P(x) PG_RETURN_POINTER(x)
>
> diff -Ncr pgsql.orig/src/include/utils/builtins.h pgsql/src/include/utils/builtins.h
> *** pgsql.orig/src/include/utils/builtins.h Sun Apr 7 10:21:29 2002
> --- pgsql/src/include/utils/builtins.h Sun Apr 7 12:26:17 2002
> ***************
> *** 414,419 ****
> --- 414,422 ----
> extern bool SplitIdentifierString(char *rawstring, char separator,
> List **namelist);
>
> + extern Datum unknownin(PG_FUNCTION_ARGS);
> + extern Datum unknownout(PG_FUNCTION_ARGS);
> +
> extern Datum byteain(PG_FUNCTION_ARGS);
> extern Datum byteaout(PG_FUNCTION_ARGS);
> extern Datum byteaoctetlen(PG_FUNCTION_ARGS);

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/users-lounge/docs/faq.html

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
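
Tom's quoted remark about octalizing every byte with the high bit set can be illustrated with a small sketch. It is not libpq's PQescapeBytea: the function name is hypothetical, and it deliberately ignores the other characters a real escaper has to handle (quotes, backslashes, low control bytes). It only shows how a byte such as 0x89 becomes the '\\211' form mentioned at the start of the thread, so that it can no longer be read as the lead byte of a multibyte character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy src[0..len) into dst, replacing every byte
 * >= 0x80 with a five-character "\\ooo" sequence (two backslashes plus
 * three octal digits) and copying all other bytes unchanged.
 * dst must have room for 5 * len + 1 bytes.
 * Returns the number of characters written, excluding the trailing NUL.
 */
static size_t
octalize_high_bytes(const unsigned char *src, size_t len, char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < len; i++)
    {
        if (src[i] >= 0x80)
        {
            /* 5 visible characters plus the NUL that snprintf appends */
            snprintf(dst + out, 6, "\\\\%03o", (unsigned int) src[i]);
            out += 5;
        }
        else
            dst[out++] = (char) src[i];
    }
    dst[out] = '\0';
    return out;
}
```

Feeding it the two bytes 0x89 0x50 from the original report produces the six characters \\211P.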


From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, Thomas Lockhart <lockhart(at)fourpalms(dot)org>
Subject: Re: unknownin/out patch (was [HACKERS] PQescapeBytea is not
Date: 2002-04-24 02:13:06
Message-ID: 200204240213.g3O2D6q20452@candle.pha.pa.us
Lists: pgsql-hackers pgsql-patches


Patch applied. Thanks.

Catalog version updated.


--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: unknownin/out patch
Date: 2002-04-25 02:55:35
Message-ID: 5034.1019703335@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
>> Here's a patch to add unknownin/unknownout support. I also poked around
>> looking for places that assume UNKNOWN == TEXT. One of those was the
>> "SET" type in pg_type.h, which was using textin/textout. This one I took
>> care of in this patch. The other suspicious place was in
>> string_to_datum (which is defined in both selfuncs.c and indxpath.c). I
>> wasn't too sure about those, so I left them be.

I do not think string_to_datum is a problem. UNKNOWN constants should
never get past the parse analysis stage, so the planner doesn't have to
deal with them. Certainly, it won't be looking at them in the context
of making any interesting selectivity decisions.

> I found three other suspicious spots in the source, where UNKNOWN ==
> TEXT is assumed. The first looks like it needs to be changed for sure,
> the other two I'm less sure about. Feedback would be most appreciated
> (on this and the patch itself).

> (line numbers based on CVS from earlier today)
> parse_node.c
> line 428
> parse_coerce.c
> line 85
> parse_coerce.c
> line 403

The first two of these certainly need to be changed --- these are
exactly the places where we convert literal strings to and (later)
from UNKNOWN-constant representation. The third is okay as-is;
it's a type resolution rule, not code that is touching any literal
constants directly. Will fix these in an upcoming commit.

The patch looks okay otherwise, except that I'm moving the typedef
unknown and the fmgr macros for it into varlena.c. These two routines
are the only routines that will ever need them, so there's no need to
clutter the system-wide headers with 'em. (Also, I am uncomfortable
with having a globally-visible typedef with such a generic name as
"unknown"; strikes me as a recipe for name conflicts.)

regards, tom lane
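
The round trip performed by the unknownin/unknownout pair in the patch can be tried outside the backend with a stand-in structure. This is a minimal sketch under stated assumptions: demo_varlena, DEMO_HDRSZ, demo_in and demo_out are hypothetical names, malloc replaces palloc, and the header is a plain int rather than the real varlena header. What it shows is the same bookkeeping as the patch: the stored size includes the header, the payload is copied without a trailing NUL, and the output side adds the NUL back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a varlena: total size (header included), then data */
typedef struct
{
    int  size;
    char data[];    /* flexible array member, not NUL-terminated */
} demo_varlena;

#define DEMO_HDRSZ offsetof(demo_varlena, data)

/* C string -> length-prefixed value (analogous to unknownin) */
static demo_varlena *
demo_in(const char *str)
{
    size_t        datalen = strlen(str);
    demo_varlena *v = malloc(DEMO_HDRSZ + datalen);

    assert(v != NULL);
    v->size = (int) (DEMO_HDRSZ + datalen);
    memcpy(v->data, str, datalen);      /* the trailing NUL is not stored */
    return v;
}

/* length-prefixed value -> freshly allocated C string (analogous to unknownout) */
static char *
demo_out(const demo_varlena *v)
{
    size_t datalen = (size_t) v->size - DEMO_HDRSZ;
    char  *s = malloc(datalen + 1);

    assert(s != NULL);
    memcpy(s, v->data, datalen);
    s[datalen] = '\0';                  /* restore the terminator */
    return s;
}
```

Keeping the type and helpers static to one file, as in this sketch, matches the point about not exposing a generically named typedef in system-wide headers.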