Identifier case folding notes

Lists: pgsql-hackers
From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Identifier case folding notes
Date: 2008-07-08 17:25:38
Message-ID: 200807081925.40467.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have had some idle thoughts on the issue of identifier case folding. Maybe
we can collect our ideas and inch closer to a solution sometime. Or we
determine that it's useless and impossible, but then I can at least collect
that result in the wiki and point future users to it.

Background: PostgreSQL folds unquoted identifiers to lower case. SQL
specifies folding to upper case, and most other SQL DBMS do it that way. If
an application mixes referring to an object using quoted and unquoted
versions of an identifier, then incompatibilities arise. We have always
stressed to users that one should refer to an object always unquoted or
always quoted. While that remains a good suggestion for a number of reasons,
we have seen occasional complaints that some closed source applications
violate this rule and therefore cannot be run on PostgreSQL.

A number of solutions have been proposed over time, which I summarize here:

1. Change the lexer to fold to upper case, as it is supposed to do according
to the SQL standard. This will break almost everything, because almost all
built-in objects have lower-case names and thus couldn't be referred to
anymore except by quoting. Changing the names of all the internal objects to
upper-case names would involve vast code changes, probably break just as much,
and make everything uglier. So this approach is unworkable.

2. Fold to upper case, but not when referring built-in objects. The lexer
generally doesn't know what a name will refer to, so this is not possible to
implement, at least without resorting to lots of hard-coding or horrid
kludges. Also, a behavior like this will probably create all kinds of weird
inconsistencies, resulting from putting catalog knowledge in the lexer.

3 and 4 -- Two variants of "ignore case altogether":

3. Fold all identifiers to lower case, even quoted ones. This would probably
in fact fix the breakage of many of the above-mentioned problem applications,
and it would in general be very easy to understand for a user. And it could
be implemented in about three lines. One disadvantage is that one could no
longer have objects that have names different only by case, but that is
probably rare and incredibly stupid and can be neglected. The main
disadvantage is that the case of identifiers and in particular column labels
is lost. So applications and programming interfaces that look up result
columns in a case-sensitive manner would fail. And things like SELECT expr AS
"Nice Heading" won't work properly anymore.

4. Compare the "name" data type in a case-insensitive manner. This would
probably address most problem cases. Again, you can't have objects with names
different in case only. One condition to implementing this would be that this
behavior would have be tied down globally at initdb, because it affects system
indexes and shared catalogs. That might be impractical for some, because
you'd need different instances for different behaviors, especially when you
want to host multiple applications or want to port an affected application to
the native PostgreSQL behavior over time.

5. One additional approach I thought of is that you swap the case of
identifiers as you lex them (upper to lower, lower to upper), and then swap
them back when you send them to the client. This needs a small change in the
lexer, one for sending the RowDescription, and support in pg_dump and a few
other places if desired. There will, however, be a number of weird, albeit
self-imposed, side-effects. I have implemented a little test patch for this.
It's weird, but it works in basic ways.

Obviously, no solution will ever work completely. And we probably don't want
such a solution, because it would create two different and incompatible
PostgreSQL universes. If we are aiming for a solution that would allow most
affected applications to hobble along, we would probably serve most users.
Implementing some or all of 3, 4, and 5 would probably achieve that.

Comments?


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-08 17:47:13
Message-ID: 1215539233.4051.974.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Tue, 2008-07-08 at 19:25 +0200, Peter Eisentraut wrote:
> 4. Compare the "name" data type in a case-insensitive manner. This
> would probably address most problem cases. Again, you can't have
> objects with names
> different in case only. One condition to implementing this would be
> that this
> behavior would have be tied down globally at initdb, because it
> affects system
> indexes and shared catalogs. That might be impractical for some,
> because
> you'd need different instances for different behaviors, especially
> when you
> want to host multiple applications or want to port an affected
> application to
> the native PostgreSQL behavior over time.

That sounds the most workable, given your descriptions.

If objects are never different solely by case alone, then you will have
the same index ordering as if you had sent them all to lower case.

Surely it is possible to mix the two approaches somehow?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-08 17:53:25
Message-ID: 11909.1215539605@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> I have had some idle thoughts on the issue of identifier case folding.
> ...
> Comments?

IMHO, practically the only solid argument for changing from the way
we do things now is "to meet the letter of the spec". The various sorts
of gamesmanship you list would most definitely not meet the letter of
the spec; between that and the inevitability of breaking some apps,
I'm inclined to reject them all on sight.

What I think would perhaps be worth investigating is a compile-time
(or at latest initdb-time) option that flips the case folding behavior
to SQL-spec-compliant and also changes all the built-in catalog entries
to upper case. We would then have a solution we could offer to people
who really need to run apps that depend on SQL-spec case folding ...
and if the upper case hurts their eyes, or breaks some other apps that
they wish they could run in the same DB, well it's their problem.

Of course there would be large amounts of work to try to make psql,
pg_dump, etc behave as nicely as possible with either case-folding rule,
but it doesn't strike me as being so obviously "unworkable" as to be
dismissed at once.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Identifier case folding notes
Date: 2008-07-08 18:18:06
Message-ID: 200807082018.06813.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Dienstag, 8. Juli 2008 schrieb Tom Lane:
> IMHO, practically the only solid argument for changing from the way
> we do things now is "to meet the letter of the spec".

Well no. As I have mentioned, there have actually been occasional complaints
by people who can't run their code generated by closed-source applications,
because they handle the case differently.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-08 18:23:15
Message-ID: 12388.1215541395@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Am Dienstag, 8. Juli 2008 schrieb Tom Lane:
>> IMHO, practically the only solid argument for changing from the way
>> we do things now is "to meet the letter of the spec".

> Well no. As I have mentioned, there have actually been occasional complaints
> by people who can't run their code generated by closed-source applications,
> because they handle the case differently.

Sure, otherwise we wouldn't really be worrying about this. But if
someone comes to us and says "this closed source app requires some
weird non-spec-compliant case folding rule, please make Postgres
do that", we're going to say no. Their argument only has weight
if they say their app expects the SQL-spec behavior.

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: Identifier case folding notes
Date: 2008-07-08 21:04:51
Message-ID: 200807081404.52673.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

> IMHO, practically the only solid argument for changing from the way
> we do things now is "to meet the letter of the spec". The various sorts
> of gamesmanship you list would most definitely not meet the letter of
> the spec; between that and the inevitability of breaking some apps,
> I'm inclined to reject them all on sight.

Actually, there are a number of *very* popular database tools, particularly
in the Java world (such as Netbeans and BIRT) which do mix quoted and
unquoted identifiers. In general, users of those tools reject PostgreSQL
as "broken" for our nonstandard behavoir rather than trying to work around
it.

So it's not just a standards issue; this problem really *is* hurting us in
adoption.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-08 21:20:29
Message-ID: 200807082320.29530.andreak@officenet.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tuesday 08 July 2008 23:04:51 Josh Berkus wrote:
> Tom,
>
> > IMHO, practically the only solid argument for changing from the way
> > we do things now is "to meet the letter of the spec". The various sorts
> > of gamesmanship you list would most definitely not meet the letter of
> > the spec; between that and the inevitability of breaking some apps,
> > I'm inclined to reject them all on sight.
>
> Actually, there are a number of *very* popular database tools, particularly
> in the Java world (such as Netbeans and BIRT) which do mix quoted and
> unquoted identifiers. In general, users of those tools reject PostgreSQL
> as "broken" for our nonstandard behavoir rather than trying to work around
> it.
>
> So it's not just a standards issue; this problem really *is* hurting us in
> adoption.
>
> --
> --Josh
>
> Josh Berkus
> PostgreSQL @ Sun
> San Francisco

Right. From a user's perspective 4) sounds best. I often run into problems having keywords as column-names:

andreak=# create table test(user varchar);
ERROR: syntax error at or near "user"
LINE 1: create table test(user varchar);
^
andreak=# create table test("user" varchar);
CREATE TABLE
andreak=# insert into test("USER") values('testuser');
ERROR: column "USER" of relation "test" does not exist
LINE 1: insert into test("USER") values('testuser');
^
andreak=# insert into test(user) values('testuser');
ERROR: syntax error at or near "user"
LINE 1: insert into test(user) values('testuser');
^
andreak=# insert into test("user") values('testuser');
INSERT 0 1

As you know, the only way of referring to the "user"-column is to qoute it in lowercase, which many apps and tools don't do.

--
Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Senior Software Developer / Manager


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <josh(at)agliodbs(dot)com>,<pgsql-hackers(at)postgresql(dot)org>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Identifier case folding notes
Date: 2008-07-08 21:33:26
Message-ID: 48739707.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>> Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> Actually, there are a number of *very* popular database tools,
particularly
> in the Java world (such as Netbeans and BIRT) which do mix quoted and

> unquoted identifiers. In general, users of those tools reject
PostgreSQL
> as "broken" for our nonstandard behavoir rather than trying to work
around
> it.

Do these tools expect an unquoted identifier to be treated according
to the standard? As I read it, an unquoted identifier should be
treated identically to the same identifier folded to uppercase and
wrapped in quotes, except that it will be guaranteed to be considered
an identifier, rather than possibly considered as a reserved word,
etc.

>From our perspective, we're OK with the status quo since we always
quote all identifiers. I don't think any of the suggestions would
bite us (if implemented bug-free) because we also forbid names which
differ only in capitalization. We help out our programmers by letting
them ignore quoting (except identifiers which are reserved words) and
capitalization when they write queries in our tool; we correct the
capitalization and wrap the identifiers in quotes as we generate the
Java query classes. Doing something like that in psql autocompletion
and in other PostgreSQL tools would be a nice feature, if practicable.

-Kevin


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-08 21:51:21
Message-ID: 4873E159.6070106@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> What I think would perhaps be worth investigating is a compile-time
> (or at latest initdb-time) option that flips the case folding behavior
> to SQL-spec-compliant and also changes all the built-in catalog entries
> to upper case. We would then have a solution we could offer to people
> who really need to run apps that depend on SQL-spec case folding ...
> and if the upper case hurts their eyes, or breaks some other apps that
> they wish they could run in the same DB, well it's their problem.

+1 for a compile-time option for spec-compliant behavior. Even
where the spec is stupid (timestamp with time zone literals) it'd
be nice to have the option; both for feature completeness
checklists and for teachers who want to teach targeting the spec.


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Ron Mayer" <rm_pg(at)cheapcomplexdevices(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Identifier case folding notes
Date: 2008-07-08 22:00:01
Message-ID: 48739D42.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>> Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> wrote:

> +1 for a compile-time option for spec-compliant behavior. Even
> where the spec is stupid (timestamp with time zone literals) it'd
> be nice to have the option; both for feature completeness
> checklists and for teachers who want to teach targeting the spec.

In my world it would be even more important for feature completeness
itself, and for production applications written to the spec for
portability. But, agreed: +1

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org, "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Subject: Re: Identifier case folding notes
Date: 2008-07-08 22:30:15
Message-ID: 17237.1215556215@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>> Actually, there are a number of *very* popular database tools,
>> particularly
>> in the Java world (such as Netbeans and BIRT) which do mix quoted and
>> unquoted identifiers.

> Do these tools expect an unquoted identifier to be treated according
> to the standard?

Unfortunately, they almost certainly don't. I'd bet long odds that
what they expect is mysql's traditional behavior, which is not even
within hailing distance of being spec compliant. (In a quick test,
it looks like mysql 5.0's default behavior is never to fold case
at all; and then there's the problem that they use the wrong kind
of quotes ...)

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-08 22:35:07
Message-ID: 17327.1215556507@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andreas Joseph Krogh <andreak(at)officenet(dot)no> writes:
> Right. From a user's perspective 4) sounds best. I often run into problems having keywords as column-names:

None of the proposals on the table will remove the need to use quotes in
that case.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-08 23:18:08
Message-ID: 4873F5B0.7030703@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut wrote:
> I have had some idle thoughts on the issue of identifier case folding. Maybe
> we can collect our ideas and inch closer to a solution sometime. Or we
> determine that it's useless and impossible, but then I can at least collect
> that result in the wiki and point future users to it.
>
> Background: PostgreSQL folds unquoted identifiers to lower case. SQL
> specifies folding to upper case, and most other SQL DBMS do it that way. If
> an application mixes referring to an object using quoted and unquoted
> versions of an identifier, then incompatibilities arise. We have always
> stressed to users that one should refer to an object always unquoted or
> always quoted. While that remains a good suggestion for a number of reasons,
> we have seen occasional complaints that some closed source applications
> violate this rule and therefore cannot be run on PostgreSQL.
>
> A number of solutions have been proposed over time, which I summarize here:
>
> 1. Change the lexer to fold to upper case, as it is supposed to do according
> to the SQL standard. This will break almost everything, because almost all
> built-in objects have lower-case names and thus couldn't be referred to
> anymore except by quoting. Changing the names of all the internal objects to
> upper-case names would involve vast code changes, probably break just as much,
> and make everything uglier. So this approach is unworkable.
>
> 2. Fold to upper case, but not when referring built-in objects. The lexer
> generally doesn't know what a name will refer to, so this is not possible to
> implement, at least without resorting to lots of hard-coding or horrid
> kludges. Also, a behavior like this will probably create all kinds of weird
> inconsistencies, resulting from putting catalog knowledge in the lexer.
>
> 3 and 4 -- Two variants of "ignore case altogether":
>
> 3. Fold all identifiers to lower case, even quoted ones. This would probably
> in fact fix the breakage of many of the above-mentioned problem applications,
> and it would in general be very easy to understand for a user. And it could
> be implemented in about three lines. One disadvantage is that one could no
> longer have objects that have names different only by case, but that is
> probably rare and incredibly stupid and can be neglected. The main
> disadvantage is that the case of identifiers and in particular column labels
> is lost. So applications and programming interfaces that look up result
> columns in a case-sensitive manner would fail. And things like SELECT expr AS
> "Nice Heading" won't work properly anymore.
>
> 4. Compare the "name" data type in a case-insensitive manner. This would
> probably address most problem cases. Again, you can't have objects with names
> different in case only. One condition to implementing this would be that this
> behavior would have be tied down globally at initdb, because it affects system
> indexes and shared catalogs. That might be impractical for some, because
> you'd need different instances for different behaviors, especially when you
> want to host multiple applications or want to port an affected application to
> the native PostgreSQL behavior over time.
>
> 5. One additional approach I thought of is that you swap the case of
> identifiers as you lex them (upper to lower, lower to upper), and then swap
> them back when you send them to the client. This needs a small change in the
> lexer, one for sending the RowDescription, and support in pg_dump and a few
> other places if desired. There will, however, be a number of weird, albeit
> self-imposed, side-effects. I have implemented a little test patch for this.
> It's weird, but it works in basic ways.
>
> Obviously, no solution will ever work completely. And we probably don't want
> such a solution, because it would create two different and incompatible
> PostgreSQL universes. If we are aiming for a solution that would allow most
> affected applications to hobble along, we would probably serve most users.
> Implementing some or all of 3, 4, and 5 would probably achieve that.
>
>

I'm not sure if you've read all the archive history on this. Here are
the pointers from the TODO list:

http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php
http://archives.postgresql.org/pgsql-hackers/2006-10/msg01527.php
http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php

The fact is that we have substantial groups of users who want different
things:
. current users who want no change so there is no breakage in existing apps
. users on other DBs who want Spec compliance
. users on yet other DBs who want case preservation

The last group should not be lightly dismissed - it is quite common
behaviour on MSSQL as well as on MySQL, so we have some incentive to
make this possible to encourage migration.

I'm strongly of the opinion therefore that this should be behaviour
determined at initdb time (can't make it later because of shared
catalogs). I suspect that we won't be able to do all this by simple
transformations in the lexer, unlike what we do now. But I do think it's
worth doing.

cheers

andrew


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Identifier case folding notes
Date: 2008-07-08 23:19:10
Message-ID: 87y74cotg1.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Peter Eisentraut" <peter_e(at)gmx(dot)net> writes:

> One disadvantage is that one could no longer have objects that have names
> different only by case, but that is probably rare and incredibly stupid and
> can be neglected.

Certainly not if you hope to claim being within a mile of spec -- which seems
like the only point of fiddling with this. Breaking this would take as further
from spec-compliance than we are today.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-09 00:29:35
Message-ID: 4874066F.5060206@pws.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan wrote:
> I'm not sure if you've read all the archive history on this. Here are
> the pointers from the TODO list:
>
> http://archives.postgresql.org/pgsql-hackers/2004-04/msg00818.php
> http://archives.postgresql.org/pgsql-hackers/2006-10/msg01527.php
> http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php
>
>
> The fact is that we have substantial groups of users who want
> different things:
> . current users who want no change so there is no breakage in existing
> apps
> . users on other DBs who want Spec compliance
> . users on yet other DBs who want case preservation
>
> The last group should not be lightly dismissed - it is quite common
> behaviour on MSSQL as well as on MySQL, so we have some incentive to
> make this possible to encourage migration.
>
> I'm strongly of the opinion therefore that this should be behaviour
> determined at initdb time (can't make it later because of shared
> catalogs). I suspect that we won't be able to do all this by simple
> transformations in the lexer, unlike what we do now. But I do think
> it's worth doing.
>
> cheers
>
> andrew
>
Hi,

as part of the
http://archives.postgresql.org/pgsql-hackers/2008-03/msg00849.php
thread, I did a reasonable amount of discovery work on making the 3
options andrew presents a reality. As I'm not skilled enough I never
got far enough to make them all work at once. I did however get lower
case and case preservation working.

To make those tow work the catalogs need no changes. Some of the
regressions tests expect case folding, so they did need changing to
operate correctly. I was unable to make the input files to initdb
correctly fold the case of system catalogs for the upper case version.
I'm sure somebody with more experience would not find it as difficult as
I did. Function names tended to be where all the gotchas were. Count()
vs count() vs COUNT() for example.

Once the db was up and running, the issue becomes all the supporting
tools. psql was made to autocomplete with case preservation, I was
going to make pg_dump just quote everything. I then got to the point of
adding a fixed GUC like LC_LOCALE that allows psql to read the case
folding situation and act according. That is where my progress ended.

Attached is what i had worked in. It's a patch against 8.3.1. I know
it's not CVS head, but it is what I was using at the time to experiment.

Regards

Russell

Attachment Content-Type Size
case-folding-patch-WIP.diff text/x-diff 35.6 KB

From: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Identifier case folding notes
Date: 2008-07-09 07:42:18
Message-ID: 200807090942.18571.andreak@officenet.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wednesday 09 July 2008 00:35:07 Tom Lane wrote:
> Andreas Joseph Krogh <andreak(at)officenet(dot)no> writes:
> > Right. From a user's perspective 4) sounds best. I often run into problems having keywords as column-names:
>
> None of the proposals on the table will remove the need to use quotes in
> that case.

I know, but then tools/frameworks won't fail when they produce queries like SELECT "USER" FROM test, because it sends "USER" and not "user".

--
Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Senior Software Developer / Manager
------------------------+---------------------------------------------+
OfficeNet AS | The most difficult thing in the world is to |
Karenslyst Allé 11 | know how to do a thing and to watch |
PO. Box 529 Skøyen | somebody else doing it wrong, without |
0214 Oslo | comment. |
NORWAY | |
Tlf: +47 24 15 38 90 | |
Fax: +47 24 15 38 91 | |
Mobile: +47 909 56 963 | |
------------------------+---------------------------------------------+


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Identifier case folding notes
Date: 2008-07-09 07:59:30
Message-ID: 200807090959.33999.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Mittwoch, 9. Juli 2008 schrieb Gregory Stark:
> "Peter Eisentraut" <peter_e(at)gmx(dot)net> writes:
> > One disadvantage is that one could no longer have objects that have names
> > different only by case, but that is probably rare and incredibly stupid
> > and can be neglected.
>
> Certainly not if you hope to claim being within a mile of spec

Which I don't.

> -- which seems like the only point of fiddling with this.

No, the point is making more applications run.

> Breaking this would take as further from spec-compliance than we are today.

As far as standards compliance goes, I don't think there is a well-defined
distance measure, but if you can come up with one, I would actually be quite
keen on debating this point. :-) In any case, any of this would only be an
option anyway.


From: Tino Wildenhain <tino(at)wildenhain(dot)de>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Identifier case folding notes
Date: 2008-07-09 10:32:18
Message-ID: 487493B2.90506@wildenhain.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Peter Eisentraut wrote:
...
> 4. Compare the "name" data type in a case-insensitive manner. This would
> probably address most problem cases. Again, you can't have objects with names
> different in case only. One condition to implementing this would be that this
> behavior would have be tied down globally at initdb, because it affects system
> indexes and shared catalogs. That might be impractical for some, because
> you'd need different instances for different behaviors, especially when you
> want to host multiple applications or want to port an affected application to
> the native PostgreSQL behavior over time.

The whole stuff as I understand is to fix the behavior with applications
creating objects without quotes and accessing them "QUOTEDUPPERCASE"?

Would a small script fixing the schema by using rename not fix this for
many applications?

T.


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tino Wildenhain" <tino(at)wildenhain(dot)de>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Identifier case folding notes
Date: 2008-07-09 11:42:15
Message-ID: 873amjthbc.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Tino Wildenhain" <tino(at)wildenhain(dot)de> writes:

> The whole stuff as I understand is to fix the behavior with applications
> creating objects without quotes and accessing them "QUOTEDUPPERCASE"?
>
> Would a small script fixing the schema by using rename not fix this for
> many applications?

Well there are other circumstances where this can arise

select "FOO" from (select col as foo from bar)

Not to mention temporary objects which are created and dropped inside
functions.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: Identifier case folding notes
Date: 2008-07-09 17:51:36
Message-ID: 4874FAA8.9060701@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

> Unfortunately, they almost certainly don't. I'd bet long odds that
> what they expect is mysql's traditional behavior,

Nope. They're looking for Oracle, which is spec-complaint since they
wrote that spec.

--Josh