Re: Bringing PostgreSQL torwards the standard regarding

From: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
To: <db(at)zigo(dot)dhs(dot)org>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bringing PostgreSQL torwards the standard regarding
Date: 2004-04-26 06:34:58
Message-ID: 3320.24.211.141.25.1082961298.squirrel@www.dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dennis Bjorklund said:
> On Sun, 25 Apr 2004, Andrew Dunstan wrote:
>
>> >> Why do you want two names? Just keep the original casing, and a
>> >> boolean saying if it's quoted or not.
>>
>> Sorry - brain malfunction - yes, original casing plus boolean would
>> work. In effect you could derive the canonical form from those two.
>

Dennis,

Ideas still swirling a bit, but I was thinking that there would be a per
database flag (which could indeed be set at db creation time) which would
specify the flavor of canonical names being used - upper, or lower, or we
could also consider exact (i.e. full case sensitivity, which I seem to
recall is a mode that SQLServer allows, possibly even the default, but my
memory could be rusty).

The canonical form of an unquoted name is dictated by this setting, while
the canonical form of a quoted name is the name as supplied. Two names
clash if their canonical forms are identical, quoted or not.

Assuming that we have a database with the flag set to use upper case
canonical names, as per the standard, then ...

> Say that you have this in the table with the identifier
>
> name quoted
> ---- ------
> Foo False
>
> Now you want to add the name "FOO"
>
> FOO True
>
> should you be allowed or is it a clash with the above?

It's a clash. The canonical for of both is "FOO"

>
> What if you also add "foo"
>
> foo True
>

No clash - "FOO" <> "foo"

> One of these two should be forbidden. And what about a quoted "FOO":
>
> FOO False
> FOO True
>

clash

> This case says it is not enough with an expressional unique index on
> (upper(name), quoted). It would be easier to enforce uniqueness if one
> store both the converted name and the original name:
>

The constraint would in effect be on CASE WHEN quoted THEN name ELSE upper
(name) END.

The advantage of using a boolean is that a lot less work would need to be
done to use whatever flag was being used for the DB. Possibly a reindex
after the files are copied. It might fail on some highly pathological
cases, but should never fail on our standard template databases.

> name orig_name
> ---- ---------
> FOO NULL <-- quoted one
> FOO FOO <-- unquoted one
>
> and the first case
>
> FOO Foo <-- unquoted
> FOO NULL <-- clashes with the first, good foo
> NULL <-- no clash, works fine
>
> With this one can always use upper case translation as per sql spec and
> psql can optionally show all unquoted identifiers as upper, lower or
> mixed case.
>

My thought was that there would be a user setting that would allow
resultset labels to use either canonical or literal names.

> Then we also have the INFORMATION_SCHEMA that should show the names in
> UPPER CASE when not quoted, this since applications that are written
> for the standard might depend on that (probably no application do today
> but it would be a valid case of use of the information schema).
>

I see 2 possibilities: either use the upper case canonical setting I
envisioned above, or change the information schema setup to force upper
case labels via AS clauses in the views.

cheers

andrew

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dennis Bjorklund 2004-04-26 06:42:14 Re: Bringing PostgreSQL torwards the standard regarding
Previous Message wespvp 2004-04-26 05:25:36 Re: thread_test.c problems