Re: New to_timestamp implementation is pretty strict

Lists: pgsql-hackers
From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: New to_timestamp implementation is pretty strict
Date: 2008-12-01 12:08:07
Message-ID: 4933D3A7.702@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I like strict in general, but this doesn't seem logical:

postgres=# SELECT to_timestamp('29-12-2005 01:2:03', 'DD-MM-YYYY
HH24:MI:SS'); -- works
to_timestamp
------------------------
2005-12-29 01:02:03+02
(1 row)

postgres=# SELECT to_timestamp('29-12-2005 01:02:3', 'DD-MM-YYYY
HH24:MI:SS'); -- doesn't work
ERROR: source string too short for "SS" formatting field
DETAIL: Field requires 2 characters, but only 1 remain.
HINT: If your source string is not fixed-width, try using the "FM"
modifier.

I think the end of string should be treated like a field separator,
colon in this example, and we should accept both of the above. Opinions?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 14:45:21
Message-ID: F456DC32-06EB-4546-8683-7F0BE4E55B1F@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec 1, 2008, at 1:08 PM, Heikki Linnakangas wrote:

> postgres=# SELECT to_timestamp('29-12-2005 01:02:3', 'DD-MM-YYYY
> HH24:MI:SS'); -- doesn't work
> ERROR: source string too short for "SS" formatting field
> DETAIL: Field requires 2 characters, but only 1 remain.
> HINT: If your source string is not fixed-width, try using the "FM"
> modifier.
>
> I think the end of string should be treated like a field separator,
> colon in this example, and we should accept both of the above.
> Opinions?

I'm generally in favor of being generous in the input one can accept,
but in this case it seems ambiguous to me. Is that supposed to be :30
or :03? There's no way to tell.

My $0.02.

Best,

David


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 14:52:21
Message-ID: 17683.1228143141@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"David E. Wheeler" <david(at)kineticode(dot)com> writes:
> On Dec 1, 2008, at 1:08 PM, Heikki Linnakangas wrote:
>> I think the end of string should be treated like a field separator,
>> colon in this example, and we should accept both of the above.
>> Opinions?

> I'm generally in favor of being generous in the input one can accept,
> but in this case it seems ambiguous to me. Is that supposed to be :30
> or :03? There's no way to tell.

But notice that we are allowing a single digit for the hour and minute
fields. It's inconsistent that the last field works differently.
(And it is that it's the last field, not that it's SS --- try minutes
as the last field.)

regards, tom lane


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 14:55:12
Message-ID: 937d27e10812010655h971b1cbq84d8d08e35adc21c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 1, 2008 at 2:45 PM, David E. Wheeler <david(at)kineticode(dot)com> wrote:
> On Dec 1, 2008, at 1:08 PM, Heikki Linnakangas wrote:
>
>> postgres=# SELECT to_timestamp('29-12-2005 01:02:3', 'DD-MM-YYYY
>> HH24:MI:SS'); -- doesn't work
>> ERROR: source string too short for "SS" formatting field
>> DETAIL: Field requires 2 characters, but only 1 remain.
>> HINT: If your source string is not fixed-width, try using the "FM"
>> modifier.
>>
>> I think the end of string should be treated like a field separator, colon
>> in this example, and we should accept both of the above. Opinions?
>
> I'm generally in favor of being generous in the input one can accept, but in
> this case it seems ambiguous to me. Is that supposed to be :30 or :03?
> There's no way to tell.

How is it ambiguous? The leading zero is technically redundant. A
trailing on most certainly isn't.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 14:56:52
Message-ID: 7E82016B-4E21-4DED-943F-339A3FD3B02C@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec 1, 2008, at 3:52 PM, Tom Lane wrote:

>> I'm generally in favor of being generous in the input one can accept,
>> but in this case it seems ambiguous to me. Is that supposed to be :30
>> or :03? There's no way to tell.
>
> But notice that we are allowing a single digit for the hour and minute
> fields. It's inconsistent that the last field works differently.
> (And it is that it's the last field, not that it's SS --- try minutes
> as the last field.)

Oh, well yeah, it should be consistent. But I'm still not sure that :3
should be allowed. OTOH, who does that, anyway?

Best,

David


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: "Dave Page" <dpage(at)pgadmin(dot)org>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:02:59
Message-ID: C28D0F97-787C-44C0-86CD-E26680EEC476@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec 1, 2008, at 3:55 PM, Dave Page wrote:

>> I'm generally in favor of being generous in the input one can
>> accept, but in
>> this case it seems ambiguous to me. Is that supposed to be :30 or :
>> 03?
>> There's no way to tell.
>
> How is it ambiguous? The leading zero is technically redundant. A
> trailing on most certainly isn't.

it depends on how you look at it, I suppose. If you look at ":xy" as
"x" being the 10s position and "y" being the 1s position, it makes no
sense. If you look at it as an integer, it does.

Best,

David


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:07:47
Message-ID: 20081201150747.GD4459@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David E. Wheeler wrote:

> Oh, well yeah, it should be consistent. But I'm still not sure that :3
> should be allowed. OTOH, who does that, anyway?

Anyone who prints times as %d:%d:%d. You can find those in the wild.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Dave Page" <dpage(at)pgadmin(dot)org>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:09:23
Message-ID: 937d27e10812010709u5a290f70kf47f90a70b8d487a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 1, 2008 at 3:02 PM, David E. Wheeler <david(at)kineticode(dot)com> wrote:

> it depends on how you look at it, I suppose. If you look at ":xy" as "x"
> being the 10s position and "y" being the 1s position, it makes no sense.

Suffice it to say, I don't look at it that way :-). I'd wager most
people wouldn't either, but I have no data to back that up of course.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:13:41
Message-ID: 5377C5A3-68DB-4BCE-B197-7435A82BEFBE@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec 1, 2008, at 4:07 PM, Alvaro Herrera wrote:

> David E. Wheeler wrote:
>
>> Oh, well yeah, it should be consistent. But I'm still not sure
>> that :3
>> should be allowed. OTOH, who does that, anyway?
>
> Anyone who prints times as %d:%d:%d. You can find those in the wild.

I guess I should have expected that. Sheesh.

Best,

David


From: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:14:18
Message-ID: 6F541CCC-FC8C-4278-B31E-0DDF6DD6A192@kineticode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Dec 1, 2008, at 4:09 PM, Dave Page wrote:

> On Mon, Dec 1, 2008 at 3:02 PM, David E. Wheeler
> <david(at)kineticode(dot)com> wrote:
>
>> it depends on how you look at it, I suppose. If you look at ":xy"
>> as "x"
>> being the 10s position and "y" being the 1s position, it makes no
>> sense.
>
> Suffice it to say, I don't look at it that way :-). I'd wager most
> people wouldn't either, but I have no data to back that up of course.

Yeah, I could see that. It makes no sense to me (":3" just looks
weird), but maybe I just think too much like a computer. ;-)

Best,

David


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Dave Page <dpage(at)pgadmin(dot)org>
Cc: "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:26:07
Message-ID: 4934020F.3060901@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dave Page wrote:
> On Mon, Dec 1, 2008 at 3:02 PM, David E. Wheeler <david(at)kineticode(dot)com> wrote:
>
>
>> it depends on how you look at it, I suppose. If you look at ":xy" as "x"
>> being the 10s position and "y" being the 1s position, it makes no sense.
>>
>
> Suffice it to say, I don't look at it that way :-). I'd wager most
> people wouldn't either, but I have no data to back that up of course.
>
>
>

Isn't the point that ambiguity is undesirable, as is inconsistency? So
counts of people who see this one way or the other should be irrelevant.

Alvaro noted the use in the wild of formats like "%d:%d:%d" for times.
IMNSHO we should not cater to such bad code.

cheers

andrew

cheers


From: "Greg Stark" <stark(at)enterprisedb(dot)com>
To: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc: "Dave Page" <dpage(at)pgadmin(dot)org>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:33:28
Message-ID: 4136ffa0812010733v7c337779l47e6ec6a0e566002@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

How would you parse an input format of just 'SS' ? is there something
ambiguous about '3' ? I don't see anything "bad" about using %d to
output an integer number of seconds.

--
greg


From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Dave Page" <dpage(at)pgadmin(dot)org>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:46:49
Message-ID: 603c8f070812010746ufa66e15n8e7a8a09149c94e5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 1, 2008 at 10:33 AM, Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> How would you parse an input format of just 'SS' ? is there something
> ambiguous about '3' ? I don't see anything "bad" about using %d to
> output an integer number of seconds.

+1.

It seems to me that it's pretty silly to say that we "know" that the 2
in "01:2:03" is intended to mean 02, but we are somehow confused about
whether the 3 in "01:02:3" is intended to mean 03 or 30. Sure, the
latter could be the result of a truncation, but if the user is
randomly truncating their strings, they're going to have problems with
a lot more than to_timestamp().

...Robert


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Dave Page <dpage(at)pgadmin(dot)org>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 15:50:35
Message-ID: 493407CB.5090403@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark wrote:
> How would you parse an input format of just 'SS' ? is there something
> ambiguous about '3' ? I don't see anything "bad" about using %d to
> output an integer number of seconds.
>
>

The docs say that SS corresponds to "second (00-59)", so clearly it
should expect a two digit zero padded number.

What's so hard about using "%0.2d" ?

cheers

andrew


From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Greg Stark" <stark(at)enterprisedb(dot)com>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Dave Page" <dpage(at)pgadmin(dot)org>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 16:05:59
Message-ID: 603c8f070812010805y1e390982nab6fdafdd918f776@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Another point here is that we have always accepted single digits in dates:

portal=> select '2008-11-1'::date;
date
------------
2008-11-01
(1 row)

portal=> select '2008-1-11'::date;
date
------------
2008-01-11
(1 row)

If we're going to handle dates and timestamps inconsistently, there
should be a good reason for it.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Greg Stark" <stark(at)enterprisedb(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Dave Page" <dpage(at)pgadmin(dot)org>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 17:22:45
Message-ID: 23118.1228152165@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Robert Haas" <robertmhaas(at)gmail(dot)com> writes:
> Another point here is that we have always accepted single digits in dates:

Yeah, but that's the general datetime input code, which has rather
different goals than to_timestamp().

After thinking about it I'm inclined to feel that SS and friends should
insist on exactly 2 digits. If you want to allow 1-or-2-digits then use
FMSS, just like the error message tells you. (However, I have a vague
feeling that Oracle doesn't insist on this, and in the end we ought to
follow Oracle's behavior. Can anyone check?)

In any case, it's certainly broken that the last field behaves
differently from not-last fields. I'm not all that set on whether we
insist on two digits or not, but I do think the inconsistency needs
to be fixed.

regards, tom lane


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 18:46:41
Message-ID: 49343111.8070300@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas wrote:
> On Mon, Dec 1, 2008 at 10:33 AM, Greg Stark <stark(at)enterprisedb(dot)com> wrote:
>> How would you parse an input format of just 'SS' ? is there something
>> ambiguous about '3' ? I don't see anything "bad" about using %d to
>> output an integer number of seconds.
>
> +1.
>
> It seems to me that it's pretty silly to say that we "know" that the 2
> in "01:2:03" is intended to mean 02, but we are somehow confused about
> whether the 3 in "01:02:3" is intended to mean 03 or 30.

Yep. It's a fair argument that we shouldn't accept either, but the
inconsistency is just wrong. I've committed a patch fixing the
inconsistency, by allowing "01:02:3".

Now whether we should forbid both, my opinion is that we shouldn't; that
would just unnecessarily brake old applications, and I don't think
there's much danger of ambiguity in what "01:2:03" means.

For better or worse, we also allow these more questionable inputs:

postgres=# SELECT to_timestamp('2008/-3/01', 'YYYY/MM/DD');
to_timestamp
------------------------
2007-09-01 00:00:00+03
(1 row)

postgres=# SELECT to_timestamp('2008--3-01', 'YYYY-MM-DD');
to_timestamp
------------------------
2007-09-01 00:00:00+03
(1 row)

postgres=# SELECT to_timestamp('2008-03', 'YYYY-MM-DD');
to_timestamp
------------------------
2008-03-01 00:00:00+02
(1 row)

postgres=# SELECT to_timestamp('2008-03-04-foobar', 'YYYY-MM-DD');
to_timestamp
------------------------
2008-03-04 00:00:00+02
(1 row)

The argument for rejecting these is stronger, IMHO, but given that we
allowed these in previous releases as well, I don't think we try to
forbid them either.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Robert Haas" <robertmhaas(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: "Greg Stark" <stark(at)enterprisedb(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Dave Page" <dpage(at)pgadmin(dot)org>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-01 19:00:50
Message-ID: 603c8f070812011100j4d6b990ftd98fbd531f635123@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> For better or worse, we also allow these more questionable inputs:

Wow. Those are all pretty atrocious.

Even so, it's not clear to me that there's a lot of merit to changing
the behavior. If to_timestamp() isn't rigorous enough, you can always
stick some additional error checking in front of it; it's easy to
write a regular expression that will only match EXACTLY YYYY-MM-DD if
that's what you want to do. If to_timestamp() is excessively
pedantic, it forces you into rewriting to_timestamp(), which is a lot
more work. I probably still wouldn't make it accept anything quite
as... creative... as these examples if starting over, but now that the
existing version is out there, I think breaking backward compatibility
isn't warranted.

...Robert


From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-02 04:48:56
Message-ID: 37ed240d0812012048p6c7bec3fgda7c16b335af3a7e@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 1, 2008 at 11:08 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> postgres=# SELECT to_timestamp('29-12-2005 01:02:3', 'DD-MM-YYYY
> HH24:MI:SS'); -- doesn't work
...
> I think the end of string should be treated like a field separator, colon in
> this example, and we should accept both of the above. Opinions?
>

(With apologies for being late to the thread)

I can agree that it is inconsistent to treat separator characters,
like colons and hyphens, differently to the end of the string. If
we're allowing separators to imply variable-width nodes, then the end
of the string should do so as well, so I think Heikki did the right
thing here.

However, I actually think that, ultimately, Heikki's example above
*should* be rejected, and so should any other attempt to provide a
value of the wrong width, even if there are separators present, unless
the user has specified the 'FM' fill mode flag.

Heikki's example shows a too-short string for SS, but what about one
that is too long? Should we accept

# to_timestamp('29122005 0102333', 'DDMMYYYY HH24MISS')

As meaning three hundred and thirty-three seconds? I would argue we
shouldn't; it's most likely that the user made an error, so the right
thing to do is throw an exception and give them an opportunity to fix
it. Making guesses about the user's intention when the input is
heavily ambiguous isn't a fun game to be playing, for us or for the
user.

Given the contrary arguments (backwards- and Oracle- compatibility,
mostly) I decided that was too much for me to bite off in my patch.
That was the same reason I didn't fiddle with the treatment of
end-of-string; 8.3 didn't treat it as a separator either. The
different is that, although it always treated the final field as
fixed-width, prior to my patch it didn't actually throw an error when
fixed-width fields were too short.

Cheers,
BJ


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)enterprisedb(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dave Page <dpage(at)pgadmin(dot)org>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New to_timestamp implementation is pretty strict
Date: 2008-12-02 08:19:42
Message-ID: 4934EF9E.7040600@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> After thinking about it I'm inclined to feel that SS and friends should
> insist on exactly 2 digits. If you want to allow 1-or-2-digits then use
> FMSS, just like the error message tells you. (However, I have a vague
> feeling that Oracle doesn't insist on this, and in the end we ought to
> follow Oracle's behavior. Can anyone check?)

Oracle doesn't insist on it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com