Re: Incorrect "invalid AM/PM string" error from to_timestamp

Lists: pgsql-bugs
From: "Joshua Tolley" <eggyknap(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-25 16:05:17
Message-ID: e7e0a2570809250905h2ad60d7cj33a490816341538b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

I'm running this on 8.4devel built from CVS HEAD as of 9:44 AM Moutain
Daylight time today, on a 32-bit Ubuntu 7.04 box. This is a completely
new database.

jtolley(at)polonium:~/devel/raybo$ psql raybo -f test.sql
CREATE TABLE
INSERT 0 1
INSERT 0 1
psql:test.sql:21: ERROR: invalid AM/PM string
psql:test.sql:22: ERROR: invalid AM/PM string

The script in question is below:

CREATE TABLE ft (
id1 numeric(13,0),
am1 numeric(13,2),
am1 numeric(13,2),
ct timestamp without time zone
);

-- -- The commands below come from a dataset I was loading. While
trying to reduce the problem to a small test case, I found that
various combinations of
-- -- the available columns showed the error, and others did not. The
commands below are one set that don't display the error.
-- INSERT INTO ft (am1, am2, ct) VALUES (45749.25, 45749.25,
to_timestamp('21-SEP-08 03.26.04.378954 AM', 'DD-MON-YY HH.MI.SS.US
AM'));
-- INSERT INTO ft (am1, am2, ct) VALUES (60, 60,
to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US
AM'));
-- INSERT INTO ft (am1, am2, ct) VALUES (46.17, 46.17,
to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US
AM'));
-- INSERT INTO ft (am1, am2, ct) VALUES (61.55, 61.55,
to_timestamp('21-SEP-08 03.26.04.350738 AM', 'DD-MON-YY HH.MI.SS.US
AM'));

-- -- This set of commands shows the error. Note that running these
statements in a different order, or inserting other commands into the
statements, or other
-- -- such modifications may very well prevent the error from occurring.
INSERT INTO ft (id1, am1, ct) VALUES (6924257, 45749.25,
to_timestamp('21-SEP-08 03.26.04.378954 AM', 'DD-MON-YY HH.MI.SS.US
AM'));
INSERT INTO ft (id1, am1, ct) VALUES (6924611, 60,
to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US
AM'));
INSERT INTO ft (id1, am1, ct) VALUES (6924612, 46.17,
to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US
AM'));
INSERT INTO ft (id1, am1, ct) VALUES (6924254, 61.55,
to_timestamp('21-SEP-08 03.26.04.350738 AM', 'DD-MON-YY HH.MI.SS.US
AM'));

-- Josh Tolley / eggyknap


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joshua Tolley" <eggyknap(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, "Brendan Jurd" <direvus(at)gmail(dot)com>
Subject: Re: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-25 16:22:31
Message-ID: 8148.1222359751@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Joshua Tolley" <eggyknap(at)gmail(dot)com> writes:
> I'm running this on 8.4devel built from CVS HEAD as of 9:44 AM Moutain
> Daylight time today, on a 32-bit Ubuntu 7.04 box. This is a completely
> new database.

Ugh, apparently there's some state-dependent bug in the recent
to_timestamp fixes:

regression=# \c -
psql (8.4devel)
You are now connected to database "regression".
regression=# select to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US AM');
to_timestamp
------------------------------
2008-09-21 08:15:42.95062-04
(1 row)

regression=# select to_timestamp('21-SEP-08 08.15.42.950620 PM', 'DD-MON-YY HH.MI.SS.US AM');
ERROR: invalid AM/PM string
regression=#

Yes, those are the same command. The error is repeatable on subsequent
executions in the same session.

A likely bet is that this is caused by use of uninitialized memory,
which happens to have garbage rather than zeroes in it the second
time through.

Brendan, do you have time to look into this?

BTW, the error message is pretty unhelpful: ISTM it really ought to show
the value it's complaining of.

regards, tom lane


From: "Alex Hunsaker" <badalex(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Joshua Tolley" <eggyknap(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org, "Brendan Jurd" <direvus(at)gmail(dot)com>
Subject: Re: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-25 22:05:04
Message-ID: 34d269d40809251505r6ac84315ncde79e884bbb5fae@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Sep 25, 2008 at 10:22 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> A likely bet is that this is caused by use of uninitialized memory,
> which happens to have garbage rather than zeroes in it the second
> time through.

Yep both DCH_MC and DCH_US were going past the end of the string
because they still added the length of the string where
from_char_parse_int_len takes care of that for us now...

The attach patch fixes it and tries to improve the "invalid AM/PM
string" a bit by showing the string.


From: "Alex Hunsaker" <badalex(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Joshua Tolley" <eggyknap(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org, "Brendan Jurd" <direvus(at)gmail(dot)com>
Subject: Re: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-25 22:07:23
Message-ID: 34d269d40809251507y4eec6ad7s641c20d9b62ed5a4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Thu, Sep 25, 2008 at 4:05 PM, Alex Hunsaker <badalex(at)gmail(dot)com> wrote:
> On Thu, Sep 25, 2008 at 10:22 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> A likely bet is that this is caused by use of uninitialized memory,
>> which happens to have garbage rather than zeroes in it the second
>> time through.
>
> Yep both DCH_MC and DCH_US were going past the end of the string
> because they still added the length of the string where
> from_char_parse_int_len takes care of that for us now...
>
> The attach patch fixes it and tries to improve the "invalid AM/PM
> string" a bit by showing the string.

[Actually attaches the patch...]

Attachment Content-Type Size
fix_to_timestamp.patch application/octet-stream 1.5 KB

From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: "Alex Hunsaker" <badalex(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua Tolley" <eggyknap(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-26 14:04:57
Message-ID: 37ed240d0809260704m2a642230sc4280e583d3cbff1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Sep 26, 2008 at 8:05 AM, Alex Hunsaker <badalex(at)gmail(dot)com> wrote:
> On Thu, Sep 25, 2008 at 10:22 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> A likely bet is that this is caused by use of uninitialized memory,
>> which happens to have garbage rather than zeroes in it the second
>> time through.
>
> Yep both DCH_MC and DCH_US were going past the end of the string
> because they still added the length of the string where
> from_char_parse_int_len takes care of that for us now...
>

Nice catch. I think your patch does the right thing here by pulling
"len" out of those lines which advance the pointer after calling
from_char_parse_int_len.

> The attach patch fixes it and tries to improve the "invalid AM/PM
> string" a bit by showing the string.
>

Not so sure about this part ... because it just spits out the variable
"s", it will show whatever is left in the string at the time the macro
is called. That works okay when the AM/PM string is at the end of the
pattern, but ends up looking pretty weird otherwise:

postgres=# select to_timestamp('11:47 Pm 26 Sep 2008', 'HH:MI AM DD Mon YYYY');
ERROR: invalid AM/PM string for 'Pm 26 Sep 2008'

I have some thoughts on this and other issues surrounding AM/PM, but
perhaps they are better reserved for a separate thread. Might I
suggest we apply Alex's bugfix and hold off on the message changes
pending further discussion?

Cheers,
BJ


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brendan Jurd" <direvus(at)gmail(dot)com>
Cc: "Alex Hunsaker" <badalex(at)gmail(dot)com>, "Joshua Tolley" <eggyknap(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-26 14:11:19
Message-ID: 16029.1222438279@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Brendan Jurd" <direvus(at)gmail(dot)com> writes:
> I have some thoughts on this and other issues surrounding AM/PM, but
> perhaps they are better reserved for a separate thread. Might I
> suggest we apply Alex's bugfix and hold off on the message changes
> pending further discussion?

Agreed on separating the message issue. What I wanted to know was
whether there are similar bugs elsewhere in to_timestamp, or whether
you're pretty sure this is the only occurrence of the coding pattern?
I've always found it helpful to think "where else did we make this
same mistake" after identifying a bug.

regards, tom lane


From: "Joshua Tolley" <eggyknap(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Brendan Jurd" <direvus(at)gmail(dot)com>, "Alex Hunsaker" <badalex(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-26 14:15:15
Message-ID: e7e0a2570809260715t2ac3d737md8497b33f84d192c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Fri, Sep 26, 2008 at 8:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Brendan Jurd" <direvus(at)gmail(dot)com> writes:
>> I have some thoughts on this and other issues surrounding AM/PM, but
>> perhaps they are better reserved for a separate thread. Might I
>> suggest we apply Alex's bugfix and hold off on the message changes
>> pending further discussion?
>
> Agreed on separating the message issue. What I wanted to know was
> whether there are similar bugs elsewhere in to_timestamp, or whether
> you're pretty sure this is the only occurrence of the coding pattern?
> I've always found it helpful to think "where else did we make this
> same mistake" after identifying a bug.
>
> regards, tom lane
>

I guess we know this already, but for the sake of trying to appear
involved, I've confirmed this fixes the problem as I was seeing it. +1
for the message being a bit more specific, though it's a definite
improvement as is.

- Josh

PS> My apologies for sending this directly to Tom instead of to the
list the first time I sent it.


From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alex Hunsaker" <badalex(at)gmail(dot)com>, "Joshua Tolley" <eggyknap(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-26 14:35:26
Message-ID: 37ed240d0809260735t70e65e53i4154b0e1f6992380@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

On Sat, Sep 27, 2008 at 12:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Agreed on separating the message issue. What I wanted to know was
> whether there are similar bugs elsewhere in to_timestamp, or whether
> you're pretty sure this is the only occurrence of the coding pattern?

I'm pretty sure that "MS" and "US" were the only occurrences of this
particular coding pattern, and Alex's fix hits both of those.

I've just reviewed all of the sites in DCH_from_char() which advance
the pointer (with Alex's patch applied). All have the standard "s +=
SKIP_THth(n->suffix)", with the single exception of "Y,YYY", which
needs its own special logic.

Cheers,
BJ


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Brendan Jurd" <direvus(at)gmail(dot)com>
Cc: "Alex Hunsaker" <badalex(at)gmail(dot)com>, "Joshua Tolley" <eggyknap(at)gmail(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Incorrect "invalid AM/PM string" error from to_timestamp
Date: 2008-09-26 15:36:08
Message-ID: 17771.1222443368@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

"Brendan Jurd" <direvus(at)gmail(dot)com> writes:
> On Sat, Sep 27, 2008 at 12:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Agreed on separating the message issue. What I wanted to know was
>> whether there are similar bugs elsewhere in to_timestamp, or whether
>> you're pretty sure this is the only occurrence of the coding pattern?

> I'm pretty sure that "MS" and "US" were the only occurrences of this
> particular coding pattern, and Alex's fix hits both of those.

OK, applied (without the message change, pending further discussion).

regards, tom lane