Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp)

From: "Brendan Jurd" <direvus(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alex Hunsaker" <badalex(at)gmail(dot)com>
Subject: Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp)
Date: 2008-09-26 17:25:48
Message-ID: 37ed240d0809261025j70a64981s9d8325569a69af67@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hey folks,

Tom pointed out in the parent thread [1] that the error message for
bogus AM/PM markers in to_timestamp is pretty lame:

ERROR: invalid AM/PM string

I agree, and once I started thinking about this, I came up with other
gripes concerning the treatment of 12-hour time in to_timestamp, the
use of the meridiem markers (AM/PM) in particular.

Currently, Postgres accepts four separate flavours for specifying
meridiem markers, given by uppercase/lowercase and with/without
periods:

* am/pm
* AM/PM
* a.m./p.m.
* A.M./P.M.

What I find surprising about the implementation is that you must
specify the correct formatting pattern for the particular flavour of
meridiem marker in your input string, or you get the aforementioned
lame error message. Consider:

postgres=# select to_timestamp('11:47 pm 27 Sep 2008', 'HH:MI PM DD Mon YYYY');
ERROR: invalid AM/PM string

Here it seems to me that Postgres is being unnecessarily pedantic.
Yes, the case of the marker differs from the case of the formatting
keyword, but is that really grounds for an ERROR? The user's
intention is perfectly unambiguous, so why not just accept the input?

I would go so far as to say that we should accept any of the 8 valid
meridiem markers, regardless of which flavour is indicated by the
formatting keyword.

Day and month names already work this way. We don't throw an error if
a user specifies a mixed-case month name like "Sep" but uses the
uppercase formatting keyword "MON".

I suspect that the current behaviour isn't so much a deliberate choice
to be draconian as it is a consequence of the way the code was
developed; as the inverse of to_char().

One way to tidy this up would be to re-implement the meridiem markers
using the seq_search functions, i.e., make it work like the day and
month names. This would make it easy to accept any flavour of marker,
and the error messages thrown for bogus input would then be the same
as those for bogus day and month names.

Note that all of the above applies equally to the era markers B.C. and A.D.

Comments?

Cheers,
BJ

[1] http://archives.postgresql.org/message-id/8148.1222359751@sss.pgh.pa.us

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-09-26 17:36:29 Re: Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp)
Previous Message Stefan Kaltenbrunner 2008-09-26 17:06:31 Re: lock contention on parallel COPY ?