Re: Timestamp input + copy

Lists: pgsql-general
From: "Kevin Bartz" <kbartz(at)loyaltymatrix(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Timestamp input + copy
Date: 2004-07-28 22:11:05
Message-ID: 20040728221644.A42F23FE8B@omta16.mta.everyone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi Postgressers! I really like Postgres. Thanks for all your work on it. I
just have a problem with the way it's handling my flat file's timestamp
columns.

I have a flat file with a column with dates formatted like this:

2004-04-15 18:04:26 PM

It's a bit strange, I know, but I didn't create the file. My idea of
Postgres's proper behavior would be to load this date as a military time
(and ignore the "PM"). MS SQL Server behaves in this way. Postgres, however,
won't even load the file:

bonusticket=copy bonusticket2004Q2 from
'/home/kevin/bonusticket/data3/uberfile/uberfile1.txt' null as '';
# ERROR: date/time field value out of range: "2004-04-15 18:04:26 PM"
CONTEXT: COPY bonusticket2004q2, line 17, column submit_date: "2004-04-15
18:04:26 PM"

I presume that Postgres is complaining about the fact that I have an 18 in
the hour slot of a supposedly PM time. What can I do about this? Can I
possibly specify a time format (similar to the 'YYYY-MM-DD HH24:MI:SS' I
might pass to to_timestamp) at load time?

Thanks for any help you can provide,

Kevin


From: "Chris Smith" <cdsmith(at)twu(dot)net>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Timestamp input + copy
Date: 2004-07-28 23:00:57
Message-ID: 0f6701c474f6$be5e89a0$6f00000a@KYA
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Kevin Bartz wrote:
> I have a flat file with a column with dates formatted like this:
>
> 2004-04-15 18:04:26 PM
>
> It's a bit strange, I know, but I didn't create the file. My idea of
> Postgres's proper behavior would be to load this date as a military
> time (and ignore the "PM"). MS SQL Server behaves in this way.

I couldn't disagree more that it would be correct behavior to ignore the PM
and treat it as 24-hour time. It's one of the most important features of a
database that when you give is bad data, it responds with an error message
rather than trying to guess at what you mean. Why are you using a database,
if not to ensure that you can trust your data; and how can you trust data that
comes from an ambiguous source?

I'm sorry to hear that SQL Server accepts this input without complaint. It's
caused me to lose confidence in that product.

> What can I do about
> this? Can I possibly specify a time format (similar to the
> 'YYYY-MM-DD HH24:MI:SS' I might pass to to_timestamp) at load time?

If you know that the time is in 24-hour form and want to ignore the AM or PM
specifier, then you can certainly run it by a processor written in pretty much
any programming language that will fix it. In UNIX sed, it looks like this
(all on one line):

cat data.txt | sed 's([0-9]{2,4}\-[0-9]{1,2}\-[0-9]{1,2}
[0-9]{1,2}\:[0-9]{1,2}:[0-9]{1,2}) ((AM|PM))?/\1/g' > data.txt.fixed

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation


From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: "Kevin Bartz" <kbartz(at)loyaltymatrix(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Timestamp input + copy
Date: 2004-07-28 23:15:42
Message-ID: 200407281615.42212.scrawford@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wednesday 28 July 2004 3:11 pm, Kevin Bartz wrote:
...
> I have a flat file with a column with dates formatted like this:
>
> 2004-04-15 18:04:26 PM
>
> It's a bit strange, I know, but I didn't create the file. My idea
> of Postgres's proper behavior would be to load this date as a
> military time (and ignore the "PM"). MS SQL Server behaves in this
> way. Postgres, however, won't even load the file:

Edit the file to eliminate the PM. I don't know how large the file is
but fixing the date should be pretty easy with vi, sed, perl, etc.
(choose the one that works for you).

What seems like PostgreSQL being annoying and pedantic is really
PostgreSQL protecting your a** - the importance of data integrity and
all that.

Some databases try to be "helpful" (or are just sloppy) so if you try
to insert a number bigger than that allowed by the field it just
truncates it to the largest number that will fit. I hope nobody uses
that database for financial data.

What should a database do if confronted with '2004-04-15 18:04:26 AM'?
I would much prefer the system to throw an error and let me evaluate
and fix it than silently "help" me by loading corrupt data.

Cheers and welcome to PostgreSQL,
Steve