Re: Ragged CSV import

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Ragged CSV import
Date: 2009-09-09 20:46:59
Message-ID: 4AA81443.9090203@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>> I have received a requirement for the ability to import ragged CSV
>> files, i.e. files that contain variable numbers of columns per row. The
>> requirement is that extra columns would be ignored and missing columns
>> filled with NULL. The client wanting this has wrestled with some
>> preprocessors to try to get what they want, but they would feel happier
>> with this built in. This isn't the first time I have received this
>> request since we implemented CSV import. People have complained on
>> numerous occasions about the strictness of the import routines w.r.t.
>> the number of columns.
>>
>
> Hmm. Accepting too few columns and filling with nulls isn't any
> different than what INSERT has always done. But ignoring extra columns
> seems like a different ballgame. Can you talk your client out of that
> one? It just seems like a bad idea.
>

No, that's critical. The application this is wanted for uploads data
that users put in spreadsheets. The users apparently expect that they
will be able to put comments on some rows off to the right of the data
they want loaded, and have it ignored.

To answer your other point made later, my intention was to make this
optional behaviour, not default behaviour. I agree that it would be too
slack for default behaviour. Yes, we have quite a few options, but
that's not surprising in dealing with a format that is at best
ill-defined and which we do not control.
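
To make the proposed semantics concrete, here is a minimal sketch in
Python of the behaviour as a preprocessing step. It is not the COPY
implementation; the function name normalize_rows and the ragged flag are
hypothetical names chosen for illustration. With ragged off (the default)
a wrong column count is an error, matching today's strictness; with it on,
extra fields are dropped and missing ones become None (standing in for
NULL).

```python
import csv
import io


def normalize_rows(text, ncols, ragged=False):
    """Yield each CSV row as a list of exactly ncols fields.

    Hypothetical helper: 'ragged' mimics the proposed opt-in behaviour.
    """
    for lineno, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if len(row) != ncols:
            if not ragged:
                # Strict mode: reject any row with the wrong column count.
                raise ValueError(
                    f"line {lineno}: expected {ncols} columns, got {len(row)}"
                )
            # Ragged mode: truncate extra fields, pad short rows with None.
            row = row[:ncols] + [None] * (ncols - len(row))
        yield row


if __name__ == "__main__":
    sample = "1,2,3,a comment off to the right\n4\n5,6,7\n"
    for fields in normalize_rows(sample, 3, ragged=True):
        print(fields)
```

Keeping the strict check as the default path reflects the point above:
the lenient handling only applies when someone explicitly asks for it.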

> As for the "numerous occasions", maybe I've not been paying attention,
> but I don't recall any ...
>
>

The requests have been made on IRC, at conferences, and in private emails.

cheers

andrew
