Re: Patch: FORCE_NULL option for copy COPY in CSV mode

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch: FORCE_NULL option for copy COPY in CSV mode
Date: 2013-10-09 19:25:42
Message-ID: CA+Tgmob8-m0DdNBxJhDxxw+4SF+URbu=sXfoHKqnSBp7xBBVpg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 9, 2013 at 9:52 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> I really don't know what you're saying here.
>
> Here is the situation we have today (assuming the default null marker of
> empty-string):
>
> default: empty-string -> null, quoted-empty-string ->
> emptystring
> with force_not_null: empty-string -> emptystring, quoted-empty-string ->
> emptystring
>
> and the proposal would add to that:
>
> with force-null: empty-string -> null, quoted-empty-string -> null
>
> So it appears to be quite on all fours with the way force_not_null works
> now, it just does the reverse.

That principle seems sound as far as it goes, but somehow I feel we've
dug ourselves into a hole here naming-wise. Apparently an unquoted
empty string is normally null, but you can use force-not-null to make
it an empty string instead. And a quoted empty string is normally an
empty string, but you can use force-null to make it a null instead.
Therefore, a user who wants the opposite of the default behavior -
namely, unquoted empty strings as empty strings and quoted empty
strings as nulls - should specify both FORCE NULL and FORCE NOT NULL.
An unsuspecting user confronted with a column marked with both of
those options at the same time might be a bit perplexed.

It seems to me that it might be better to have an option called
empty_input, and then you could have four values with names we can
bikeshed about - e.g. auto (the current default behavior), null
(proposed force null), empty_string (current force not null), reversed
(force null + force not null).

(In the interest of full disclosure, there is an EDB product that as
of recently offers something very much like this, with slightly
different naming and offering only the first three of those four
options, motivated by a customer complaint about the default behavior,
which was the same as COPY's default behavior. I don't know that the
solution we adopted there has any bearing on what ought to be done
here, and I can certainly live with it if people prefer to have FORCE
NULL and FORCE NOT NULL with not-quite-opposite meanings, but I do
think it's sort of hilariously awful, right up there with constraint
exclusion vs. exclusion constraints.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kohei KaiGai 2013-10-09 19:31:57 Re: Triggers on foreign tables
Previous Message Andres Freund 2013-10-09 19:07:55 Re: Auto-tuning work_mem and maintenance_work_mem