Review: Patch FORCE_NULL option for copy COPY in CSV mode

From: Payal Singh <payal(at)omniti(dot)com>
To: barwick(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Review: Patch FORCE_NULL option for copy COPY in CSV mode
Date: 2013-10-31 19:49:06
Message-ID: CANUg7LDW0WDrDpMhVHce9EDrOpR4O2_op6ySU3k0JoSdxrmHFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The post was made before I subscribed to the mailing list, so posting my
review separately. The link to the original patch mail is
http://www.postgresql.org/message-id/CAB8KJ=jS-Um4TGwenS5wLUfJK6K4rNOm_V6GRUj+tcKekL2=GQ@mail.gmail.com

> Hi,
>
> This patch implements the following TODO item:
>
> Allow COPY in CSV mode to control whether a quoted zero-length
> string is treated as NULL
>
> Currently this is always treated as a zero-length string,
> which generates an error when loading into an integer column
>
> Re: [PATCHES] allow CSV quote in NULL
> http://archives.postgresql.org/pgsql-hackers/2007-07/msg00905.php
>
>
> http://wiki.postgresql.org/wiki/Todo#COPY
>
>
> I had a very definite use-case for this functionality recently while importing
> CSV files generated by Oracle, and was somewhat frustrated by the existence
> of a FORCE_NOT_NULL option for specific columns, but not one for
> FORCE_NULL.
>
> I'll add this to the November commitfest.
>
>
> Regards
>
> Ian Barwick
>
>
==================
Contents & Purpose
==================

This patch introduces a new 'FORCE_NULL' option which has the opposite
functionality of the already present 'FORCE_NOT_NULL' option for the COPY
command. Prior to this option there was no way to convert a zeroed-length
quoted value to a NULL value when COPY FROM is used to import data from CSV
formatted files.

==================
Submission Review
==================

The patch is in the correct context diff format. It includes changes to the
documentation as well as additional regression tests. The description has
been discussed and defined in the previous mails leading to this patch.

=====================
Functionality Review
=====================

CORRECTION NEEDED - Due to a minor error in code (details in 'Code Review'
section below), force_null option is not limited to COPY FROM, and works
even when COPY TO is used. This should instead give an error message.

The updated documentation describes the added functionality clearly.

All regression tests passed successfully.

Code compilation after including patch was successful. No warnings either.

Manually tested COPY FROM with FORCE_NULL for a number of scenarios, all
with expected outputs. No issues.

Been testing the patch for a few days, no crashes or weird behavior
observed.

=============================================
Code Formatting Review (Needs Improvement)
=============================================

Looks good, the tabs between variable declaration and accompanying comment
can be improved.

=================================
Code Review (Needs Improvement)
=================================

1. There is a " NOTE: force_not_null option are not applied to the returned
fields." before COPY FROM block. Similar note should be added for
force_null option too.

2. One of the conditions to check and give an error if force_null is true
and copy from is false is wrong, cstate->force_null should be checked
instead of cstate->force_notnull:

/* Check force_notnull */
if (!cstate->csv_mode && cstate->force_notnull != NIL)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("COPY force not null available only
in CSV mode")));
if (cstate->force_notnull != NIL && !is_from)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("COPY force not null only available using
COPY FROM")));

/* Check force_null */
if (!cstate->csv_mode && cstate->force_null != NIL)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("COPY force null available only in
CSV mode")));

==> if (cstate->force_notnull != NIL && !is_from)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("COPY force null only available using COPY
FROM")));

===============================
Suggested Changes & Conclusion
===============================

The Above mentioned error condition should be corrected. Minor comments and
tab changes are upto the author.

In all, suggested modifications aside, the patch works well and in my
opinion, would be a useful addition to the COPY command.

Payal Singh,
OmniTi Computer Consulting Inc.
Junior Database Architect

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2013-10-31 19:51:53 Re: Record comparison compiler warning
Previous Message Robert Haas 2013-10-31 19:42:33 Re: How can I build OSSP UUID support on Windows to avoid duplicate UUIDs?