Re: COPY CSV header line feature

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: pgman(at)candle(dot)pha(dot)pa(dot)us
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, pgsql-patches(at)postgresql(dot)org
Subject: Re: COPY CSV header line feature
Date: 2005-05-07 02:22:47
Message-ID: 200505070222.j472Mll26765@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches


Applied.

---------------------------------------------------------------------------

pgman wrote:
>
> Here is an updated version of this patch, with documentation changes.
>
> I have already updated the gram.y comment you suggested.
>
> ---------------------------------------------------------------------------
>
> Andrew Dunstan wrote:
> >
> > ammended patch attached. sorry for the oversight. I agree with Tom's
> > remark - it's far too easy to miss this.
> >
> > cheers
> >
> > andrew
> >
> > Alvaro Herrera wrote:
> >
> > >On Tue, Mar 15, 2005 at 08:55:36PM -0600, Andrew Dunstan wrote:
> > >
> > >
> > >>Alvaro Herrera said:
> > >>
> > >>
> > >>>On Sun, Mar 13, 2005 at 06:32:20PM -0500, Andrew Dunstan wrote:
> > >>>
> > >>>
> > >>>
> > >>>>The attached patch implements the previously discussed header line
> > >>>>feature for CSV mode COPY. It is triggered by the keyword HEADER
> > >>>>(blame Bruce - he chose it ;-) ).
> > >>>>
> > >>>>
> > >>>I think you should add the new reserved keyword to the
> > >>>unreserved_keywords list or some other.
> > >>>
> > >>>
> > >>Please be more specific. I'll be happy to add in anything I've missed.
> > >>
> > >>
> > >
> > >The Postgres grammar classifies keywords in one of several lists, in
> > >order to make them available as names to users (column names, function
> > >names, etc). So each time you create a new keyword and add it to the
> > >keywords.c list, you have to add it to one of the lists on gram.y too.
> > >See gram.y line 7669 ff.
> > >
> > >I'd add a comment on this on gram.y:
> > >
> > >Index: gram.y
> > >===================================================================
> > >RCS file: /home/alvherre/cvs/pgsql/src/backend/parser/gram.y,v
> > >retrieving revision 2.484
> > >diff -c -w -b -B -c -r2.484 gram.y
> > >*** gram.y 14 Mar 2005 00:19:36 -0000 2.484
> > >--- gram.y 16 Mar 2005 03:12:48 -0000
> > >***************
> > >*** 327,333 ****
> > > /*
> > > * If you make any token changes, update the keyword table in
> > > * parser/keywords.c and add new keywords to the appropriate one of
> > >! * the reserved-or-not-so-reserved keyword lists, below.
> > > */
> > >
> > > /* ordinary key words in alphabetical order */
> > >--- 327,334 ----
> > > /*
> > > * If you make any token changes, update the keyword table in
> > > * parser/keywords.c and add new keywords to the appropriate one of
> > >! * the reserved-or-not-so-reserved keyword lists, below; search this
> > >! * file for "Name classification hierarchy."
> > > */
> > >
> > > /* ordinary key words in alphabetical order */
> > >
> > >
> > >
>
>
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073

> Index: doc/src/sgml/ref/copy.sgml
> ===================================================================
> RCS file: /cvsroot/pgsql/doc/src/sgml/ref/copy.sgml,v
> retrieving revision 1.63
> diff -c -c -r1.63 copy.sgml
> *** doc/src/sgml/ref/copy.sgml 4 Jan 2005 00:39:53 -0000 1.63
> --- doc/src/sgml/ref/copy.sgml 6 May 2005 03:36:30 -0000
> ***************
> *** 24,34 ****
> COPY <replaceable class="parameter">tablename</replaceable> [ ( <replaceable class="parameter">column</replaceable> [, ...] ) ]
> FROM { '<replaceable class="parameter">filename</replaceable>' | STDIN }
> [ [ WITH ]
> ! [ BINARY ]
> [ OIDS ]
> [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
> [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
> ! [ CSV [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
> [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
> [ FORCE NOT NULL <replaceable class="parameter">column</replaceable> [, ...] ]
>
> --- 24,35 ----
> COPY <replaceable class="parameter">tablename</replaceable> [ ( <replaceable class="parameter">column</replaceable> [, ...] ) ]
> FROM { '<replaceable class="parameter">filename</replaceable>' | STDIN }
> [ [ WITH ]
> ! [ BINARY ]
> [ OIDS ]
> [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
> [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
> ! [ CSV [ HEADER ]
> ! [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
> [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
> [ FORCE NOT NULL <replaceable class="parameter">column</replaceable> [, ...] ]
>
> ***************
> *** 36,45 ****
> TO { '<replaceable class="parameter">filename</replaceable>' | STDOUT }
> [ [ WITH ]
> [ BINARY ]
> [ OIDS ]
> [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
> [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
> ! [ CSV [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
> [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
> [ FORCE QUOTE <replaceable class="parameter">column</replaceable> [, ...] ]
> </synopsis>
> --- 37,48 ----
> TO { '<replaceable class="parameter">filename</replaceable>' | STDOUT }
> [ [ WITH ]
> [ BINARY ]
> + [ HEADER ]
> [ OIDS ]
> [ DELIMITER [ AS ] '<replaceable class="parameter">delimiter</replaceable>' ]
> [ NULL [ AS ] '<replaceable class="parameter">null string</replaceable>' ]
> ! [ CSV [ HEADER ]
> ! [ QUOTE [ AS ] '<replaceable class="parameter">quote</replaceable>' ]
> [ ESCAPE [ AS ] '<replaceable class="parameter">escape</replaceable>' ]
> [ FORCE QUOTE <replaceable class="parameter">column</replaceable> [, ...] ]
> </synopsis>
> ***************
> *** 192,197 ****
> --- 195,211 ----
> </varlistentry>
>
> <varlistentry>
> + <term><literal>HEADER</literal></term>
> + <listitem>
> + <para>
> + Specifies the file contains a header line with the names of each
> + column in the file. On output, the first line contains the column
> + names from the table, and on input, the first line is ignored.
> + </para>
> + </listitem>
> + </varlistentry>
> +
> + <varlistentry>
> <term><replaceable class="parameter">quote</replaceable></term>
> <listitem>
> <para>
> Index: src/backend/commands/copy.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/commands/copy.c,v
> retrieving revision 1.242
> diff -c -c -r1.242 copy.c
> *** src/backend/commands/copy.c 6 May 2005 02:56:42 -0000 1.242
> --- src/backend/commands/copy.c 6 May 2005 03:36:31 -0000
> ***************
> *** 130,142 ****
> /* non-export function prototypes */
> static void DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote,
> ! char *escape, List *force_quote_atts, bool fe_copy);
> static void CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
> ! List *force_quote_atts);
> static void CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
> ! List *force_notnull_atts);
> static bool CopyReadLine(char * quote, char * escape);
> static char *CopyReadAttribute(const char *delim, const char *null_print,
> CopyReadResult *result, bool *isnull);
> --- 130,142 ----
> /* non-export function prototypes */
> static void DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote,
> ! char *escape, List *force_quote_atts, bool header_line, bool fe_copy);
> static void CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
> ! List *force_quote_atts, bool header_line);
> static void CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote, char *escape,
> ! List *force_notnull_atts, bool header_line);
> static bool CopyReadLine(char * quote, char * escape);
> static char *CopyReadAttribute(const char *delim, const char *null_print,
> CopyReadResult *result, bool *isnull);
> ***************
> *** 694,699 ****
> --- 694,700 ----
> bool binary = false;
> bool oids = false;
> bool csv_mode = false;
> + bool header_line = false;
> char *delim = NULL;
> char *quote = NULL;
> char *escape = NULL;
> ***************
> *** 751,756 ****
> --- 752,765 ----
> errmsg("conflicting or redundant options")));
> csv_mode = intVal(defel->arg);
> }
> + else if (strcmp(defel->defname, "header") == 0)
> + {
> + if (header_line)
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("conflicting or redundant options")));
> + header_line = intVal(defel->arg);
> + }
> else if (strcmp(defel->defname, "quote") == 0)
> {
> if (quote)
> ***************
> *** 824,829 ****
> --- 833,844 ----
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> errmsg("COPY delimiter must be a single character")));
>
> + /* Check header */
> + if (!csv_mode && header_line)
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("COPY HEADER available only in CSV mode")));
> +
> /* Check quote */
> if (!csv_mode && quote != NULL)
> ereport(ERROR,
> ***************
> *** 1014,1020 ****
> }
> }
> CopyFrom(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
> ! quote, escape, force_notnull_atts);
> }
> else
> { /* copy from database to file */
> --- 1029,1035 ----
> }
> }
> CopyFrom(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
> ! quote, escape, force_notnull_atts, header_line);
> }
> else
> { /* copy from database to file */
> ***************
> *** 1078,1084 ****
> }
>
> DoCopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
> ! quote, escape, force_quote_atts, fe_copy);
> }
>
> if (!pipe)
> --- 1093,1099 ----
> }
>
> DoCopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
> ! quote, escape, force_quote_atts, header_line, fe_copy);
> }
>
> if (!pipe)
> ***************
> *** 1110,1116 ****
> static void
> DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote,
> ! char *escape, List *force_quote_atts, bool fe_copy)
> {
> PG_TRY();
> {
> --- 1125,1131 ----
> static void
> DoCopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote,
> ! char *escape, List *force_quote_atts, bool header_line, bool fe_copy)
> {
> PG_TRY();
> {
> ***************
> *** 1118,1124 ****
> SendCopyBegin(binary, list_length(attnumlist));
>
> CopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
> ! quote, escape, force_quote_atts);
>
> if (fe_copy)
> SendCopyEnd(binary);
> --- 1133,1139 ----
> SendCopyBegin(binary, list_length(attnumlist));
>
> CopyTo(rel, attnumlist, binary, oids, delim, null_print, csv_mode,
> ! quote, escape, force_quote_atts, header_line);
>
> if (fe_copy)
> SendCopyEnd(binary);
> ***************
> *** 1142,1148 ****
> static void
> CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote,
> ! char *escape, List *force_quote_atts)
> {
> HeapTuple tuple;
> TupleDesc tupDesc;
> --- 1157,1163 ----
> static void
> CopyTo(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote,
> ! char *escape, List *force_quote_atts, bool header_line)
> {
> HeapTuple tuple;
> TupleDesc tupDesc;
> ***************
> *** 1225,1230 ****
> --- 1240,1269 ----
> null_print = (char *)
> pg_server_to_client((unsigned char *) null_print,
> strlen(null_print));
> +
> + /* if a header has been requested send the line */
> + if (header_line)
> + {
> + bool hdr_delim = false;
> + char *colname;
> +
> + foreach(cur, attnumlist)
> + {
> + int attnum = lfirst_int(cur);
> +
> + if (hdr_delim)
> + CopySendChar(delim[0]);
> + hdr_delim = true;
> +
> + colname = NameStr(attr[attnum - 1]->attname);
> +
> + CopyAttributeOutCSV(colname, delim, quote, escape,
> + strcmp(colname, null_print) == 0);
> + }
> +
> + CopySendEndOfRow(binary);
> +
> + }
> }
>
> scandesc = heap_beginscan(rel, ActiveSnapshot, 0, NULL);
> ***************
> *** 1426,1432 ****
> static void
> CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote,
> ! char *escape, List *force_notnull_atts)
> {
> HeapTuple tuple;
> TupleDesc tupDesc;
> --- 1465,1471 ----
> static void
> CopyFrom(Relation rel, List *attnumlist, bool binary, bool oids,
> char *delim, char *null_print, bool csv_mode, char *quote,
> ! char *escape, List *force_notnull_atts, bool header_line)
> {
> HeapTuple tuple;
> TupleDesc tupDesc;
> ***************
> *** 1652,1657 ****
> --- 1691,1703 ----
> errcontext.previous = error_context_stack;
> error_context_stack = &errcontext;
>
> + /* on input just throw the header line away */
> + if (header_line)
> + {
> + copy_lineno++;
> + done = CopyReadLine(quote, escape) ;
> + }
> +
> while (!done)
> {
> bool skip_tuple;
> Index: src/backend/parser/gram.y
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/parser/gram.y,v
> retrieving revision 2.489
> diff -c -c -r2.489 gram.y
> *** src/backend/parser/gram.y 28 Apr 2005 21:47:14 -0000 2.489
> --- src/backend/parser/gram.y 6 May 2005 03:36:36 -0000
> ***************
> *** 361,367 ****
>
> GLOBAL GRANT GROUP_P
>
> ! HANDLER HAVING HOLD HOUR_P
>
> ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P INCLUDING INCREMENT
> INDEX INHERITS INITIALLY INNER_P INOUT INPUT_P
> --- 361,367 ----
>
> GLOBAL GRANT GROUP_P
>
> ! HANDLER HAVING HEADER HOLD HOUR_P
>
> ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P INCLUDING INCREMENT
> INDEX INHERITS INITIALLY INNER_P INOUT INPUT_P
> ***************
> *** 1443,1448 ****
> --- 1443,1452 ----
> {
> $$ = makeDefElem("csv", (Node *)makeInteger(TRUE));
> }
> + | HEADER
> + {
> + $$ = makeDefElem("header", (Node *)makeInteger(TRUE));
> + }
> | QUOTE opt_as Sconst
> {
> $$ = makeDefElem("quote", (Node *)makeString($3));
> ***************
> *** 7786,7791 ****
> --- 7790,7796 ----
> | FUNCTION
> | GLOBAL
> | HANDLER
> + | HEADER
> | HOLD
> | HOUR_P
> | IMMEDIATE
> Index: src/backend/parser/keywords.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/backend/parser/keywords.c,v
> retrieving revision 1.154
> diff -c -c -r1.154 keywords.c
> *** src/backend/parser/keywords.c 31 Dec 2004 22:00:27 -0000 1.154
> --- src/backend/parser/keywords.c 6 May 2005 03:36:36 -0000
> ***************
> *** 148,153 ****
> --- 148,154 ----
> {"group", GROUP_P},
> {"handler", HANDLER},
> {"having", HAVING},
> + {"header", HEADER},
> {"hold", HOLD},
> {"hour", HOUR_P},
> {"ilike", ILIKE},
> Index: src/bin/psql/copy.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/bin/psql/copy.c,v
> retrieving revision 1.56
> diff -c -c -r1.56 copy.c
> *** src/bin/psql/copy.c 22 Feb 2005 04:40:54 -0000 1.56
> --- src/bin/psql/copy.c 6 May 2005 03:36:39 -0000
> ***************
> *** 66,71 ****
> --- 66,72 ----
> bool binary;
> bool oids;
> bool csv_mode;
> + bool header;
> char *delim;
> char *null;
> char *quote;
> ***************
> *** 289,294 ****
> --- 290,297 ----
> result->oids = true;
> else if (pg_strcasecmp(token, "csv") == 0)
> result->csv_mode = true;
> + else if (pg_strcasecmp(token, "header") == 0)
> + result->header = true;
> else if (pg_strcasecmp(token, "delimiter") == 0)
> {
> token = strtokx(NULL, whitespace, NULL, "'",
> ***************
> *** 481,486 ****
> --- 484,492 ----
> if (options->csv_mode)
> appendPQExpBuffer(&query, " CSV");
>
> + if (options->header)
> + appendPQExpBuffer(&query, " HEADER");
> +
> if (options->quote)
> {
> if (options->quote[0] == '\'')
> Index: src/bin/psql/tab-complete.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/bin/psql/tab-complete.c,v
> retrieving revision 1.126
> diff -c -c -r1.126 tab-complete.c
> *** src/bin/psql/tab-complete.c 4 May 2005 14:25:24 -0000 1.126
> --- src/bin/psql/tab-complete.c 6 May 2005 03:36:40 -0000
> ***************
> *** 1040,1046 ****
> pg_strcasecmp(prev3_wd, "TO") == 0))
> {
> static const char *const list_CSV[] =
> ! {"QUOTE", "ESCAPE", "FORCE QUOTE", NULL};
>
> COMPLETE_WITH_LIST(list_CSV);
> }
> --- 1040,1046 ----
> pg_strcasecmp(prev3_wd, "TO") == 0))
> {
> static const char *const list_CSV[] =
> ! {"HEADER", "QUOTE", "ESCAPE", "FORCE QUOTE", NULL};
>
> COMPLETE_WITH_LIST(list_CSV);
> }

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

Browse pgsql-patches by date

  From Date Subject
Next Message Hideyuki Kawashima 2005-05-07 04:56:27 Please define PQ_BUFFER_SIZE in interfaces/libpq/fe-misc.c
Previous Message Bruce Momjian 2005-05-07 02:10:12 Re: [HACKERS] read-only database