Re: COPY table FROM STDIN doesn't show count tag

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>
Cc: Amit Khandekar <amit(dot)khandekar(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY table FROM STDIN doesn't show count tag
Date: 2014-03-10 18:14:19
Message-ID: 14537.1394475259@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com> writes:
> On 12th December 2013, Rajeev Rastogi Wrote:
>> On 9th December, Amit Khandelkar wrote:
>>> But copystream can be different than pset.cur_cmd_source , right ?

>> As per the earlier code, condition result was always true. So pset.lineno was always incremented.
>> In the earlier code pset.cur_cmd_source was sent as parameter to function and inside the function same parameter was used with the name copystream. So on entry of this function both will be one and same.

The problem with that argument is you're assuming that the previous
behavior was correct :-(. It isn't. If you try a case like this:

$ cat int8data
123 456
123 4567890123456789
4567890123456789 123
4567890123456789 4567890123456789
4567890123456789 -4567890123456789
$ cat testcase.sql
select 1+1;

\copy int8_tbl from 'int8data'

select 1/0;

select 2/0;
$ psql -f testcase.sql regression
?column?
----------
2
(1 row)

psql:testcase.sql:11: ERROR: division by zero
psql:testcase.sql:13: ERROR: division by zero

the script line numbers shown in the error messages are *wrong*,
because handleCopyIn has incorrectly incremented pset.lineno because
it thought it was reading from the current script file. So the
override_file business is wrong, and getting rid of it with a separate
copyStream variable is a good thing.

However, there wasn't much else that I liked about the patch :-(.
It seemed bizarre to me that the copy source/sink selection logic was
partially in ProcessResult and partially in handleCopyOut/handleCopyIn.
Also you'd created a memory leak because ProcessResult now failed to
PQclear the original PGRES_COPY_OUT/IN PGresult. I did a bit of work
to clean that up, and the attached updated patch is the result.

Unfortunately, while testing it I noticed that there's a potentially
fatal backwards-compatibility problem, namely that the "COPY n" status
gets printed on stdout, which is the same place that COPY OUT data is
going. While this isn't such a big problem for interactive use,
usages like this one are pretty popular:

psql -c 'copy mytable to stdout' mydatabase | some-program

With the patch, "COPY n" gets included in the data sent to some-program,
which never happened before and is surely not what the user wants.
The same if the -c string uses \copy.

There are several things we could do about this:

1. Treat this as a non-backwards-compatible change, and document that
people have to use -q if they don't want the COPY tag in the output.
I'm not sure this is acceptable.

2. Kluge ProcessResult so that it continues to not pass back a PGresult
for the COPY TO STDOUT case, or does so only in limited circumstances
(perhaps only if isatty(stdout), for instance).

3. Modify PrintQueryStatus so that command status goes to stderr not
stdout. While this is probably how it should've been done in the first
place, this would be a far more severe compatibility break than #1.
(For one thing, there are probably scripts out there that think that any
output to stderr is an error message.) I'm afraid this one is definitely
not acceptable, though it would be by far the cleanest solution were it
not for compatibility concerns.

4. As #3, but print the command status to stderr only if it's "COPY n",
otherwise to stdout. This is a smaller compatibility break than #3,
but still a break since COPY status was formerly issued to stdout
in non TO STDOUT/FROM STDIN cases. (Note that PrintQueryStatus can't
tell whether it was COPY TO STDOUT rather than any other kind of COPY;
if we want that to factor into the behavior, we need ProcessResult to
do it.)

5. Give up on the print-the-tag aspect of the change, and just fix the
wrong-line-number issue (so we'd still introduce the copyStream variable,
but not change how PGresults are passed around).

I'm inclined to think #2 is the best answer if we can't stomach #1.
But the exact rule for when to print a COPY OUT result probably
still requires some debate. Or maybe someone has another idea?

Also, I'm thinking we should back-patch the aspects of the patch
needed to fix the wrong-line-number issue. That appears to have been
introduced in 9.2; older versions of PG get the above example right.

Comments?

regards, tom lane

Attachment Content-Type Size
psql-copy-count-tag-20140310.patch text/x-diff 11.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2014-03-10 18:19:26 Re: pg_upgrade on high number tables database issues
Previous Message Robert Haas 2014-03-10 18:07:13 Re: Retain dynamic shared memory segments for postmaster lifetime