Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "'Pg Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Bruce Momjian'" <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-09-21 09:36:06
Message-ID: 005e01cd97dc$86a964a0$93fc2de0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:

>> Basic stuff:
>> ------------
>> - Patch applies OK. but offset difference in line numbers.
>> - Compiles with errors in contrib [pg_stat_statements, sepgsql] modules
>> - Regression failed; one test-case in COPY due to incomplete test-case
>> attached patch. – same as reported by Heikki
>
>fixed patch is in attachment

After modifications:
---------------------------
- Patch applies OK
- Compiles cleanly without any errors/warnings
- Regression tests pass.

>>
>> What it does:
>> --------------
>> Modification to get the number of processed rows evaluated via SPI. The
>> changes are to add extra parameter in ProcessUtility to get the number of
>> rows processed by COPY command.
>>
>> Code Review Comments:
>> ---------------------
>> 1. New parameter is added to ProcessUtility_hook_type function
>> but the functions which get assigned to these functions like
>> sepgsql_utility_command, pgss_ProcessUtility, prototype & definition is
>> not modified.

Functionality is not fixed correctly for hook functions, In function pgss_ProcessUtility
for bellow snippet of code processed parameter is passed NULL, as well as not initialized.
because of this when "pg_stat_statements" extention is utilized COPY command is giving garbage values.
if (prev_ProcessUtility)
prev_ProcessUtility(parsetree, queryString, params,
dest, completionTag, context, NULL);
else
standard_ProcessUtility(parsetree, queryString, params,
dest, completionTag, context, NULL);

Testcase is attached.
In this testcase table has only 1000 records but it show garbage value.
postgres=# show shared_preload_libraries ;
shared_preload_libraries
--------------------------
pg_stat_statements
(1 row)
postgres=# CREATE TABLE tbl (a int);
CREATE TABLE
postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
INSERT 0 1000
postgres=# do $$
declare r int;
begin
copy tbl to '/home/kiran/copytest.csv' csv;
get diagnostics r = row_count;
raise notice 'exported % rows', r;
truncate tbl;
copy tbl from '/home/kiran/copytest.csv' csv;
get diagnostics r = row_count;
raise notice 'imported % rows', r;
end;
$$ language plpgsql;
postgres$#
NOTICE: exported 13281616 rows
NOTICE: imported 13281616 rows
DO

>>
>> 2. Why to add the new parameter if completionTag hold the number of
>> processed tuple information; can be extracted
>>
>> from it as follows:
>> _SPI_current->processed = strtoul(completionTag + 7,
>> NULL, 10);
>
>this is basic question. I prefer a natural type for counter - uint64
>instead text. And there are no simply way to get offset (7 in this
>case)

I agree with your point, but currently in few other places we are parsing the completion tag for getting number of tuples processed. So may be in future we can change those places as well. For example

pgss_ProcessUtility
{
..

/* parse command tag to retrieve the number of affected rows. */
if (completionTag &&
sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
rows = 0;

}

_SPI_execute_plan
{
..
..
if (IsA(stmt, CreateTableAsStmt))
{
Assert(strncmp(completionTag, "SELECT ", 7) == 0);
_SPI_current->processed = strtoul(completionTag + 7,
NULL, 10);

..
}

With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2012-09-21 09:39:28 Re: 64-bit API for large object
Previous Message Kohei KaiGai 2012-09-21 09:34:12 Re: 64-bit API for large object