Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-09-20 19:53:07
Message-ID: CAFj8pRDpCT89d3rKtsk4DDbZXWSH2L_s8LXBZ2CPDeWP=yyOeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

>
>
> Basic stuff:
> ------------
> - Patch applies OK. but offset difference in line numbers.
> - Compiles with errors in contrib [pg_stat_statements, sepgsql] modules
> - Regression failed; one test-case in COPY due to incomplete test-case
> attached patch. – same as reported by Heikki

fixed patch is in attachment

>
>
> Details of compilation errors and regression failure:
> ------------------------------------------------------
> PATH : postgres/src/contrib/pg_stat_statements
> gcc -g -ggdb3 -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I. -I.
> -I../../src/include -D_GNU_SOURCE -c -o pg_stat_statements.o
> pg_stat_statements.c
> pg_stat_statements.c: In function â_PG_initâ:
> pg_stat_statements.c:363: warning: assignment from incompatible pointer type
> pg_stat_statements.c: In function âpgss_ProcessUtilityâ:
> pg_stat_statements.c:823: error: too few arguments to function
> âprev_ProcessUtilityâ
> pg_stat_statements.c:826: error: too few arguments to function
> âstandard_ProcessUtilityâ
> pg_stat_statements.c:884: error: too few arguments to function
> âprev_ProcessUtilityâ
> pg_stat_statements.c:887: error: too few arguments to function
> âstandard_ProcessUtilityâ
> make: *** [pg_stat_statements.o] Error 1
>
> PATH : postgres/src/contrib/sepgsql
> gcc -g -ggdb3 -Wall -Wmissing-prototypes -Wpointer-arith
> -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -fpic -I. -I.
> -I../../src/include -D_GNU_SOURCE -c -o hooks.o hooks.c
> In file included from hooks.c:26:
> sepgsql.h:17:29: error: selinux/selinux.h: No such file or directory
> sepgsql.h:18:25: error: selinux/avc.h: No such file or directory
> In file included from hooks.c:26:
> sepgsql.h:239: warning: âstruct av_decisionâ declared inside parameter list
> sepgsql.h:239: warning: its scope is only this definition or declaration,
> which is probably not what you want
> hooks.c: In function âsepgsql_utility_commandâ:
> hooks.c:331: error: too few arguments to function ânext_ProcessUtility_hookâ
> hooks.c:334: error: too few arguments to function âstandard_ProcessUtilityâ
> hooks.c: In function â_PG_initâ:
> hooks.c:365: warning: implicit declaration of function âis_selinux_enabledâ
> hooks.c:426: warning: assignment from incompatible pointer type
> make: *** [hooks.o] Error 1
>
> REGRESSION TEST:
> ------------------
> make installcheck
> test copy ... FAILED
> ========================
> 1 of 132 tests failed.
> ========================
> ~/postgres/src/test/regress> cat regression.diffs
> *** /home/postgres/code/src/test/regress/expected/copy.out 2012-09-18
> 19:57:02.000000000 +0530
> --- /home/postgres/code/src/test/regress/results/copy.out 2012-09-18
> 19:57:18.000000000 +0530
> ***************
> *** 71,73 ****
> --- 71,80 ----
> c1,"col with , comma","col with "" quote"
> 1,a,1
> 2,b,2
> + -- copy should to return processed rows
> + do $$
> +
> + ERROR: unterminated dollar-quoted string at or near "$$
> + "
> + LINE 1: do $$
> + ^
>

fixed

>
> What it does:
> --------------
> Modification to get the number of processed rows evaluated via SPI. The
> changes are to add extra parameter in ProcessUtility to get the number of
> rows processed by COPY command.
>
>
>
> Code Review Comments:
> ---------------------
> 1. New parameter is added to ProcessUtility_hook_type function
> but the functions which get assigned to these functions like
> sepgsql_utility_command, pgss_ProcessUtility, prototype & definition is
> not modified.
>
> 2. Why to add the new parameter if completionTag hold the number of
> processed tuple information; can be extracted
>
> from it as follows:
> _SPI_current->processed = strtoul(completionTag + 7,
> NULL, 10);

this is basic question. I prefer a natural type for counter - uint64
instead text. And there are no simply way to get offset (7 in this
case)

>
> 3. Why new variable added in portal [portal->processed] this is not used in
> code ?

This value can be used anywhere instead parsing completionTag to get
returned numbers

>
>
> Testing:
> ------------
> The following test is carried out on the patch.
> 1. Normal SQL command COPY FROM / TO is tested.
> 2. SQL command COPY FROM / TO is tested from plpgsql.
>
> Test cases are attached in Testcases_Copy_Processed.txt
>

Regards

Pavel

>
>
> With Regards,
>
> Amit Kapila.
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Attachment Content-Type Size
copy-processed-rows.diff application/octet-stream 19.3 KB

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "'Pg Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Bruce Momjian'" <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-09-21 09:36:06
Message-ID: 005e01cd97dc$86a964a0$93fc2de0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:

>> Basic stuff:
>> ------------
>> - Patch applies OK. but offset difference in line numbers.
>> - Compiles with errors in contrib [pg_stat_statements, sepgsql] modules
>> - Regression failed; one test-case in COPY due to incomplete test-case
>> attached patch. – same as reported by Heikki
>
>fixed patch is in attachment

After modifications:
---------------------------
- Patch applies OK
- Compiles cleanly without any errors/warnings
- Regression tests pass.

>>
>> What it does:
>> --------------
>> Modification to get the number of processed rows evaluated via SPI. The
>> changes are to add extra parameter in ProcessUtility to get the number of
>> rows processed by COPY command.
>>
>> Code Review Comments:
>> ---------------------
>> 1. New parameter is added to ProcessUtility_hook_type function
>> but the functions which get assigned to these functions like
>> sepgsql_utility_command, pgss_ProcessUtility, prototype & definition is
>> not modified.

Functionality is not fixed correctly for hook functions, In function pgss_ProcessUtility
for bellow snippet of code processed parameter is passed NULL, as well as not initialized.
because of this when "pg_stat_statements" extention is utilized COPY command is giving garbage values.
if (prev_ProcessUtility)
prev_ProcessUtility(parsetree, queryString, params,
dest, completionTag, context, NULL);
else
standard_ProcessUtility(parsetree, queryString, params,
dest, completionTag, context, NULL);

Testcase is attached.
In this testcase table has only 1000 records but it show garbage value.
postgres=# show shared_preload_libraries ;
shared_preload_libraries
--------------------------
pg_stat_statements
(1 row)
postgres=# CREATE TABLE tbl (a int);
CREATE TABLE
postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
INSERT 0 1000
postgres=# do $$
declare r int;
begin
copy tbl to '/home/kiran/copytest.csv' csv;
get diagnostics r = row_count;
raise notice 'exported % rows', r;
truncate tbl;
copy tbl from '/home/kiran/copytest.csv' csv;
get diagnostics r = row_count;
raise notice 'imported % rows', r;
end;
$$ language plpgsql;
postgres$#
NOTICE: exported 13281616 rows
NOTICE: imported 13281616 rows
DO

>>
>> 2. Why to add the new parameter if completionTag hold the number of
>> processed tuple information; can be extracted
>>
>> from it as follows:
>> _SPI_current->processed = strtoul(completionTag + 7,
>> NULL, 10);
>
>this is basic question. I prefer a natural type for counter - uint64
>instead text. And there are no simply way to get offset (7 in this
>case)

I agree with your point, but currently in few other places we are parsing the completion tag for getting number of tuples processed. So may be in future we can change those places as well. For example

pgss_ProcessUtility
{
..

/* parse command tag to retrieve the number of affected rows. */
if (completionTag &&
sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
rows = 0;

}

_SPI_execute_plan
{
..
..
if (IsA(stmt, CreateTableAsStmt))
{
Assert(strncmp(completionTag, "SELECT ", 7) == 0);
_SPI_current->processed = strtoul(completionTag + 7,
NULL, 10);

..
}

With Regards,
Amit Kapila.


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-09-27 20:58:23
Message-ID: CAFj8pRBD8bczZfGVam_V+36YCP_iMuX4Wa=G++gcZAdDmX4DGg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I reduced this patch as just plpgsql (SPI) problem solution.

Correct generic solution needs a discussion about enhancing our libpq
interface - we can take a number of returned rows, but we should to
get number of processed rows too. A users should to parse this number
from completationTag, but this problem is not too hot and usually is
not blocker, because anybody can process completationTag usually.

So this issue is primary for PL/pgSQL, where this information is not
possible. There is precedent - CREATE AS SELECT (thanks for sample
:)), and COPY should to have same impact on ROW_COUNT like this
statement (last patch implements it).

Regards

Pavel

2012/9/21 Amit Kapila <amit(dot)kapila(at)huawei(dot)com>:
> On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:
>
>>> Basic stuff:
>>> ------------
>>> - Patch applies OK. but offset difference in line numbers.
>>> - Compiles with errors in contrib [pg_stat_statements, sepgsql] modules
>>> - Regression failed; one test-case in COPY due to incomplete test-case
>>> attached patch. – same as reported by Heikki
>>
>>fixed patch is in attachment
>
> After modifications:
> ---------------------------
> - Patch applies OK
> - Compiles cleanly without any errors/warnings
> - Regression tests pass.
>
>>>
>>> What it does:
>>> --------------
>>> Modification to get the number of processed rows evaluated via SPI. The
>>> changes are to add extra parameter in ProcessUtility to get the number of
>>> rows processed by COPY command.
>>>
>>> Code Review Comments:
>>> ---------------------
>>> 1. New parameter is added to ProcessUtility_hook_type function
>>> but the functions which get assigned to these functions like
>>> sepgsql_utility_command, pgss_ProcessUtility, prototype & definition is
>>> not modified.
>
> Functionality is not fixed correctly for hook functions, In function pgss_ProcessUtility
> for bellow snippet of code processed parameter is passed NULL, as well as not initialized.
> because of this when "pg_stat_statements" extention is utilized COPY command is giving garbage values.
> if (prev_ProcessUtility)
> prev_ProcessUtility(parsetree, queryString, params,
> dest, completionTag, context, NULL);
> else
> standard_ProcessUtility(parsetree, queryString, params,
> dest, completionTag, context, NULL);
>
> Testcase is attached.
> In this testcase table has only 1000 records but it show garbage value.
> postgres=# show shared_preload_libraries ;
> shared_preload_libraries
> --------------------------
> pg_stat_statements
> (1 row)
> postgres=# CREATE TABLE tbl (a int);
> CREATE TABLE
> postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
> INSERT 0 1000
> postgres=# do $$
> declare r int;
> begin
> copy tbl to '/home/kiran/copytest.csv' csv;
> get diagnostics r = row_count;
> raise notice 'exported % rows', r;
> truncate tbl;
> copy tbl from '/home/kiran/copytest.csv' csv;
> get diagnostics r = row_count;
> raise notice 'imported % rows', r;
> end;
> $$ language plpgsql;
> postgres$#
> NOTICE: exported 13281616 rows
> NOTICE: imported 13281616 rows
> DO
>
>>>
>>> 2. Why to add the new parameter if completionTag hold the number of
>>> processed tuple information; can be extracted
>>>
>>> from it as follows:
>>> _SPI_current->processed = strtoul(completionTag + 7,
>>> NULL, 10);
>>
>>this is basic question. I prefer a natural type for counter - uint64
>>instead text. And there are no simply way to get offset (7 in this
>>case)
>
> I agree with your point, but currently in few other places we are parsing the completion tag for getting number of tuples processed. So may be in future we can change those places as well. For example

yes, this step can be done in future - together with libpq enhancing.
Is not adequate change API (and break lot of extensions) for this
isolated problem. So I propose fix plpgsql issue, and add to ToDo -
"enhance libpq to return a number of processed rows as number" - but
this change breaking compatibility.

Pavel

>
> pgss_ProcessUtility
> {
> ..
>
> /* parse command tag to retrieve the number of affected rows. */
> if (completionTag &&
> sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
> rows = 0;
>
> }
>
> _SPI_execute_plan
> {
> ..
> ..
> if (IsA(stmt, CreateTableAsStmt))
> {
> Assert(strncmp(completionTag, "SELECT ", 7) == 0);
> _SPI_current->processed = strtoul(completionTag + 7,
> NULL, 10);
>
> ..
> }
>
>
> With Regards,
> Amit Kapila.
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Attachment Content-Type Size
copy-processed-rows-simple.patch application/octet-stream 1.9 KB

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "'Pg Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Bruce Momjian'" <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-09-28 04:47:21
Message-ID: 004e01cd9d34$59b6b9e0$0d242da0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On Friday, September 28, 2012 2:28 AM Pavel Stehule wrote:
> Hello
>
> I reduced this patch as just plpgsql (SPI) problem solution.
>
> Correct generic solution needs a discussion about enhancing our libpq
> interface - we can take a number of returned rows, but we should to get
> number of processed rows too. A users should to parse this number from
> completationTag, but this problem is not too hot and usually is not
> blocker, because anybody can process completationTag usually.
>
> So this issue is primary for PL/pgSQL, where this information is not
> possible. There is precedent - CREATE AS SELECT (thanks for sample :)),
> and COPY should to have same impact on ROW_COUNT like this statement
> (last patch implements it).

IMO now this patch is okay. I have marked it as Ready For Committer.

With Regards,
Amit Kapila.

>
>
> 2012/9/21 Amit Kapila <amit(dot)kapila(at)huawei(dot)com>:
> > On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:
> >
> >>> Basic stuff:
> >>> ------------
> >>> - Patch applies OK. but offset difference in line numbers.
> >>> - Compiles with errors in contrib [pg_stat_statements, sepgsql]
> >>> modules
> >>> - Regression failed; one test-case in COPY due to incomplete
> >>> test-case attached patch. – same as reported by Heikki
> >>
> >>fixed patch is in attachment
> >
> > After modifications:
> > ---------------------------
> > - Patch applies OK
> > - Compiles cleanly without any errors/warnings
> > - Regression tests pass.
> >
> >>>
> >>> What it does:
> >>> --------------
> >>> Modification to get the number of processed rows evaluated via SPI.
> >>> The changes are to add extra parameter in ProcessUtility to get the
> >>> number of rows processed by COPY command.
> >>>
> >>> Code Review Comments:
> >>> ---------------------
> >>> 1. New parameter is added to ProcessUtility_hook_type function
> >>> but the functions which get assigned to these functions like
> >>> sepgsql_utility_command, pgss_ProcessUtility, prototype &
> >>> definition is not modified.
> >
> > Functionality is not fixed correctly for hook functions, In function
> > pgss_ProcessUtility for bellow snippet of code processed parameter is
> passed NULL, as well as not initialized.
> > because of this when "pg_stat_statements" extention is utilized COPY
> command is giving garbage values.
> > if (prev_ProcessUtility)
> > prev_ProcessUtility(parsetree, queryString, params,
> > dest,
> completionTag, context, NULL);
> > else
> > standard_ProcessUtility(parsetree, queryString,
> params,
> > dest,
> > completionTag, context, NULL);
> >
> > Testcase is attached.
> > In this testcase table has only 1000 records but it show garbage
> value.
> > postgres=# show shared_preload_libraries ;
> > shared_preload_libraries
> > --------------------------
> > pg_stat_statements
> > (1 row)
> > postgres=# CREATE TABLE tbl (a int);
> > CREATE TABLE
> > postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
> > INSERT 0 1000
> > postgres=# do $$
> > declare r int;
> > begin
> > copy tbl to '/home/kiran/copytest.csv' csv;
> > get diagnostics r = row_count;
> > raise notice 'exported % rows', r;
> > truncate tbl;
> > copy tbl from '/home/kiran/copytest.csv' csv;
> > get diagnostics r = row_count;
> > raise notice 'imported % rows', r;
> > end;
> > $$ language plpgsql;
> > postgres$#
> > NOTICE: exported 13281616 rows
> > NOTICE: imported 13281616 rows
> > DO
> >
> >>>
> >>> 2. Why to add the new parameter if completionTag hold the number of
> >>> processed tuple information; can be extracted
> >>>
> >>> from it as follows:
> >>> _SPI_current->processed = strtoul(completionTag
> >>> + 7, NULL, 10);
> >>
> >>this is basic question. I prefer a natural type for counter - uint64
> >>instead text. And there are no simply way to get offset (7 in this
> >>case)
> >
> > I agree with your point, but currently in few other places we are
> > parsing the completion tag for getting number of tuples processed. So
> > may be in future we can change those places as well. For example
>
> yes, this step can be done in future - together with libpq enhancing.
> Is not adequate change API (and break lot of extensions) for this
> isolated problem. So I propose fix plpgsql issue, and add to ToDo -
> "enhance libpq to return a number of processed rows as number" - but
> this change breaking compatibility.
>
> Pavel
>
> >
> > pgss_ProcessUtility
> > {
> > ..
> >
> > /* parse command tag to retrieve the number of affected rows. */ if
> > (completionTag &&
> > sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
> > rows = 0;
> >
> > }
> >
> > _SPI_execute_plan
> > {
> > ..
> > ..
> > if (IsA(stmt, CreateTableAsStmt))
> > {
> > Assert(strncmp(completionTag, "SELECT ", 7) == 0);
> > _SPI_current->processed = strtoul(completionTag + 7,
> >
> > NULL, 10);
> >
> > ..
> > }
> >
> >
> > With Regards,
> > Amit Kapila.
> >
> >
> >
> > --
> > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To
> > make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-09-28 05:19:06
Message-ID: CAFj8pRADh_R8bFixo6Eivnd_Dyw=JsUL40V-56XWE2Vmb2+wQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/9/28 Amit Kapila <amit(dot)kapila(at)huawei(dot)com>:
>> On Friday, September 28, 2012 2:28 AM Pavel Stehule wrote:
>> Hello
>>
>> I reduced this patch as just plpgsql (SPI) problem solution.
>>
>> Correct generic solution needs a discussion about enhancing our libpq
>> interface - we can take a number of returned rows, but we should to get
>> number of processed rows too. A users should to parse this number from
>> completationTag, but this problem is not too hot and usually is not
>> blocker, because anybody can process completationTag usually.
>>
>> So this issue is primary for PL/pgSQL, where this information is not
>> possible. There is precedent - CREATE AS SELECT (thanks for sample :)),
>> and COPY should to have same impact on ROW_COUNT like this statement
>> (last patch implements it).
>
> IMO now this patch is okay. I have marked it as Ready For Committer.

Thank you very much for review

Regards

Pavel

>
> With Regards,
> Amit Kapila.
>
>
>>
>>
>> 2012/9/21 Amit Kapila <amit(dot)kapila(at)huawei(dot)com>:
>> > On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:
>> >
>> >>> Basic stuff:
>> >>> ------------
>> >>> - Patch applies OK. but offset difference in line numbers.
>> >>> - Compiles with errors in contrib [pg_stat_statements, sepgsql]
>> >>> modules
>> >>> - Regression failed; one test-case in COPY due to incomplete
>> >>> test-case attached patch. – same as reported by Heikki
>> >>
>> >>fixed patch is in attachment
>> >
>> > After modifications:
>> > ---------------------------
>> > - Patch applies OK
>> > - Compiles cleanly without any errors/warnings
>> > - Regression tests pass.
>> >
>> >>>
>> >>> What it does:
>> >>> --------------
>> >>> Modification to get the number of processed rows evaluated via SPI.
>> >>> The changes are to add extra parameter in ProcessUtility to get the
>> >>> number of rows processed by COPY command.
>> >>>
>> >>> Code Review Comments:
>> >>> ---------------------
>> >>> 1. New parameter is added to ProcessUtility_hook_type function
>> >>> but the functions which get assigned to these functions like
>> >>> sepgsql_utility_command, pgss_ProcessUtility, prototype &
>> >>> definition is not modified.
>> >
>> > Functionality is not fixed correctly for hook functions, In function
>> > pgss_ProcessUtility for bellow snippet of code processed parameter is
>> passed NULL, as well as not initialized.
>> > because of this when "pg_stat_statements" extention is utilized COPY
>> command is giving garbage values.
>> > if (prev_ProcessUtility)
>> > prev_ProcessUtility(parsetree, queryString, params,
>> > dest,
>> completionTag, context, NULL);
>> > else
>> > standard_ProcessUtility(parsetree, queryString,
>> params,
>> > dest,
>> > completionTag, context, NULL);
>> >
>> > Testcase is attached.
>> > In this testcase table has only 1000 records but it show garbage
>> value.
>> > postgres=# show shared_preload_libraries ;
>> > shared_preload_libraries
>> > --------------------------
>> > pg_stat_statements
>> > (1 row)
>> > postgres=# CREATE TABLE tbl (a int);
>> > CREATE TABLE
>> > postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
>> > INSERT 0 1000
>> > postgres=# do $$
>> > declare r int;
>> > begin
>> > copy tbl to '/home/kiran/copytest.csv' csv;
>> > get diagnostics r = row_count;
>> > raise notice 'exported % rows', r;
>> > truncate tbl;
>> > copy tbl from '/home/kiran/copytest.csv' csv;
>> > get diagnostics r = row_count;
>> > raise notice 'imported % rows', r;
>> > end;
>> > $$ language plpgsql;
>> > postgres$#
>> > NOTICE: exported 13281616 rows
>> > NOTICE: imported 13281616 rows
>> > DO
>> >
>> >>>
>> >>> 2. Why to add the new parameter if completionTag hold the number of
>> >>> processed tuple information; can be extracted
>> >>>
>> >>> from it as follows:
>> >>> _SPI_current->processed = strtoul(completionTag
>> >>> + 7, NULL, 10);
>> >>
>> >>this is basic question. I prefer a natural type for counter - uint64
>> >>instead text. And there are no simply way to get offset (7 in this
>> >>case)
>> >
>> > I agree with your point, but currently in few other places we are
>> > parsing the completion tag for getting number of tuples processed. So
>> > may be in future we can change those places as well. For example
>>
>> yes, this step can be done in future - together with libpq enhancing.
>> Is not adequate change API (and break lot of extensions) for this
>> isolated problem. So I propose fix plpgsql issue, and add to ToDo -
>> "enhance libpq to return a number of processed rows as number" - but
>> this change breaking compatibility.
>>
>> Pavel
>>
>> >
>> > pgss_ProcessUtility
>> > {
>> > ..
>> >
>> > /* parse command tag to retrieve the number of affected rows. */ if
>> > (completionTag &&
>> > sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
>> > rows = 0;
>> >
>> > }
>> >
>> > _SPI_execute_plan
>> > {
>> > ..
>> > ..
>> > if (IsA(stmt, CreateTableAsStmt))
>> > {
>> > Assert(strncmp(completionTag, "SELECT ", 7) == 0);
>> > _SPI_current->processed = strtoul(completionTag + 7,
>> >
>> > NULL, 10);
>> >
>> > ..
>> > }
>> >
>> >
>> > With Regards,
>> > Amit Kapila.
>> >
>> >
>> >
>> > --
>> > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To
>> > make changes to your subscription:
>> > http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-10-01 07:29:31
Message-ID: 5069465B.5010302@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27.09.2012 23:58, Pavel Stehule wrote:
> Hello
>
> I reduced this patch as just plpgsql (SPI) problem solution.
>
> Correct generic solution needs a discussion about enhancing our libpq
> interface - we can take a number of returned rows, but we should to
> get number of processed rows too. A users should to parse this number
> from completationTag, but this problem is not too hot and usually is
> not blocker, because anybody can process completationTag usually.
>
> So this issue is primary for PL/pgSQL, where this information is not
> possible. There is precedent - CREATE AS SELECT (thanks for sample
> :)), and COPY should to have same impact on ROW_COUNT like this
> statement (last patch implements it).

The comment where CREATE AS SELECT does this says:

> /*
> * CREATE TABLE AS is a messy special case for historical
> * reasons. We must set _SPI_current->processed even though
> * the tuples weren't returned to the caller, and we must
> * return a special result code if the statement was spelled
> * SELECT INTO.
> */

That doesn't sound like a very good precedent that we should copy to
COPY. I don't have a problem with this approach in general, though. But
if we're going to make it normal behavior, rather than an ugly
historical kluge, that utility statements can return a row count without
returning the tuples, we should document that. The above comment ought
to be changed, and the manual section about SPI_execute needs to mention
the possibility that SPI_processed != 0, but SPI_tuptable == NULL.

Regarding the patch itself:

> + else if (IsA(stmt, CopyStmt))
> + {
> + /*
> + * usually utility statements doesn't return a number
> + * of processed rows, but COPY does it.
> + */
> + Assert(strncmp(completionTag, "COPY ", 5) == 0);
> + _SPI_current->processed = strtoul(completionTag + 5,
> + NULL, 10);
> + }
> else
> res = SPI_OK_UTILITY;

'res' is not set for a CopyStmt, and there's an extra space in "COPY "
in the Assert.

- Heikki


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-10-01 07:33:04
Message-ID: CAFj8pRB5vqG+RcKByct_8EiYYfxF5_u09K5THRz+iNztJohi-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/10/1 Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>:
> On 27.09.2012 23:58, Pavel Stehule wrote:
>>
>> Hello
>>
>> I reduced this patch as just plpgsql (SPI) problem solution.
>>
>> Correct generic solution needs a discussion about enhancing our libpq
>> interface - we can take a number of returned rows, but we should to
>> get number of processed rows too. A users should to parse this number
>> from completationTag, but this problem is not too hot and usually is
>> not blocker, because anybody can process completationTag usually.
>>
>> So this issue is primary for PL/pgSQL, where this information is not
>> possible. There is precedent - CREATE AS SELECT (thanks for sample
>> :)), and COPY should to have same impact on ROW_COUNT like this
>> statement (last patch implements it).
>
>
> The comment where CREATE AS SELECT does this says:
>
>> /*
>> * CREATE TABLE AS is a messy special case for historical
>> * reasons. We must set _SPI_current->processed even though
>> * the tuples weren't returned to the caller, and we must
>> * return a special result code if the statement was spelled
>> * SELECT INTO.
>> */
>
>
> That doesn't sound like a very good precedent that we should copy to COPY. I
> don't have a problem with this approach in general, though. But if we're
> going to make it normal behavior, rather than an ugly historical kluge, that
> utility statements can return a row count without returning the tuples, we
> should document that. The above comment ought to be changed, and the manual
> section about SPI_execute needs to mention the possibility that
> SPI_processed != 0, but SPI_tuptable == NULL.
>
> Regarding the patch itself:
>
>> + else if (IsA(stmt, CopyStmt))
>> + {
>> + /*
>> + * usually utility statements
>> doesn't return a number
>> + * of processed rows, but COPY
>> does it.
>> + */
>> + Assert(strncmp(completionTag,
>> "COPY ", 5) == 0);
>> + _SPI_current->processed =
>> strtoul(completionTag + 5,
>> +
>> NULL, 10);
>> + }
>> else
>> res = SPI_OK_UTILITY;
>
>
> 'res' is not set for a CopyStmt, and there's an extra space in "COPY " in
> the Assert.

res is issue,

Extra space is correct, we would to break, when completetionTag is changed.

Pavel

>
> - Heikki


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-10-02 11:09:07
Message-ID: CAFj8pRCcC-YqduDKFUkhCUC53EODNw-hvFRyPk3pD03FHizM=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/10/1 Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>:
> On 27.09.2012 23:58, Pavel Stehule wrote:
>>
>> Hello
>>
>> I reduced this patch as just plpgsql (SPI) problem solution.
>>
>> Correct generic solution needs a discussion about enhancing our libpq
>> interface - we can take a number of returned rows, but we should to
>> get number of processed rows too. A users should to parse this number
>> from completationTag, but this problem is not too hot and usually is
>> not blocker, because anybody can process completationTag usually.
>>
>> So this issue is primary for PL/pgSQL, where this information is not
>> possible. There is precedent - CREATE AS SELECT (thanks for sample
>> :)), and COPY should to have same impact on ROW_COUNT like this
>> statement (last patch implements it).
>
>
> The comment where CREATE AS SELECT does this says:
>
>> /*
>> * CREATE TABLE AS is a messy special case for historical
>> * reasons. We must set _SPI_current->processed even though
>> * the tuples weren't returned to the caller, and we must
>> * return a special result code if the statement was spelled
>> * SELECT INTO.
>> */
>
>
> That doesn't sound like a very good precedent that we should copy to COPY. I
> don't have a problem with this approach in general, though. But if we're
> going to make it normal behavior, rather than an ugly historical kluge, that
> utility statements can return a row count without returning the tuples, we
> should document that. The above comment ought to be changed, and the manual
> section about SPI_execute needs to mention the possibility that
> SPI_processed != 0, but SPI_tuptable == NULL.
>
> Regarding the patch itself:
>
>> + else if (IsA(stmt, CopyStmt))
>> + {
>> + /*
>> + * usually utility statements
>> doesn't return a number
>> + * of processed rows, but COPY
>> does it.
>> + */
>> + Assert(strncmp(completionTag,
>> "COPY ", 5) == 0);
>> + _SPI_current->processed =
>> strtoul(completionTag + 5,
>> +
>> NULL, 10);
>> + }
>> else
>> res = SPI_OK_UTILITY;
>
>
> 'res' is not set for a CopyStmt, and there's an extra space in "COPY " in
> the Assert.
>

fixed patch

Regards

Pavel Stehule

> - Heikki

Attachment Content-Type Size
copy-processed-rows-simple2.patch application/octet-stream 3.7 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Bruce Momjian <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-10-03 11:39:14
Message-ID: 506C23E2.8000703@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02.10.2012 14:09, Pavel Stehule wrote:
> fixed patch

Thanks, committed with some minor editorializing.

- Heikki