Re: final patch - plpgsql: for-in-array

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: final patch - plpgsql: for-in-array
Date: 2010-09-30 12:46:02
Message-ID: AANLkTinqzqUCy++vpkN27=8dr2ADuoz0L9nSRozADt6J@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

this patch implement a new iteration construct - iteration over an
array. The sense of this new iteration is:
* a simple and cleaner syntax
* a faster execution - this bring down a number of detoast operations

create or replace function subscripts(anyarray, int)
returns int[] as $$
select array(select generate_subscripts($1,$2));
$$ language sql;

create or replace function fora_test()
returns int as $$
declare x int; s int = 0;
a int[] := array[1,2,3,4,5,6,7,8,9,10];
begin
for x in array subscripts(a, 1)
loop
s := s + a[x];
end loop;
return s;
end;
$$ language plpgsql;

create or replace function fora_test()
returns int as $$
declare x int; s int = 0;
begin
for x in array array[1,2,3,4,5,6,7,8,9,10]
loop
s := s + x;
end loop;
return s;
end;
$$ language plpgsql;

create or replace function fora_test()
returns int as $$
declare x int; y int;
a fora_point[] := array[(1,2),(3,4),(5,6)];
begin
for x, y in array a
loop
raise notice 'point=%,%', x, y;
end loop;
return 0;
end;
$$ language plpgsql;

Regards

Pavel Stehule

Attachment Content-Type Size
for-in-array application/octet-stream 19.8 KB

From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 00:08:58
Message-ID: AANLkTikWAJ0mie2sHSxLBYkx7u21=X1cj9wb1zXe6zyr@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 30, 2010 at 7:46 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> Hello
>
> this patch implement a new iteration construct - iteration over an
> array. The sense of this new iteration is:
>  * a simple and cleaner syntax

i will start the review of this one... but before that sorry for
suggesting this a bit later but about using UNNEST as part of the
sintax?

FOR var IN UNNEST array_expr LOOP
END LOOP

i like this because:
1) is cleaner when array_expr is ARRAY[1,2,3]
2) is not legal now to use the unnest() function without a SELECT in
the context of a FOR loop (or am i missing something?)
3) the unnest() function does the same so seems intuitive what a
FOR-IN-UNNEST do

what i don't know if is this syntax could co-exist with the unnest() function?

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 05:28:34
Message-ID: AANLkTimzQsE7LJ6jc8Bi0bXvU9PynU=+zOwBv2X=vRyV@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Thu, Sep 30, 2010 at 7:46 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> Hello
>>
>> this patch implement a new iteration construct - iteration over an
>> array. The sense of this new iteration is:
>>  * a simple and cleaner syntax
>
> i will start the review of this one... but before that sorry for
> suggesting this a bit later but about using UNNEST as part of the
> sintax?

Does for-in-array do what unnset does? unnest() flattens the whole
array into scalars irregardless of dimensions:
select unnest(array[array[1,2],array[3,4]]);
unnest
--------
1
2
3
4

If yes, then +1 (unless there is some other problem) otherwise -1.

merlin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 05:47:05
Message-ID: 20429.1290059225@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> i will start the review of this one... but before that sorry for
>> suggesting this a bit later but about using UNNEST as part of the
>> sintax?

> Does for-in-array do what unnset does?

Yes, which begs the question of why bother at all. AFAICS this patch
simply allows you to replace

for x in select unnest(array_value) loop

with

for x in unnest array_value loop

(plus or minus a parenthesis or so). I do not think we need to add a
bunch of code and create even more syntactic ambiguity (FOR loops are
already on the hairy edge of unparsability) to save people from writing
"select".

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 06:04:27
Message-ID: AANLkTi=iEY3OtWwEe+JgXBTb88rTnQyLNRmxXncbsip6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>> i will start the review of this one... but before that sorry for
>>> suggesting this a bit later but about using UNNEST as part of the
>>> sintax?
>
>> Does for-in-array do what unnset does?
>
> Yes, which begs the question of why bother at all.  AFAICS this patch
> simply allows you to replace
>
>        for x in select unnest(array_value) loop
>
> with
>
>        for x in unnest array_value loop
>
> (plus or minus a parenthesis or so).  I do not think we need to add a
> bunch of code and create even more syntactic ambiguity (FOR loops are
> already on the hairy edge of unparsability) to save people from writing
> "select".

this patch is semantically equal to SELECT unnest(..), but it is
evaluated as simple expression and does directly array unpacking and
iteration, - so it means this fragment is significantly >>faster<<.

Regards

Pavel Stehule

>
>                        regards, tom lane
>


From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 09:59:33
Message-ID: AANLkTi=vPFB3UCfMEO2yr1R2ZAxXJuvCbCXodzrrw6-r@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> 2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>> i will start the review of this one... but before that sorry for
>>>> suggesting this a bit later but about using UNNEST as part of the
>>>> sintax?
>>
>>> Does for-in-array do what unnset does?
>>
>> Yes, which begs the question of why bother at all.  AFAICS this patch
>> simply allows you to replace
>>
>>        for x in select unnest(array_value) loop
>>
>> with
>>
>>        for x in unnest array_value loop
>>
>> (plus or minus a parenthesis or so).  I do not think we need to add a
>> bunch of code and create even more syntactic ambiguity (FOR loops are
>> already on the hairy edge of unparsability) to save people from writing
>> "select".
>
> this patch is semantically equal to SELECT unnest(..), but it is
> evaluated as simple expression and does directly array unpacking and
> iteration, - so it means this fragment is significantly >>faster<<.

Did you implement a method to be able to walk the array and detoast
only the current needed data ?

(I wonder because I have something like that in that garage : select
array_filter(foo,'like','%bar%',10); where 10 is the limit and can be
avoided, foo is the array, like is callback function, '%bar%' the
parameter for the callback function for filtering results.)

It will make my toy in the garage a fast race car (and probably doable
in (plpg)SQL instead of C) ...

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 13:43:22
Message-ID: AANLkTik_-C9FodhiC2+rwM+qoLD7bmQsgTar5iwbFAnB@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>:
> 2010/11/18 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>> 2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>>> i will start the review of this one... but before that sorry for
>>>>> suggesting this a bit later but about using UNNEST as part of the
>>>>> sintax?
>>>
>>>> Does for-in-array do what unnset does?
>>>
>>> Yes, which begs the question of why bother at all.  AFAICS this patch
>>> simply allows you to replace
>>>
>>>        for x in select unnest(array_value) loop
>>>
>>> with
>>>
>>>        for x in unnest array_value loop
>>>
>>> (plus or minus a parenthesis or so).  I do not think we need to add a
>>> bunch of code and create even more syntactic ambiguity (FOR loops are
>>> already on the hairy edge of unparsability) to save people from writing
>>> "select".
>>
>> this patch is semantically equal to SELECT unnest(..), but it is
>> evaluated as simple expression and does directly array unpacking and
>> iteration, - so it means this fragment is significantly >>faster<<.
>
> Did you implement a method to be able to walk the array and detoast
> only the current needed data ?

not only - iteration over array can help with readability but a
general work with SRF (set returning functions is more harder and
slower) - so special loop statement can to safe a some toast op / when
you use a large array and access via index, or can to safe a some work
with memory, because there isn't necessary convert array to set of
tuples. Please, recheck these tests.

test:

CREATE OR REPLACE FUNCTION rndstr() RETURNS text AS $$select
array_to_string(array(select substring('ABCDEFGHIJKLMNOPQ' FROM
(random()*16)::int FOR 1) from generate_series(1,10)),'')$$ LANGUAGE
sql;

create or replace function rndarray(int) returns text[] as $$select
array(select rndstr() from generate_series(1,$1)) $$ language sql;

create table t10(x text[]);
insert into t10 select rndarray(10) from generate_series(1,10000);
create table t100(x text[]);
insert into t100 select rndarray(100) from generate_series(1,10000);
create table t1000(x text[]);
insert into t1000 select rndarray(1000) from generate_series(1,10000);

CREATE OR REPLACE FUNCTION public.filter(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0;
v text;
BEGIN
FOR v IN ARRAY $1
LOOP
EXIT WHEN l = $3;
IF v LIKE $2 THEN
s := s || v;
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$;

postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)

Time: 393.649 ms

postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)

Time: 2804.502 ms

postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t1000;
avg
---------------------
10.0000000000000000
(1 row)

Time: 9729.994 ms

CREATE OR REPLACE FUNCTION public.filter01(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0;
v text;
BEGIN
FOR v IN SELECT UNNEST($1)
LOOP
EXIT WHEN l = $3;
IF v LIKE $2 THEN
s := s || v;
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$;

postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)

Time: 795.383 ms

postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)

Time: 3848.258 ms

postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t1000;
avg
---------------------
10.0000000000000000
(1 row)

Time: 12366.093 ms

The iteration via specialized FOR IN ARRAY is about 25-30% faster than
FOR IN SELECT UNNEST

postgres=# CREATE OR REPLACE FUNCTION public.filter02(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0; i int;
v text;
BEGIN
FOR i IN array_lower($1,1)..array_upper($1,1)
LOOP
EXIT WHEN l = $3;
IF $1[i] LIKE $2 THEN
s := s || $1[i];
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$
;

postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t10;
avg
--------------------
1.1596079803990200
(1 row)

Time: 414.960 ms

postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t100;
avg
--------------------
3.4976777789245536
(1 row)

Time: 3460.970 ms

there FOR IN ARRAY is faster about 30% then access per index

for T1000 I had to cancel over 1 minute!!!!

>
> (I wonder because I have something like that in that garage : select
> array_filter(foo,'like','%bar%',10); where 10 is the limit and can be
> avoided, foo is the array, like is callback function, '%bar%' the
> parameter for the callback function for filtering results.)
>
> It will make my toy in the garage a fast race car (and probably doable
> in (plpg)SQL instead of C) ...

it can help with reading of array. But it doesn't help with array
updating :(. For large arrays it can be slow too.

Regards

Pavel Stehule

>
> --
> Cédric Villemain               2ndQuadrant
> http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
>


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 15:24:34
Message-ID: AANLkTi=Fp=ub_0J8n--hpuBft6mE9JVcHcj_Zc9pgokK@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>> i will start the review of this one... but before that sorry for
>>> suggesting this a bit later but about using UNNEST as part of the
>>> sintax?
>
>> Does for-in-array do what unnset does?
>
> Yes, which begs the question of why bother at all.  AFAICS this patch
> simply allows you to replace
>
>        for x in select unnest(array_value) loop
>
> with
>
>        for x in unnest array_value loop
>
> (plus or minus a parenthesis or so).  I do not think we need to add a
> bunch of code and create even more syntactic ambiguity (FOR loops are
> already on the hairy edge of unparsability) to save people from writing
> "select".

Pavel's performance argument is imnsho valid. arrays at present are
the best way to pass data around functions and any optimizations here
are very welcome. Given that, is there any way to mitigate your
concerns on the syntax side?

merlin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 15:33:16
Message-ID: AANLkTikF6Os3_VrkhDnbYAmEZ2vJDH-fsP=koCdYHM+j@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>> i will start the review of this one... but before that sorry for
>>>> suggesting this a bit later but about using UNNEST as part of the
>>>> sintax?
>>
>>> Does for-in-array do what unnset does?
>>
>> Yes, which begs the question of why bother at all.  AFAICS this patch
>> simply allows you to replace
>>
>>        for x in select unnest(array_value) loop
>>
>> with
>>
>>        for x in unnest array_value loop
>>
>> (plus or minus a parenthesis or so).  I do not think we need to add a
>> bunch of code and create even more syntactic ambiguity (FOR loops are
>> already on the hairy edge of unparsability) to save people from writing
>> "select".
>
> Pavel's performance argument is imnsho valid. arrays at present are
> the best way to pass data around functions and any optimizations here
> are very welcome.  Given that, is there any way to mitigate your
> concerns on the syntax side?

Can we get the performance benefit any other way? I hate to think
that it will still be slow for people using the already-supported
syntax.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 15:37:24
Message-ID: AANLkTi=XN4M-_yi3R1i8s2NFZaqo0yDBm=6E3dMQJOYW@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> 2010/11/18 Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>:
>> 2010/11/18 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>>> 2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>>>> i will start the review of this one... but before that sorry for
>>>>>> suggesting this a bit later but about using UNNEST as part of the
>>>>>> sintax?
>>>>
>>>>> Does for-in-array do what unnset does?
>>>>
>>>> Yes, which begs the question of why bother at all.  AFAICS this patch
>>>> simply allows you to replace
>>>>
>>>>        for x in select unnest(array_value) loop
>>>>
>>>> with
>>>>
>>>>        for x in unnest array_value loop
>>>>
>>>> (plus or minus a parenthesis or so).  I do not think we need to add a
>>>> bunch of code and create even more syntactic ambiguity (FOR loops are
>>>> already on the hairy edge of unparsability) to save people from writing
>>>> "select".
>>>
>>> this patch is semantically equal to SELECT unnest(..), but it is
>>> evaluated as simple expression and does directly array unpacking and
>>> iteration, - so it means this fragment is significantly >>faster<<.
>>
>> Did you implement a method to be able to walk the array and detoast
>> only the current needed data ?
>
> not only - iteration over array can help with readability but a
> general work with SRF (set returning functions is more harder and
> slower) - so special loop statement can to safe a some toast op / when
> you use a large array and access via index, or can to safe a some work
> with memory, because there isn't necessary convert array to set of
> tuples. Please, recheck these tests.
>
> test:
>
> CREATE OR REPLACE FUNCTION rndstr() RETURNS text AS $$select
> array_to_string(array(select substring('ABCDEFGHIJKLMNOPQ' FROM
> (random()*16)::int FOR 1) from generate_series(1,10)),'')$$ LANGUAGE
> sql;
>
> create or replace function rndarray(int) returns text[] as $$select
> array(select rndstr() from generate_series(1,$1)) $$ language sql;
>
> create table t10(x text[]);
> insert into t10 select rndarray(10) from generate_series(1,10000);
> create table t100(x text[]);
> insert into t100 select rndarray(100) from generate_series(1,10000);
> create table t1000(x text[]);
> insert into t1000 select rndarray(1000) from generate_series(1,10000);
>
> CREATE OR REPLACE FUNCTION public.filter(text[], text, integer)
>  RETURNS text[]
>  LANGUAGE plpgsql
> AS $function$
> DECLARE
>  s text[] := '{}';
>  l int := 0;
>  v text;
> BEGIN
>  FOR v IN ARRAY $1
>  LOOP
>    EXIT WHEN l = $3;
>    IF v LIKE $2 THEN
>      s := s || v;
>      l := l + 1;
>    END IF;
>  END LOOP;
>  RETURN s;
> END;$function$;
>
> postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t10;
>        avg
> --------------------
>  1.1596079803990200
> (1 row)
>
> Time: 393.649 ms
>
> postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t100;
>        avg
> --------------------
>  3.4976777789245536
> (1 row)
>
> Time: 2804.502 ms
>
> postgres=# select avg(array_upper(filter(x,'%AA%', 10),1)) from t1000;
>         avg
> ---------------------
>  10.0000000000000000
> (1 row)
>
> Time: 9729.994 ms
>
> CREATE OR REPLACE FUNCTION public.filter01(text[], text, integer)
>  RETURNS text[]
>  LANGUAGE plpgsql
> AS $function$
> DECLARE
>  s text[] := '{}';
>  l int := 0;
>  v text;
> BEGIN
>  FOR v IN SELECT UNNEST($1)
>  LOOP
>    EXIT WHEN l = $3;
>    IF v LIKE $2 THEN
>      s := s || v;
>      l := l + 1;
>    END IF;
>  END LOOP;
>  RETURN s;
> END;$function$;
>
> postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t10;
>        avg
> --------------------
>  1.1596079803990200
> (1 row)
>
> Time: 795.383 ms
>
> postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t100;
>        avg
> --------------------
>  3.4976777789245536
> (1 row)
>
> Time: 3848.258 ms
>
> postgres=# select avg(array_upper(filter01(x,'%AA%', 10),1)) from t1000;
>         avg
> ---------------------
>  10.0000000000000000
> (1 row)
>
> Time: 12366.093 ms
>
> The iteration via specialized FOR IN ARRAY is about 25-30% faster than
> FOR IN SELECT UNNEST
>
> postgres=# CREATE OR REPLACE FUNCTION public.filter02(text[], text, integer)
>  RETURNS text[]
>  LANGUAGE plpgsql
> AS $function$
> DECLARE
>  s text[] := '{}';
>  l int := 0; i int;
>  v text;
> BEGIN
>  FOR i IN array_lower($1,1)..array_upper($1,1)
>  LOOP
>    EXIT WHEN l = $3;
>    IF $1[i] LIKE $2 THEN
>      s := s || $1[i];
>      l := l + 1;
>    END IF;
>  END LOOP;
>  RETURN s;
> END;$function$
> ;
>
> postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t10;
>        avg
> --------------------
>  1.1596079803990200
> (1 row)
>
> Time: 414.960 ms
>
> postgres=# select avg(array_upper(filter02(x,'%AA%', 10),1)) from t100;
>        avg
> --------------------
>  3.4976777789245536
> (1 row)
>
> Time: 3460.970 ms
>
> there FOR IN ARRAY is faster about 30% then access per index
>
> for T1000 I had to cancel over 1 minute!!!!

I can't test until this week-end. But I will.

>
>
>>
>> (I wonder because I have something like that in that garage : select
>> array_filter(foo,'like','%bar%',10); where 10 is the limit and can be
>> avoided, foo is the array, like is callback function, '%bar%' the
>> parameter for the callback function for filtering results.)
>>
>> It will make my toy in the garage a fast race car (and probably doable
>> in (plpg)SQL instead of C) ...
>
> it can help with reading of array. But it doesn't help with array
> updating :(. For large arrays it can be slow too.

select fast is already a good job, thank you.

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 15:50:40
Message-ID: AANLkTi==LCFkKvk7UzufBgt_5JiAEkBzMBx1CwvjvAYz@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>>> i will start the review of this one... but before that sorry for
>>>>> suggesting this a bit later but about using UNNEST as part of the
>>>>> sintax?
>>>
>>>> Does for-in-array do what unnset does?
>>>
>>> Yes, which begs the question of why bother at all.  AFAICS this patch
>>> simply allows you to replace
>>>
>>>        for x in select unnest(array_value) loop
>>>
>>> with
>>>
>>>        for x in unnest array_value loop
>>>
>>> (plus or minus a parenthesis or so).  I do not think we need to add a
>>> bunch of code and create even more syntactic ambiguity (FOR loops are
>>> already on the hairy edge of unparsability) to save people from writing
>>> "select".
>>
>> Pavel's performance argument is imnsho valid. arrays at present are
>> the best way to pass data around functions and any optimizations here
>> are very welcome.  Given that, is there any way to mitigate your
>> concerns on the syntax side?
>
> Can we get the performance benefit any other way?  I hate to think
> that it will still be slow for people using the already-supported
> syntax.

If you are able to make unnest() outputting 1st row without detoasting
last field.

I think if we have :
#define DatumGetTextPSlice(X,m,n) ((text *) PG_DETOAST_DATUM_SLICE(X,m,n))
but for array, most is done

Pavel, am I correct ?

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 15:52:09
Message-ID: AANLkTimGOKNG_-5d9rJ0sjf_zT=W1JMMZHD3u-ONC9+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>>> i will start the review of this one... but before that sorry for
>>>>> suggesting this a bit later but about using UNNEST as part of the
>>>>> sintax?
>>>
>>>> Does for-in-array do what unnset does?
>>>
>>> Yes, which begs the question of why bother at all.  AFAICS this patch
>>> simply allows you to replace
>>>
>>>        for x in select unnest(array_value) loop
>>>
>>> with
>>>
>>>        for x in unnest array_value loop
>>>
>>> (plus or minus a parenthesis or so).  I do not think we need to add a
>>> bunch of code and create even more syntactic ambiguity (FOR loops are
>>> already on the hairy edge of unparsability) to save people from writing
>>> "select".
>>
>> Pavel's performance argument is imnsho valid. arrays at present are
>> the best way to pass data around functions and any optimizations here
>> are very welcome.  Given that, is there any way to mitigate your
>> concerns on the syntax side?
>
> Can we get the performance benefit any other way?  I hate to think
> that it will still be slow for people using the already-supported
> syntax.
>

I afraid so other ways are more difficult with deeper impacts on
plpgsql executor.

what is a slow:

a) repeated detoasting - access with subscripts - maybe detoasted
values can be cached?
b) evaluation of SRF expression - maybe call of SRF function can be
simple expression,
c) faster evaluation ro query

The most important is @a. Only a few people uses a FOR IN SELECT
UNNEST form now. Probably not optimizable on PLpgSQL level is form FOR
IN SELECT * FROM UNNEST.

FOR IN ARRAY doesn't impacts on expression executing - this patch is
just simple and isolated.

Pavel

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 15:57:53
Message-ID: AANLkTik=fV5g+b9aPKgFkq7v-2Jt2FFU-0TvJ4EkZMBZ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>:
> 2010/11/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>>> On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>>>> i will start the review of this one... but before that sorry for
>>>>>> suggesting this a bit later but about using UNNEST as part of the
>>>>>> sintax?
>>>>
>>>>> Does for-in-array do what unnset does?
>>>>
>>>> Yes, which begs the question of why bother at all.  AFAICS this patch
>>>> simply allows you to replace
>>>>
>>>>        for x in select unnest(array_value) loop
>>>>
>>>> with
>>>>
>>>>        for x in unnest array_value loop
>>>>
>>>> (plus or minus a parenthesis or so).  I do not think we need to add a
>>>> bunch of code and create even more syntactic ambiguity (FOR loops are
>>>> already on the hairy edge of unparsability) to save people from writing
>>>> "select".
>>>
>>> Pavel's performance argument is imnsho valid. arrays at present are
>>> the best way to pass data around functions and any optimizations here
>>> are very welcome.  Given that, is there any way to mitigate your
>>> concerns on the syntax side?
>>
>> Can we get the performance benefit any other way?  I hate to think
>> that it will still be slow for people using the already-supported
>> syntax.
>
> If you are able to make unnest() outputting 1st row without detoasting
> last field.
>
> I think if we have :
> #define DatumGetTextPSlice(X,m,n)   ((text *) PG_DETOAST_DATUM_SLICE(X,m,n))
> but for array, most is done
>
> Pavel, am I correct ?

yes, it can help, but still if you iterate over complete array, you
have to do n - detoast ops.

Pavel

>
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>
>
>
> --
> Cédric Villemain               2ndQuadrant
> http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
>


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 16:00:55
Message-ID: 4CE54DB7.9050800@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/18/2010 10:33 AM, Robert Haas wrote:
> On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure<mmoncure(at)gmail(dot)com> wrote:
>>
>> Pavel's performance argument is imnsho valid. arrays at present are
>> the best way to pass data around functions and any optimizations here
>> are very welcome. Given that, is there any way to mitigate your
>> concerns on the syntax side?
> Can we get the performance benefit any other way? I hate to think
> that it will still be slow for people using the already-supported
> syntax.

It's not disastrously slower. AFAICT from a very quick glance over the
patch, he's getting the speedup by bypassing the normal mechanism for
evaluating "for x in select ...". So we'd have to special-case that to
trap calls to unnest in the general form. That would be pretty ugly. Or
else make unnest and SPI faster. But that's a much bigger project.

Syntactic sugar is not entirely to be despised, anyway.

cheers

andrew


From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 16:02:35
Message-ID: AANLkTinKEXQ3mTNmf-csQoM0j=ZJN_xBg2CrJ7NH4nK4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> 2010/11/18 Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>:
>> 2010/11/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Thu, Nov 18, 2010 at 10:24 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>>>> On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>>>> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>>>>>>> i will start the review of this one... but before that sorry for
>>>>>>> suggesting this a bit later but about using UNNEST as part of the
>>>>>>> sintax?
>>>>>
>>>>>> Does for-in-array do what unnset does?
>>>>>
>>>>> Yes, which begs the question of why bother at all.  AFAICS this patch
>>>>> simply allows you to replace
>>>>>
>>>>>        for x in select unnest(array_value) loop
>>>>>
>>>>> with
>>>>>
>>>>>        for x in unnest array_value loop
>>>>>
>>>>> (plus or minus a parenthesis or so).  I do not think we need to add a
>>>>> bunch of code and create even more syntactic ambiguity (FOR loops are
>>>>> already on the hairy edge of unparsability) to save people from writing
>>>>> "select".
>>>>
>>>> Pavel's performance argument is imnsho valid. arrays at present are
>>>> the best way to pass data around functions and any optimizations here
>>>> are very welcome.  Given that, is there any way to mitigate your
>>>> concerns on the syntax side?
>>>
>>> Can we get the performance benefit any other way?  I hate to think
>>> that it will still be slow for people using the already-supported
>>> syntax.
>>
>> If you are able to make unnest() outputting 1st row without detoasting
>> last field.
>>
>> I think if we have :
>> #define DatumGetTextPSlice(X,m,n)   ((text *) PG_DETOAST_DATUM_SLICE(X,m,n))
>> but for array, most is done
>>
>> Pavel, am I correct ?
>
> yes, it can help, but still if you iterate over complete array, you
> have to do n - detoast ops.

sure.

>
> Pavel
>
>>
>>>
>>> --
>>> Robert Haas
>>> EnterpriseDB: http://www.enterprisedb.com
>>> The Enterprise PostgreSQL Company
>>>
>>> --
>>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-hackers
>>>
>>
>>
>>
>> --
>> Cédric Villemain               2ndQuadrant
>> http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
>>
>

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 16:09:29
Message-ID: 918.1290096569@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yes, which begs the question of why bother at all.

> Pavel's performance argument is imnsho valid.

Well, that argument is unsupported by any evidence, so far as I've seen.

More to the point, if there is indeed an interesting performance win
here, we could get the same win by internally optimizing the existing
syntax. That would provide the benefit to existing code not just
new code; and it would avoid foreclosing our future options for
extending FOR in not-so-redundant ways.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 16:37:06
Message-ID: AANLkTik4_Vx44dePKTFtGhLtnsBKHzO-0jM_KQEyfzhJ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> On Thu, Nov 18, 2010 at 12:47 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Yes, which begs the question of why bother at all.
>
>> Pavel's performance argument is imnsho valid.
>
> Well, that argument is unsupported by any evidence, so far as I've seen.
>
> More to the point, if there is indeed an interesting performance win
> here, we could get the same win by internally optimizing the existing
> syntax.  That would provide the benefit to existing code not just
> new code; and it would avoid foreclosing our future options for
> extending FOR in not-so-redundant ways.

sorry, but I don't agree. I don't think, so there are some big space
for optimizing - and if then it means much more code complexity for
current expr executor. Next - FOR IN ARRAY takes fields from array on
request - and it is possible, because a unpacking of array is
controlled by statement - it's impossible do same when unpacking is
inside other functions with same effectivity.

Regards

Pavel Stehule

>
>                        regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 17:19:25
Message-ID: 3145.1290100765@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> 2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> More to the point, if there is indeed an interesting performance win
>> here, we could get the same win by internally optimizing the existing
>> syntax.

> sorry, but I don't agree. I don't think, so there are some big space
> for optimizing - and if then it means much more code complexity for
> current expr executor. Next - FOR IN ARRAY takes fields from array on
> request - and it is possible, because a unpacking of array is
> controlled by statement - it's impossible do same when unpacking is
> inside other functions with same effectivity.

Just because you haven't thought about it doesn't mean it's impossible
or impractical.

The first implementation I was thinking of would involve looking at the
SELECT querytree after parsing to see if it's "SELECT UNNEST(something)"
--- that is, empty jointree and so on, single tlist item that is an
invocation of the unnest() function. If it is, you pull out the
argument expression of unnest() and go from there, with exactly the same
execution behavior as in the specialized-syntax patch. This is
perfectly safe if you identify the array_unnest function by OID: since
it's a built-in function you know exactly what it's supposed to do.

But having said that, it's still not apparent to me that array_unnest
itself is markedly slower than what you could hope to do in plpgsql.
I think the real issue here is that plpgsql's simple-expression code
can't be used with set-returning expressions, which means that we have
to go through the vastly more expensive SPI code path. But that
restriction is largely based on fear of re-using expression trees, which
is something we fixed a few weeks ago. I think that it would now be
interesting to look at whether "FOR x IN SELECT simple-expression" could
use the simple-expression code even when the expression returns set.
That approach might bring a useful speedup not just for unnest, but for
many other use-cases as well.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 17:36:24
Message-ID: 3464.1290101784@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Syntactic sugar is not entirely to be despised, anyway.

If it were harmless syntactic sugar I wouldn't be objecting so loudly.
The problem here is that FOR is a syntactic choke point: it's already
overloaded with several different sub-syntaxes that are quite difficult
to separate. Adding another one makes that worse, with the consequences
that we might misinterpret the user's intent, leading either to
misleading/unhelpful error messages or unexpected runtime behavior.
If you consult the archives you can find numerous past instances of
complaints from people who hit such problems with the existing set of
FOR sub-syntaxes, so this isn't hypothetical.

I'm also quite afraid of having a conflict with other future extensions
of FOR, which we might want to introduce either on our own hook or for
compatibility with something Oracle might add to PL/SQL in future.

This might not be the worst place in the entire system to be introducing
inessential syntactic sugar, but it's certainly one of the top ten.
I would *much* rather we get the performance benefit by internal
optimization, and forego inventing syntax.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 17:43:29
Message-ID: AANLkTikUYZtFrq9=6=Y5uAc=uaG8uuy61NexQaFhmLuh@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> 2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> More to the point, if there is indeed an interesting performance win
>>> here, we could get the same win by internally optimizing the existing
>>> syntax.
>
>> sorry, but I don't agree. I don't think, so there are some big space
>> for optimizing - and if then it means much more code complexity for
>> current expr executor. Next - FOR IN ARRAY takes fields from array on
>> request - and it is possible, because a unpacking of array is
>> controlled by statement - it's impossible do same when unpacking is
>> inside other functions with same effectivity.
>
> Just because you haven't thought about it doesn't mean it's impossible
> or impractical.
>
> The first implementation I was thinking of would involve looking at the
> SELECT querytree after parsing to see if it's "SELECT UNNEST(something)"
> --- that is, empty jointree and so on, single tlist item that is an
> invocation of the unnest() function.  If it is, you pull out the
> argument expression of unnest() and go from there, with exactly the same
> execution behavior as in the specialized-syntax patch.  This is
> perfectly safe if you identify the array_unnest function by OID: since
> it's a built-in function you know exactly what it's supposed to do.

this additional control will do slow down for any expression - more -
somebody can use a form: SELECT FROM unnest(), and this form will be
slower.

>
> But having said that, it's still not apparent to me that array_unnest
> itself is markedly slower than what you could hope to do in plpgsql.
> I think the real issue here is that plpgsql's simple-expression code
> can't be used with set-returning expressions, which means that we have
> to go through the vastly more expensive SPI code path.  But that
> restriction is largely based on fear of re-using expression trees, which
> is something we fixed a few weeks ago.  I think that it would now be
> interesting to look at whether "FOR x IN SELECT simple-expression" could
> use the simple-expression code even when the expression returns set.
> That approach might bring a useful speedup not just for unnest, but for
> many other use-cases as well.
>

any SRF call must not be faster then direct access to array. There is
overhead with tuples.

I don't understand you. Sorry. I don't think, so your objections are objective.

Regards

Pavel

>                        regards, tom lane
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 17:54:14
Message-ID: AANLkTi=sCYmS297g2UvJD67FPv+6cEGfJt7YSX1aBnuo@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 18, 2010 at 12:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I would *much* rather we get the performance benefit by internal
> optimization, and forego inventing syntax.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 17:55:44
Message-ID: AANLkTin+2UwjvqOEifYFdD=NEANyS1fN2g4jYi43yedU@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> Syntactic sugar is not entirely to be despised, anyway.
>
> If it were harmless syntactic sugar I wouldn't be objecting so loudly.
> The problem here is that FOR is a syntactic choke point: it's already
> overloaded with several different sub-syntaxes that are quite difficult
> to separate.  Adding another one makes that worse, with the consequences
> that we might misinterpret the user's intent, leading either to
> misleading/unhelpful error messages or unexpected runtime behavior.
> If you consult the archives you can find numerous past instances of
> complaints from people who hit such problems with the existing set of
> FOR sub-syntaxes, so this isn't hypothetical.
>

yes, this argument is correct - but we can rearange a parser rules
related to FOR statement. It can be solved.

> I'm also quite afraid of having a conflict with other future extensions
> of FOR, which we might want to introduce either on our own hook or for
> compatibility with something Oracle might add to PL/SQL in future.

we talked about it last time - and I respect it - syntax is FOR IN
>>>ARRAY<<< expression

>
> This might not be the worst place in the entire system to be introducing
> inessential syntactic sugar, but it's certainly one of the top ten.
> I would *much* rather we get the performance benefit by internal
> optimization, and forego inventing syntax.
>

Regards

Pavel

>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 18:03:28
Message-ID: AANLkTikQ6i249VQrYceJV8JuL1h5das2EHJpe+MZGAfj@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Nov 18, 2010 at 12:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I would *much* rather we get the performance benefit by internal
>> optimization, and forego inventing syntax.
>
> +1.

any optimization will be about 10-20% slower than direct access. See
my tests: on large arrays isn't significant if you use a simple
expression or full query. This is just overhead from building a
"tuplestore" and access to data via cursor. And you cannot to change a
SRF functions to returns just array. I would to see any optimization
on this level, but I think so it's unreal expecting.

Regards

Pavel Stehule

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 18:17:26
Message-ID: AANLkTi=NzvVyrERK4B73O0PU_+Oakvf3J30zwPaE50AL@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 18, 2010 at 1:03 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> 2010/11/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Thu, Nov 18, 2010 at 12:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I would *much* rather we get the performance benefit by internal
>>> optimization, and forego inventing syntax.
>>
>> +1.
>
> any optimization will be about 10-20% slower than direct access. See
> my tests: on large arrays isn't significant if you use a simple
> expression or full query. This is just overhead from building a
> "tuplestore" and access to data via cursor. And you cannot to change a
> SRF functions to returns just array. I would to see any optimization
> on this level, but I think so it's unreal expecting.

How can you possibly make a general statement like that? What's slow
is not the syntax; it's what the syntax is making happen under the
hood.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 18:47:45
Message-ID: AANLkTim8CMZYw_wMUFquU4mCnkZJNfXAVoYLGyeCiTso@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Nov 18, 2010 at 1:03 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> 2010/11/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Thu, Nov 18, 2010 at 12:36 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> I would *much* rather we get the performance benefit by internal
>>>> optimization, and forego inventing syntax.
>>>
>>> +1.
>>
>> any optimization will be about 10-20% slower than direct access. See
>> my tests: on large arrays isn't significant if you use a simple
>> expression or full query. This is just overhead from building a
>> "tuplestore" and access to data via cursor. And you cannot to change a
>> SRF functions to returns just array. I would to see any optimization
>> on this level, but I think so it's unreal expecting.
>
> How can you possibly make a general statement like that?  What's slow
> is not the syntax; it's what the syntax is making happen under the
> hood.
>

ok, it is based on my tests, but it can be subjective. Probably is
possible to work with a tuplestore as result of SRF function. And
probably we can read from it without cursor. Maybe we can to teach a
SRF functions to store values as scalars not as tuple - tuplestore can
do it, but the we have to have a global state and we must to modify
buildin functions (not just "unnest" - if the feature should be
general). But newer we can to ensure a working with only necessary
data like a special PL statement. "unnest" returns all fields, but
these fields should not be used. There isn't possible to say - stop, I
don't need other fields. It's possible just with special PL statement,
because it is controlled by PL. So it is reason why I don't believe in
optimizations on PL level.

Regards

Pavel Stehule

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 18:56:59
Message-ID: 5201.1290106619@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> what is a slow:

> a) repeated detoasting - access with subscripts - maybe detoasted
> values can be cached?
> b) evaluation of SRF expression - maybe call of SRF function can be
> simple expression,
> c) faster evaluation ro query

> The most important is @a.

Really? Becase AFAICS array_unnest only detoasts the source array once,
and saves the value between calls.

array_unnest doesn't currently have any smarts about fetching slices
of an array. I'm not sure how useful that would be in practice, since
(1) in most usages you probably run the function to the end and fetch
all the values anyway; (2) it's hard to see how to optimize that way
if the elements are varlena, which they most likely are in most usages
where this could possibly be a win. But if Cedric's use-case is really
worth optimizing, I'd sure rather see the smarts for it in the general
purpose array_unnest function instead of buried in plpgsql's FOR logic.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:16:34
Message-ID: 5570.1290107794@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> 2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> The problem here is that FOR is a syntactic choke point: it's already
>> overloaded with several different sub-syntaxes that are quite difficult
>> to separate. Adding another one makes that worse, with the consequences
>> that we might misinterpret the user's intent, leading either to
>> misleading/unhelpful error messages or unexpected runtime behavior.

> yes, this argument is correct - but we can rearange a parser rules
> related to FOR statement. It can be solved.

No, it can't. The more things that can possibly follow FOR, the less
likely that you correctly guess which one the user had in mind when
faced with something that's not quite syntactically correct. Or maybe
it *is* syntactically correct, only not according to the variant that
the user thought he was invoking. We've seen bug reports of this sort
connected with FOR already; in fact I'm pretty sure you've responded to
a few yourself. Adding more variants *will* make it worse. We need
a decent return on investment for anything we add here, and this
proposal just doesn't offer enough benefit.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:17:40
Message-ID: AANLkTi=Yb54mbU=STVTyW+_i1S1XMhC+L18L4U6ONBLO@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> what is a slow:
>
>> a) repeated detoasting - access with subscripts - maybe detoasted
>> values can be cached?
>> b) evaluation of SRF expression - maybe call of SRF function can be
>> simple expression,
>> c) faster evaluation ro query
>
>> The most important is @a.
>
> Really?  Becase AFAICS array_unnest only detoasts the source array once,
> and saves the value between calls.

I know.

this note was a different -only a few people use FOR IN SELECT UNNEST
for iteration over array. So from Robert's question (what is important
for current code?) perspective the more significant is access to
individual fields via subscripts. For example:

for i in 1..10000 loop
s := s + A[i];
end loop

is slow, when high limit of array is some bigger number > 1000. But
almost all stored procedures used this pattern. I know so some people
use a pattern FOR IN SELECT UNNEST, but (for example) I didn't meet
that developer in Czech Rep. It isn't usual so people can mix SQL and
PL well.

It has a practical reasons - using a UNNEST for small arrays is slower.

>
> array_unnest doesn't currently have any smarts about fetching slices
> of an array.  I'm not sure how useful that would be in practice, since
> (1) in most usages you probably run the function to the end and fetch
> all the values anyway; (2) it's hard to see how to optimize that way
> if the elements are varlena, which they most likely are in most usages
> where this could possibly be a win.  But if Cedric's use-case is really
> worth optimizing, I'd sure rather see the smarts for it in the general
> purpose array_unnest function instead of buried in plpgsql's FOR logic.
>

Probably - example with LIKE filter is really specific. But there can
be a tasks, where you can early break a iteration where you find a
value higher or less then some constant - it's not too artificial -
test "IS MEMBER OF"

Regards

Pavel

>                        regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:24:43
Message-ID: 5781.1290108283@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> this note was a different -only a few people use FOR IN SELECT UNNEST
> for iteration over array. So from Robert's question (what is important
> for current code?) perspective the more significant is access to
> individual fields via subscripts. For example:

> for i in 1..10000 loop
> s := s + A[i];
> end loop

> is slow, when high limit of array is some bigger number > 1000.

True, but inventing new FOR syntax isn't going to help people who are
used to doing that.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:28:40
Message-ID: 5871.1290108520@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> "unnest" returns all fields, but
> these fields should not be used. There isn't possible to say - stop, I
> don't need other fields. It's possible just with special PL statement,
> because it is controlled by PL. So it is reason why I don't believe in
> optimizations on PL level.

That is complete nonsense. array_unnest doesn't return the whole array
contents at once, so it's just as capable of being optimized as any
single-purpose implementation. If you exit the loop early, you just
don't call it anymore.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:31:47
Message-ID: 4CE57F23.8040108@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/18/2010 02:17 PM, Pavel Stehule wrote:
> -only a few people use FOR IN SELECT UNNEST for iteration over array.

How on earth do you know that? I use it a lot and I was just
demonstrating it to a client yesterday, and I'm quite sure he will use
it a lot too. I bet I'm far from alone.

cheers

andrew


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:32:00
Message-ID: AANLkTin05VpXRBPJ+z=wmLqRhGQ=_JOT3WdYxbpfUkKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> 2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> The problem here is that FOR is a syntactic choke point: it's already
>>> overloaded with several different sub-syntaxes that are quite difficult
>>> to separate.  Adding another one makes that worse, with the consequences
>>> that we might misinterpret the user's intent, leading either to
>>> misleading/unhelpful error messages or unexpected runtime behavior.
>
>> yes, this argument is correct - but we can rearange a parser rules
>> related to FOR statement. It can be solved.
>
> No, it can't.  The more things that can possibly follow FOR, the less
> likely that you correctly guess which one the user had in mind when
> faced with something that's not quite syntactically correct.  Or maybe
> it *is* syntactically correct, only not according to the variant that
> the user thought he was invoking.  We've seen bug reports of this sort
> connected with FOR already; in fact I'm pretty sure you've responded to
> a few yourself.  Adding more variants *will* make it worse.  We need
> a decent return on investment for anything we add here, and this
> proposal just doesn't offer enough benefit.
>

yes I reported a allowing a labels on "wrong" position. But minimally
this patch must not to change a current behave. It's your idea to use
keyword "ARRAY" there. Maybe we have just only different view on
complexity. My proposal increase complexity in parser, your proposal
in executor. Anybody thinking so other variant is worst. I don't speak
so we have to have a just FOR IN ARRAY syntax - I though so there was
a agreement on last discus. We can use a different syntax - but should
be readable.

Regards

Pavel Stehule

>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:34:37
Message-ID: AANLkTi=cwb-z_YbP61a-cAeGzY3aqqkKbbgPK4XLsiba@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> this note was a different -only  a few people use FOR IN SELECT UNNEST
>> for iteration over array. So from Robert's question (what is important
>> for current code?) perspective the more significant is access to
>> individual fields via subscripts. For example:
>
>> for i in 1..10000 loop
>>   s := s + A[i];
>> end loop
>
>> is slow, when high limit of array is some bigger number > 1000.
>
> True, but inventing new FOR syntax isn't going to help people who are
> used to doing that.

sure - I don't try it. Any change of this mean significant plpgsql's
refactoring and significant increasing the size and complexity of
code. More there can be still some overhead, because subscript can be
expression. And in almost all cases people dislike to write
subscripts.

>
>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:39:46
Message-ID: AANLkTinhvVj9azawZZtwK0R7Rn+GLaFdVkMV-O9n0UgY@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
>
> On 11/18/2010 02:17 PM, Pavel Stehule wrote:
>>
>>  -only a few people use FOR IN SELECT UNNEST for iteration over array.
>
> How on earth do you know that? I use it a lot and I was just demonstrating
> it to a client yesterday, and I'm quite sure he will use it a lot too. I bet
> I'm far from alone.
>

how much people are active readers and writers in pg_hackers like you? :)

I didn't say so nobody use it. You, me, David. But I really didn't see
this pattern here in real applications.

Regards

Pavel

> cheers
>
> andrew
>


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:45:32
Message-ID: 4CE5825C.2080803@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/18/2010 02:39 PM, Pavel Stehule wrote:
> 2010/11/18 Andrew Dunstan<andrew(at)dunslane(dot)net>:
>>
>> On 11/18/2010 02:17 PM, Pavel Stehule wrote:
>>> -only a few people use FOR IN SELECT UNNEST for iteration over array.
>> How on earth do you know that? I use it a lot and I was just demonstrating
>> it to a client yesterday, and I'm quite sure he will use it a lot too. I bet
>> I'm far from alone.
>>
> how much people are active readers and writers in pg_hackers like you? :)
>
> I didn't say so nobody use it. You, me, David. But I really didn't see
> this pattern here in real applications.
>

Lots of people are told to use it on IRC. Trust me, it's getting well known.

cheers

andrew


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 19:46:28
Message-ID: AANLkTikWpdzrmiy5t7C2S2FAO6Q1BAEco+MA6CTFgLT1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> "unnest" returns all fields, but
>> these fields should not be used. There isn't possible to say - stop, I
>> don't need other fields. It's possible just with special PL statement,
>> because it is controlled by PL. So it is reason why I don't believe in
>> optimizations on PL level.
>
> That is complete nonsense.  array_unnest doesn't return the whole array
> contents at once, so it's just as capable of being optimized as any
> single-purpose implementation.  If you exit the loop early, you just
> don't call it anymore.

no it isn't - actually you cannot to limit a returned set when you
call SRF function in expression context - if I remember well. We can
change it - but this is other complexity.

Pavel

>
>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 20:00:04
Message-ID: AANLkTimgECqB77DrLbsgfMxxrzJZz0cixoYFoU=b5NvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
>
> On 11/18/2010 02:39 PM, Pavel Stehule wrote:
>>
>> 2010/11/18 Andrew Dunstan<andrew(at)dunslane(dot)net>:
>>>
>>> On 11/18/2010 02:17 PM, Pavel Stehule wrote:
>>>>
>>>>  -only a few people use FOR IN SELECT UNNEST for iteration over array.
>>>
>>> How on earth do you know that? I use it a lot and I was just
>>> demonstrating
>>> it to a client yesterday, and I'm quite sure he will use it a lot too. I
>>> bet
>>> I'm far from alone.
>>>
>> how much people are active readers and writers in pg_hackers like you? :)
>>
>> I didn't say so nobody use it. You, me, David. But I really didn't see
>> this pattern here in real applications.
>>
>
> Lots of people are told to use it on IRC. Trust me, it's getting well known.

can be. but people on IRC are not representative.

I have about 10 courses of PL/pgSQL per year (about 100 people) -
almost all my students newer visited IRC. 30% of my students has a
problem to write a bublesort or some little bit complex code. I meet
this people. There can be a language barrier or some laziness. Really
it is surprisingly how too less people are interesting about coding.
This people has own problems, and usually uses a most classic patter
that know from programming languages.

These peoples are "normal" and typical. Some courses I have under some
big Czech agency - so there are people from banks, industry. Actually
only I do a courses of PLpgSQL in Czech language - so I think I know
what people use. For example - only a few people know and use a
generate_series functions.

sorry for offtopic :)

Pavel

>
> cheers
>
> andrew
>


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 20:11:32
Message-ID: 1290110898-sup-2807@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Pavel Stehule's message of jue nov 18 17:00:04 -0300 2010:
> 2010/11/18 Andrew Dunstan <andrew(at)dunslane(dot)net>:

> >> I didn't say so nobody use it. You, me, David. But I really didn't see
> >> this pattern here in real applications.
> >>
> >
> > Lots of people are told to use it on IRC. Trust me, it's getting well known.
>
> can be. but people on IRC are not representative.

Yeah, that's true. I point out usage of unnest to our customers too,
but it's much more common to see people not using it, instead relying on
subscripts. People using Postgres show up unexpectedly from under
rocks, in the weirdest corners; they rarely consult documentation and
even more rarely get into IRC or mailing list to get help.

I fail to see how this supports the FOR-IN-array development though. It
will just be another unused construct for most people, no?

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 20:18:07
Message-ID: AANLkTinuACqiHv3hszU=CHdmbk2i8iE2igXqMgpm0kdy@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
> Excerpts from Pavel Stehule's message of jue nov 18 17:00:04 -0300 2010:
>> 2010/11/18 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
>> >> I didn't say so nobody use it. You, me, David. But I really didn't see
>> >> this pattern here in real applications.
>> >>
>> >
>> > Lots of people are told to use it on IRC. Trust me, it's getting well known.
>>
>> can be. but people on IRC are not representative.
>
> Yeah, that's true.  I point out usage of unnest to our customers too,
> but it's much more common to see people not using it, instead relying on
> subscripts.  People using Postgres show up unexpectedly from under
> rocks, in the weirdest corners; they rarely consult documentation and
> even more rarely get into IRC or mailing list to get help.
>
> I fail to see how this supports the FOR-IN-array development though.  It
> will just be another unused construct for most people, no?

maybe I don't understand well, but patch FOR-IN-ARRAY has a documentation

<sect2 id="plpgsql-array-iterating">
+ <title>Looping Through Array</title>
+.
+ <para>
+ The syntax is:
+ <synopsis>
+ <optional> &lt;&lt;<replaceable>label</replaceable>&gt;&gt; </optional>
+ FOR <replaceable>target</replaceable> IN <replaceable>array
expression</replaceable
+ <replaceable>statements</replaceable>
+ END LOOP <optional> <replaceable>label</replaceable> </optional>;
+ </synopsis>
+.
+ The <replaceable>target</replaceable> is a record variable, row variable,
+ or comma-separated list of scalar variables.
+ The <replaceable>target</replaceable> is successively assigned each item
+ of result of the <replaceable>array_expression</replaceable>
and the loop body
+ executed for each item. Here is an example:
+.
+ <programlisting>
+ CREATE TYPE mypt AS (x int, y int);
+.
+ CREATE FUNCTION iterate_over_points() RETURNS void AS $$
+ DECLARE
+ x int; y int;
+ a mypt[] = ARRAY[(10,11),(20,21),(30,31)];
+ BEGIN
+ FOR x, y IN ARRAY a
+ LOOP
+ RAISE NOTICE 'x = %, y = %', x, y;
+ END LOOP;
+ END;
+ $$ LANGUAGE plpgsql;
+ </programlisting>
+.
+ If the loop is terminated by an <literal>EXIT</> statement, the last
+ assigned item value is still accessible after the loop.
+ </para>
+ </sect2>
+.

Pavel
>
> --
> Álvaro Herrera <alvherre(at)commandprompt(dot)com>
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 22:29:36
Message-ID: 12307.1290119376@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> 2010/11/18 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
>> I fail to see how this supports the FOR-IN-array development though. It
>> will just be another unused construct for most people, no?

> maybe I don't understand well, but patch FOR-IN-ARRAY has a documentation

UNNEST is documented too. Adding still more features doesn't really
improve matters for people who haven't memorized the documentation;
it only makes it even harder for them to find out what they should be
using. (More features != better)

To my mind, the complaint about subscripting being slow suggests that we
ought to fix subscripting, not introduce a nonstandard feature that will
make certain use-cases faster if people rewrite their code to use it.

I think it would probably not be terribly hard to arrange forcible
detoasting of an array variable's value the first time it gets
subscripted, for instance. Of course that only fixes some use-cases;
but it would help, and it helps without requiring people to change their
code.

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 23:06:07
Message-ID: 201011190006.09225.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thursday 18 November 2010 21:11:32 Alvaro Herrera wrote:
> Excerpts from Pavel Stehule's message of jue nov 18 17:00:04 -0300 2010:
> > 2010/11/18 Andrew Dunstan <andrew(at)dunslane(dot)net>:
> > >> I didn't say so nobody use it. You, me, David. But I really didn't see
> > >> this pattern here in real applications.
> > >
> > > Lots of people are told to use it on IRC. Trust me, it's getting well
> > > known.
> >
> > can be. but people on IRC are not representative.
>
> Yeah, that's true. I point out usage of unnest to our customers too,
> but it's much more common to see people not using it, instead relying on
> subscripts. People using Postgres show up unexpectedly from under
> rocks, in the weirdest corners; they rarely consult documentation and
> even more rarely get into IRC or mailing list to get help.
Well, a good reason for that might be that unnest() is pretty new... Most code
I read has been initially written quite a bit earlier. Seeing 8.4 in
production is only starting to get common.

Andres


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-18 23:41:18
Message-ID: 4CE5B99E.9020802@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/18/2010 06:06 PM, Andres Freund wrote:
> Well, a good reason for that might be that unnest() is pretty new... Most code
> I read has been initially written quite a bit earlier. Seeing 8.4 in
> production is only starting to get common.

I guess we must have more adventurous customers than you :-)

(Incidentally, there is a (slow but still very useful) userland version
of unnest that works with 8.3.)

cheers

andrew


From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-19 01:34:34
Message-ID: AANLkTin+HfPMFz=gT7ogfRMWnV8yfcAXEnZzjXoYPvnL@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> what is a slow:
>
>> a) repeated detoasting - access with subscripts - maybe detoasted
>> values can be cached?
>> b) evaluation of SRF expression - maybe call of SRF function can be
>> simple expression,
>> c) faster evaluation ro query
>
>> The most important is @a.
>
> Really?  Becase AFAICS array_unnest only detoasts the source array once,
> and saves the value between calls.
>
> array_unnest doesn't currently have any smarts about fetching slices
> of an array.  I'm not sure how useful that would be in practice, since
> (1) in most usages you probably run the function to the end and fetch
> all the values anyway; (2) it's hard to see how to optimize that way
> if the elements are varlena, which they most likely are in most usages
> where this could possibly be a win.  But if Cedric's use-case is really
> worth optimizing, I'd sure rather see the smarts for it in the general
> purpose array_unnest function instead of buried in plpgsql's FOR logic.

My use case is the following:

I have text[] data containing around 50k field.
Executing my array_filter (which is probably not as fast as what Pavel
did) is the same thing as executing unnest, except that during the
array walk, I apply a callback function and increment an internal
counter up to the p_limit parameter when callback function success.
So that it stops walking the array as soon as p_limit counter is full,
or there are no more elements to walk to in the array.

1) At a maximum it is slow like unnest (plus callback overhead), at a
minimum it find quickly and the gain is huge. Don't have the exact
numbers right here, but (from memory) the durations are between Oms
and 45 milliseconds (20M rows, of 50k field text array, not very long
text < 100 char, 15k calls per second, depending on items to walk in
the array, linear). WIth just unnest, a minimum is 45 ms per query.

2) DETOAST_SLICE I don't know the internals here

I have a concrete usage of what Pavel did.
And I agree that this is fast and easy way to handle the real issues behind :/

>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-19 06:40:57
Message-ID: AANLkTikK+yV+mMb2NS7ExP-_QthbEzBKs_CxBxr_tSyD@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> 2010/11/18 Alvaro Herrera <alvherre(at)commandprompt(dot)com>:
>>> I fail to see how this supports the FOR-IN-array development though.  It
>>> will just be another unused construct for most people, no?
>
>> maybe I don't understand well, but patch FOR-IN-ARRAY has a documentation
>
> UNNEST is documented too.  Adding still more features doesn't really
> improve matters for people who haven't memorized the documentation;
> it only makes it even harder for them to find out what they should be
> using.  (More features != better)
>

yes, but less user feature doesn't mean less code. Mainly in little
bit specific environment like plpgsql.

> To my mind, the complaint about subscripting being slow suggests that we
> ought to fix subscripting, not introduce a nonstandard feature that will
> make certain use-cases faster if people rewrite their code to use it.
>
> I think it would probably not be terribly hard to arrange forcible
> detoasting of an array variable's value the first time it gets
> subscripted, for instance.  Of course that only fixes some use-cases;
> but it would help, and it helps without requiring people to change their
> code.
>

This is just one half of problem and isn't simple. Second half is
"array_seek" - So any access with subscripts means seq reading of
array's data. Please, look on this part. I am thinking, so this is
more important, than anything what we discussed before. For fast
access there is necessary to call a deconstruct_array function. Then
you can access to subscripts quickly. Actually we have not a control
for access to items in array, when subscript is used in expression
(inside PL). So it is very difficult to accelerate speed in area -
probably it means a subscript expr should be evaluated individually.

A deconstruct_area is relative expensive function, so you have to have
a information about a using of array. Without it, and for smaller
arrays, the optimization can be bad. There isn't any backend
infrastructure for this decision now.

I did a profiling

first example: FOR IN ARRAY

samples % symbol name
336 20.6642 exec_eval_expr
269 16.5437 plpgsql_param_fetch
229 14.0836 exec_stmts
225 13.8376 exec_eval_datum
118 7.2571 exec_assign_value
91 5.5966 exec_eval_cleanup.clone.10
88 5.4121 setup_param_list
72 4.4280 __i686.get_pc_thunk.bx
65 3.9975 exec_eval_boolean
47 2.8905 exec_simple_cast_value
43 2.6445 free_var.clone.2
28 1.7220 exec_cast_value

samples % image name symbol name
1064 16.1188 postgres pglz_decompress
410 6.2112 postgres AllocSetAlloc
353 5.3477 postgres MemoryContextAllocZero
293 4.4387 postgres GetSnapshotData
290 4.3933 postgres AllocSetFree
281 4.2569 postgres ExecEvalParamExtern
223 3.3783 postgres ExecMakeFunctionResultNoSets
220 3.3328 postgres AllocSetReset
212 3.2116 postgres UTF8_MatchText
210 3.1813 postgres LWLockAcquire
195 2.9541 postgres AllocSetCheck
195 2.9541 postgres LWLockRelease
172 2.6057 postgres pfree
163 2.4693 postgres CopySnapshot
162 2.4542 postgres list_member_ptr
144 2.1815 postgres RevalidateCachedPlan
133 2.0148 postgres PushActiveSnapshot
121 1.8331 postgres PopActiveSnapshot
121 1.8331 postgres bms_is_member
118 1.7876 postgres MemoryContextAlloc
108 1.6361 postgres textlike
105 1.5907 postgres AcquireExecutorLocks
79 1.1968 postgres TransactionIdPrecedes
76 1.1513 postgres pgstat_end_function_usage
75 1.1362 postgres pgstat_init_function_usage
72 1.0907 postgres check_list_invariants

sample01 - FOR i IN array_lowe()..array_upper() for t1000

Profiling through timer interrupt
samples % symbol name
1039 29.4084 exec_stmts
723 20.4642 exec_eval_expr
587 16.6148 exec_eval_datum
408 11.5483 plpgsql_param_fetch
176 4.9816 exec_eval_cleanup.clone.10
167 4.7269 setup_param_list
159 4.5004 exec_eval_boolean
128 3.6230 __i686.get_pc_thunk.bx
66 1.8681 exec_simple_cast_value

samples % image name symbol name
312604 84.1141 postgres pglz_decompress
4800 1.2916 postgres hash_search_with_hash_value
4799 1.2913 postgres array_seek.clone.0
2935 0.7897 postgres LWLockAcquire
2399 0.6455 postgres _bt_compare
2219 0.5971 postgres LWLockRelease
1899 0.5110 postgres index_getnext
1374 0.3697 postgres hash_any
1257 0.3382 postgres LockAcquireExtended
1231 0.3312 postgres _bt_checkkeys
1208 0.3250 postgres AllocSetAlloc
1158 0.3116 postgres FunctionCall2
1102 0.2965 postgres toast_fetch_datum

same for t100

samples % symbol name
108 20.6107 exec_eval_expr
96 18.3206 plpgsql_param_fetch
92 17.5573 exec_eval_datum
66 12.5954 exec_stmts
43 8.2061 setup_param_list
38 7.2519 __i686.get_pc_thunk.bx
34 6.4885 exec_eval_cleanup.clone.10
16 3.0534 exec_simple_cast_value
12 2.2901 exec_eval_boolean

samples % image name symbol name
511 20.4646 postgres array_seek.clone.0
163 6.5278 postgres ExecEvalParamExtern
131 5.2463 postgres AllocSetAlloc
127 5.0861 postgres MemoryContextAllocZero
113 4.5254 postgres list_member_ptr
103 4.1249 postgres GetSnapshotData
95 3.8046 postgres AllocSetFree
92 3.6844 postgres LWLockAcquire
80 3.2038 postgres ExecMakeFunctionResultNoSets
74 2.9636 postgres UTF8_MatchText
70 2.8034 postgres LWLockRelease
57 2.2827 postgres ExecEvalArrayRef
57 2.2827 postgres RevalidateCachedPlan
53 2.1225 postgres AllocSetReset
48 1.9223 postgres AllocSetCheck
47 1.8823 postgres pfree
41 1.6420 postgres PushActiveSnapshot
40 1.6019 postgres CopySnapshot
40 1.6019 postgres bms_is_member
39 1.5619 postgres PopActiveSnapshot
37 1.4818 postgres AcquireExecutorLocks
32 1.2815 postgres array_ref
31 1.2415 postgres textlike
28 1.1213 postgres MemoryContextAlloc

sample3 FOR IN UNNEST

samples % symbol name
334 19.1844 exec_eval_expr
278 15.9678 plpgsql_param_fetch
246 14.1298 exec_eval_datum
180 10.3389 exec_stmts
140 8.0414 exec_assign_value
107 6.1459 setup_param_list
97 5.5715 exec_eval_cleanup.clone.10
97 5.5715 exec_move_row
84 4.8248 __i686.get_pc_thunk.bx
53 3.0442 exec_eval_boolean
42 2.4124 exec_simple_cast_value
36 2.0678 free_var.clone.2

samples % image name symbol name
996 11.5171 postgres pglz_decompress
507 5.8626 postgres AllocSetAlloc
494 5.7123 postgres list_member_ptr
411 4.7525 postgres MemoryContextAllocZero
344 3.9778 postgres ExecEvalParamExtern
305 3.5268 postgres GetSnapshotData
297 3.4343 postgres ExecMakeFunctionResultNoSets
265 3.0643 postgres AllocSetFree
250 2.8908 postgres UTF8_MatchText
242 2.7983 postgres LWLockRelease
236 2.7290 postgres LWLockAcquire
210 2.4283 postgres AllocSetReset
201 2.3242 postgres heap_form_tuple
198 2.2895 postgres AllocSetCheck
183 2.1161 postgres pfree
165 1.9080 postgres ExecProject
155 1.7923 postgres heap_fill_tuple
151 1.7461 postgres CopySnapshot
141 1.6304 postgres RevalidateCachedPlan
136 1.5726 postgres MemoryContextAlloc
114 1.3182 postgres PopActiveSnapshot
108 1.2488 postgres AcquireExecutorLocks
102 1.1795 postgres ExecMakeFunctionResult
102 1.1795 postgres pgstat_init_function_usage
95 1.0985 postgres textlike
94 1.0870 postgres bms_is_member
92 1.0638 postgres datumGetSize

For iteration over large array with subscripts I am thinking so enough
is a block repeated pglz_decompress. Others optimizations needs a
hundreds lines (my personal opinion)

regards

Pavel Stehule

>                        regards, tom lane
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-19 15:49:07
Message-ID: AANLkTikEFbLO1FNAz6m-n4wisy_b+=TsovFjVp0D2wei@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I checked my tests and the most important is a remove a repeated detoast.

postgres=# CREATE OR REPLACE FUNCTION public.filter01(text[], text, integer)
RETURNS text[]
LANGUAGE plpgsql
AS $function$
DECLARE
s text[] := '{}';
l int := 0; i int;
v text; loc text[] = $1;
BEGIN
FOR i IN array_lower(loc,1)..array_upper(loc,1)
LOOP
EXIT WHEN l = $3;
IF loc[i] LIKE $2 THEN
s := s || loc[i];
l := l + 1;
END IF;
END LOOP;
RETURN s;
END;$function$;

This code is very slow when array is large - tested on n=1000. With
one small modification can be 20x faster

DECLARE
s text[] := '{}';
l int := 0; i int;
v text; loc text[] = $1 || '{}'::text[]; --<<< does just detoast and
docomprimation
BEGIN

the final version of test can be:

so result:

Don't access to large unmodified array inside cycle, when data comes
from table (for iteration over A[1000] of text(10)). A speadup is from
451 sec to 15 sec. This rule can be interesting for PostGIS people,
because it can be valid for other long varlena values. But still this
is 2x slower than special statement.

Regards

Pavel Stehule

samples % symbol name
332 22.1333 exec_eval_expr
311 20.7333 plpgsql_param_fetch
267 17.8000 exec_eval_datum
220 14.6667 exec_stmts
91 6.0667 setup_param_list
82 5.4667 exec_eval_cleanup.clone.10
71 4.7333 __i686.get_pc_thunk.bx
48 3.2000 exec_simple_cast_value
43 2.8667 exec_eval_boolean

samples % symbol name
4636 37.5994 array_seek.clone.0
961 7.7940 pglz_decompress
901 7.3074 list_member_ptr
443 3.5929 MemoryContextAllocZero
384 3.1144 AllocSetAlloc
381 3.0900 ExecEvalParamExtern
334 2.7088 GetSnapshotData
255 2.0681 AllocSetFree
254 2.0600 LWLockRelease
249 2.0195 ExecMakeFunctionResultNoSets
249 2.0195 UTF8_MatchText
234 1.8978 LWLockAcquire
195 1.5815 AllocSetReset
167 1.3544 AllocSetCheck
163 1.3220 pfree
151 1.2247 ExecEvalArrayRef
149 1.2084 RevalidateCachedPlan
138 1.1192 bms_is_member
126 1.0219 CopySnapshot


From: Valentine Gogichashvili <valgog(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-22 11:21:28
Message-ID: AANLkTinnQgj1ubCvaRkqe3FTRuKn9j5=ZHJ8KnKbuxYd@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

with the FOR e IN SELECT UNNEST(a) construct there is an issue again related
to the unresting of composite type arrays:

BEGIN;
CREATE TYPE truple AS (i integer, a text, b text);

DO $SQL$
DECLARE
start_time timestamp;
t truple;
ta truple[] := ARRAY( select ROW(s.i, 'A' || (s.i)::text, 'B' ||
(s.i)::text )::truple from generate_series(1, 10000) as s(i) );
i integer := 1;
BEGIN
start_time := clock_timestamp();
FOR t IN SELECT UNNEST(ta) LOOP
raise info 't is %', t;
i := i + 1;
END LOOP;
RAISE INFO 'looped in %', clock_timestamp() - start_time;
END;
$SQL$;
ROLLBACK;

fails with ERROR: invalid input syntax for integer: "(1,A1,B1)"
CONTEXT: PL/pgSQL function "inline_code_block" line 8 at FOR over SELECT
rows

So to UNNEST such an array one has to SELECT * FROM UNNEST(a) to be able
loop there like:

BEGIN;
CREATE TYPE truple AS (i integer, a text, b text);

DO $SQL$
DECLARE
start_time timestamp;
t truple;
ta truple[] := ARRAY( select ROW(s.i, 'A' || (s.i)::text, 'B' ||
(s.i)::text )::truple from generate_series(1, 10000) as s(i) );
i integer := 1;
BEGIN
start_time := clock_timestamp();
FOR t IN SELECT * FROM UNNEST(ta) LOOP
raise info 't is %', t;
i := i + 1;
END LOOP;
RAISE INFO 'looped in %', clock_timestamp() - start_time;
END;
$SQL$;
ROLLBACK;

Is it a bug or a feature? And if the second, then any work on optimizing FOR
e IN SELECT UNNEST(a) should probably include FOR e IN SELECT * FROM
UNNEST(a) statement optimizations.

Also, would the suggested FOR-IN-ARRAY construct loop in such
a composite type arrays?

Best regards,

-- Valenine Gogichashvili

On Thu, Nov 18, 2010 at 8:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > 2010/11/18 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> >> The problem here is that FOR is a syntactic choke point: it's already
> >> overloaded with several different sub-syntaxes that are quite difficult
> >> to separate. Adding another one makes that worse, with the consequences
> >> that we might misinterpret the user's intent, leading either to
> >> misleading/unhelpful error messages or unexpected runtime behavior.
>
> > yes, this argument is correct - but we can rearange a parser rules
> > related to FOR statement. It can be solved.
>
> No, it can't. The more things that can possibly follow FOR, the less
> likely that you correctly guess which one the user had in mind when
> faced with something that's not quite syntactically correct. Or maybe
> it *is* syntactically correct, only not according to the variant that
> the user thought he was invoking. We've seen bug reports of this sort
> connected with FOR already; in fact I'm pretty sure you've responded to
> a few yourself. Adding more variants *will* make it worse. We need
> a decent return on investment for anything we add here, and this
> proposal just doesn't offer enough benefit.
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-22 13:39:42
Message-ID: AANLkTinHcLBOnAHZ8iSSv13+twWbje5Z6WYCF=rLRW23@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Thu, Sep 30, 2010 at 7:46 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> Hello
>>
>> this patch implement a new iteration construct - iteration over an
>> array. The sense of this new iteration is:
>>  * a simple and cleaner syntax
>
> i will start the review of this one...

so, what is the concensus for this patch?
return with feedback? reject the patch on the grounds that we should
go fix unnest() if it slow?
something else?

the patch itself works as advertised (in functionality) but i haven't
make much performance tests to see if we actually win something

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Valentine Gogichashvili <valgog(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-22 14:29:43
Message-ID: AANLkTi=iQu5ULNjO+L9OzPbQOseJ55r5c-atHbytAu2-@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 22, 2010 at 6:21 AM, Valentine Gogichashvili
<valgog(at)gmail(dot)com> wrote:
> Hi,
> with the FOR e IN SELECT UNNEST(a) construct there is an issue again related
> to the unresting of composite type arrays:
> [ example ]
> Is it a bug or a feature?

It looks like the problem in this example is that PL/pgsql tries to
assign the result of unest(ta) to t.i rather than to t as a whole.
This is pretty ridiculously stupid in this case, but it would make
sense if the subselect where of the form SELECT a, b, c FROM ...

I haven't looked at the code to see whether there's a way to make this
case smarter (it would be nice - I've been bitten by similar problems
myself) but it's only tangentially related to the patch under
discussion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-22 14:36:59
Message-ID: AANLkTinSVTKyJAV4-Mvja6rga=4TbgfPEcQv+3KqM6=v@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 22, 2010 at 8:39 AM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
>> On Thu, Sep 30, 2010 at 7:46 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> Hello
>>>
>>> this patch implement a new iteration construct - iteration over an
>>> array. The sense of this new iteration is:
>>>  * a simple and cleaner syntax
>>
>> i will start the review of this one...
>
> so, what is the concensus for this patch?
> return with feedback? reject the patch on the grounds that we should
> go fix unnest() if it slow?
> something else?

I think it should be marked rejected. I don't think Tom is likely to
ever be in favor of a syntax change here, and while I hesitate to deal
in absolutes, I don't think I will be either, and certainly not
without a lot more work on improving the performance of the existing
constructs. In particular, this seems like something that really
ought to be pursued:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg01177.php

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-22 20:36:54
Message-ID: AANLkTim_sRK5CnLQHbLwWPNLGMQcxUJqx1MzCTVH+xjU@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

I spent last two days a searching how to solve this problem better.
Probably I removed a issue with toasting. But I found other issue,
that wasn't discussed before. This issue is only seq access to items
via array_seek function. I though about some variable that stores a
last accessed field and last used subscript - but there isn't a good
place where this info can be stored (maybe in ArrayType). Other issues
are a arrays with a null bitmap. It is difficult to say if this cache
can have a positive effect - maybe yes - for large arrays. Problem is
in possible a increase of price for random access to array. And there
are not any "hint", that helps with specification about access to
array.

I don't think so searching inside expr. plan is a good idea. Is way to
have more code, more complexity. If we will do it, then more important
are accelerations of expression var = var + 1; var = var || var; or
some else.

So, please, I know, so you and Tom are busy, but try to spend a few
time about this problem before you are definitely reject this idea.

With my last patch (removing a toasting issue) the classic access to
array should be usable. But still any direct access to array from PL
executor is 20% faster - depends on array type. Maybe it is useless
discus - and all can be solved with possible support SRF inside simple
expression, but I don't know when it will be possible.

Regards

Pavel Stehule

*
* CAUTION: if you change the header for ordinary arrays you will also
* need to change the headers for oidvector and int2vector!
*/

>
> I think it should be marked rejected.  I don't think Tom is likely to
> ever be in favor of a syntax change here, and while I hesitate to deal
> in absolutes, I don't think I will be either, and certainly not
> without a lot more work on improving the performance of the existing
> constructs.   In particular, this seems like something that really
> ought to be pursued:
>
> http://archives.postgresql.org/pgsql-hackers/2010-11/msg01177.php
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-22 21:59:30
Message-ID: AANLkTin746-XLEyKv1JDbigFG2VQ9XYBo5OqEnA7jNOp@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 22, 2010 at 3:36 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> So, please, I know, so you and Tom are busy, but try to spend a few
> time about this problem before you are definitely reject this idea.

If I were to spend more time on this problem, what exactly would I
spend that time doing and how would it help? If I were interested in
spending time I'd probably spend it pursuing the suggestions Tom
already made, and that's what I think you should do. But I'm not
going to do that, because the purpose of the CommitFest is not for me
to write new patches from scratch that do something vaguely similar to
what a patch you wrote was trying to do. It's for all of us to review
and commit the patches that have already been written. You aren't
helping with that process, so your complaint that we aren't spending
enough time on your patches would be unfair even if were true, and it
isn't. The problem with your patch is that it has a design weakness,
not that it got short shift.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-23 04:55:28
Message-ID: AANLkTi=fyovp=EgLFm3HpzbkAJ-o_BP=27gSi2FxuHoi@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/22 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Nov 22, 2010 at 3:36 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> So, please, I know, so you and Tom are busy, but try to spend a few
>> time about this problem before you are definitely reject this idea.
>
> If I were to spend more time on this problem, what exactly would I
> spend that time doing and how would it help?  If I were interested in
> spending time I'd probably spend it pursuing the suggestions Tom
> already made, and that's what I think you should do.  But I'm not
> going to do that, because the purpose of the CommitFest is not for me
> to write new patches from scratch that do something vaguely similar to
> what a patch you wrote was trying to do.  It's for all of us to review
> and commit the patches that have already been written.  You aren't
> helping with that process, so your complaint that we aren't spending
> enough time on your patches would be unfair even if were true, and it
> isn't. The problem with your patch is that it has a design weakness,
> not that it got short shift.

ok, I can only recapitulate so this feature was proposed cca two
months ago, and minimally Tom and maybe you did agreement - with
request on syntax - do you remember? I am little bit tired so this
agreement was changed when I spent my time with this.

Pavel

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


From: David Fetter <david(at)fetter(dot)org>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-23 05:42:23
Message-ID: 20101123054223.GA30097@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 23, 2010 at 05:55:28AM +0100, Pavel Stehule wrote:
>
> ok, I can only recapitulate so this feature was proposed cca two
> months ago, and minimally Tom and maybe you did agreement - with
> request on syntax - do you remember? I am little bit tired so this
> agreement was changed when I spent my time with this.

What you can actually do that's productive is stop all current
development and concentrate on reviewing patches. If the language gap
is an issue, please feel free to mail me your reviews in Czech, and I
will get them translated.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-23 05:53:43
Message-ID: AANLkTi=iXOui=syUKQh=_bdWbStG2yTwE_uKMpizmSR9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/23 David Fetter <david(at)fetter(dot)org>:
> On Tue, Nov 23, 2010 at 05:55:28AM +0100, Pavel Stehule wrote:
>>
>> ok, I can only recapitulate so this feature was proposed cca two
>> months ago, and minimally Tom and maybe you did agreement - with
>> request on syntax - do you remember? I am little bit tired so this
>> agreement was changed when I spent my time with this.
>
> What you can actually do that's productive is stop all current
> development and concentrate on reviewing patches.  If the language gap
> is an issue, please feel free to mail me your reviews in Czech, and I
> will get them translated.
>

it was a last mail to this topic minimally for some months.

Regards

Pavel Stehule

> Cheers,
> David.
> --
> David Fetter <david(at)fetter(dot)org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-23 18:50:19
Message-ID: AANLkTin13DOouNAMpFHNv2Bs4udmsk1ARS9ni2_ER-Lo@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 22, 2010 at 11:55 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> ok, I can only recapitulate so this feature was proposed cca two
> months ago, and minimally Tom and maybe you did agreement - with
> request on syntax - do you remember? I am little bit tired so this
> agreement was changed when I spent my time with this.

I went back and reread the thread I believe you're speaking about.
The first post is here:

http://archives.postgresql.org/pgsql-hackers/2010-09/msg01945.php

I cannot find one single email on that thread where Tom or I or anyone
else endorsed the syntax you've proposed here; indeed, it and some
other suggestions were roundly criticized. You responded to that by
saying that the arguments against it were all wrong, but no one other
than you ever appeared to believe that. There are a few emails on
that thread where other people agreed that it would be nice, in
theory, to have some syntax for this, but there is not one single
email that I see saying that any syntax you proposed was a good one.
If you read that thread and concluded that there was consensus to
implement this syntax, you did not read it very carefully.

If we had ELEMENT as a reserved keyword (which apparently it is in
some versions of the SQL standard), maybe FOR ELEMENT wunk IN
wunkarray... would be sufficiently unambiguous. But it's not even an
unreserved keyword right now, and I have a hard time thinking it would
be worth reserving it just for this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 01:56:11
Message-ID: AANLkTin1V_JcmPLHLxnt-TgasfJdDQabSw4McRjqZS8o@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/23 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Mon, Nov 22, 2010 at 11:55 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> ok, I can only recapitulate so this feature was proposed cca two
>> months ago, and minimally Tom and maybe you did agreement - with
>> request on syntax - do you remember? I am little bit tired so this
>> agreement was changed when I spent my time with this.
>
> I went back and reread the thread I believe you're speaking about.
> The first post is here:
>
> http://archives.postgresql.org/pgsql-hackers/2010-09/msg01945.php

Here perhaps ? (or before)

http://archives.postgresql.org/pgsql-hackers/2010-09/msg01983.php

>
> I cannot find one single email on that thread where Tom or I or anyone
> else endorsed the syntax you've proposed here;

Nah, but you didn't disagree on the main idea, you just said : 'like
Tom I agree that syntax must be uptaded to something beter' , more or
less

> indeed, it and some
> other suggestions were roundly criticized.  You responded to that by
> saying that the arguments against it were all wrong, but no one other
> than you ever appeared to believe that.  There are a few emails on
> that thread where other people agreed that it would be nice, in
> theory, to have some syntax for this, but there is not one single
> email that I see saying that any syntax you proposed was a good one.
> If you read that thread and concluded that there was consensus to
> implement this syntax, you did not read it very carefully.

I think you (Robert) misunderstood dramatically what Pavel try to do.
Pavel did an excellent optimization work for a specific point. This
specific point looks crucial for me in the current behavior of
PostgreSQL[1]. AFAIS Pavel didn't want to implement a genious syntax,
but an optimization feature.

I don't care about syntax, I care with Tom explanation on that. but no more.

I care with the idea that this patch is just a quick way to cut the iceberg.
It is.
and ?

And we might do it better with more deep analysis and refactoring more
stuff, I agree...
Still this patch is interesting enought from perf point of view to not
trash it that quickly, IMO.

>
> If we had ELEMENT as a reserved keyword (which apparently it is in
> some versions of the SQL standard), maybe FOR ELEMENT wunk IN
> wunkarray... would be sufficiently unambiguous.  But it's not even an
> unreserved keyword right now, and I have a hard time thinking it would
> be worth reserving it just for this.

I am not aware of SQL spec precisely about that.
David, did your recent post about UNNEST stuff looks relevant in this
thread ? I mean can we elaborate something from your suggestion to
improve the situation of the current patch (and vice-versa) ?

[1] data compression in the array allow to insert billions of data for
a small size print. (I know it is not pure design, it is just pure
end-user very effective solution)

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 02:38:27
Message-ID: AANLkTimAmnVR6Zxm0P3kNAx_zLz_PkW1FqxhieWitAJk@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 23, 2010 at 8:56 PM, Cédric Villemain
<cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
> 2010/11/23 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Mon, Nov 22, 2010 at 11:55 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>> ok, I can only recapitulate so this feature was proposed cca two
>>> months ago, and minimally Tom and maybe you did agreement - with
>>> request on syntax - do you remember? I am little bit tired so this
>>> agreement was changed when I spent my time with this.
>>
>> I went back and reread the thread I believe you're speaking about.
>> The first post is here:
>>
>> http://archives.postgresql.org/pgsql-hackers/2010-09/msg01945.php
>
> Here perhaps ? (or before)
>
> http://archives.postgresql.org/pgsql-hackers/2010-09/msg01983.php

Dang. You're right. I stand corrected.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 06:06:17
Message-ID: 6371.1290578777@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

=?ISO-8859-1?Q?C=E9dric_Villemain?= <cedric(dot)villemain(dot)debian(at)gmail(dot)com> writes:
> I think you (Robert) misunderstood dramatically what Pavel try to do.
> Pavel did an excellent optimization work for a specific point. This
> specific point looks crucial for me in the current behavior of
> PostgreSQL[1]. AFAIS Pavel didn't want to implement a genious syntax,
> but an optimization feature.

As near as I can tell, Pavel is bullheadedly insisting on adding new
syntax, not on the optimization aspect of it. I already pointed out
how he could get 100% of the performance benefit using the existing
syntax, but he doesn't appear to be willing to pursue that route.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 07:43:32
Message-ID: AANLkTineZ2+SZ4t98s_OmByiktG4-D6Ns4jbQqD8SkNQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/11/24 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, Nov 23, 2010 at 8:56 PM, Cédric Villemain
> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>> 2010/11/23 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Mon, Nov 22, 2010 at 11:55 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>> ok, I can only recapitulate so this feature was proposed cca two
>>>> months ago, and minimally Tom and maybe you did agreement - with
>>>> request on syntax - do you remember? I am little bit tired so this
>>>> agreement was changed when I spent my time with this.
>>>
>>> I went back and reread the thread I believe you're speaking about.
>>> The first post is here:
>>>
>>> http://archives.postgresql.org/pgsql-hackers/2010-09/msg01945.php
>>
>> Here perhaps ? (or before)
>>
>> http://archives.postgresql.org/pgsql-hackers/2010-09/msg01983.php
>
> Dang.  You're right.  I stand corrected.
>

Sorry, I though so you and Tom hasn't a problem with syntax
FOR-IN-ARRAY (what is a Kevin Grittner's proposal). So problematic is
just my original proposal FOR-IN-expr, but proposed feature isn't
rejected.

My proposal isn't really genial - is true so first my motivation was
to replace a pattern array_lower(var,1)..array_upper(var,1). It's
relative simple in ADA, statement FOR is defined over range type, and
relative impossible in PL/pgSQL, where range type doesn't exists. Some
special construct in PL/pgSQL can to solve iteration over array
significantly better and simpler then any other solution - this really
must not be syntax FOR-IN-ARRAY - and with any next test and next code
checking I am more sure:

why:
* there is clean indicia so developer wants to process all items in array
* there isn't random access to array
* is possibility for a reuse varlena types stored in array without a
temporal copy

I am sorry, so I didn't speaking about these facts ear

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 07:46:09
Message-ID: AANLkTik9RjMN9sRUEMaGiyi_KmRxr9cj6XhJY9YfaM1K@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

sorry, there was a broken message

2010/11/24 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
> 2010/11/24 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Tue, Nov 23, 2010 at 8:56 PM, Cédric Villemain
>> <cedric(dot)villemain(dot)debian(at)gmail(dot)com> wrote:
>>> 2010/11/23 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>>> On Mon, Nov 22, 2010 at 11:55 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>>> ok, I can only recapitulate so this feature was proposed cca two
>>>>> months ago, and minimally Tom and maybe you did agreement - with
>>>>> request on syntax - do you remember? I am little bit tired so this
>>>>> agreement was changed when I spent my time with this.
>>>>
>>>> I went back and reread the thread I believe you're speaking about.
>>>> The first post is here:
>>>>
>>>> http://archives.postgresql.org/pgsql-hackers/2010-09/msg01945.php
>>>
>>> Here perhaps ? (or before)
>>>
>>> http://archives.postgresql.org/pgsql-hackers/2010-09/msg01983.php
>>
>> Dang.  You're right.  I stand corrected.
>>
>
> Sorry, I though so you and Tom hasn't a problem with syntax
> FOR-IN-ARRAY (what is a Kevin Grittner's proposal). So problematic is
> just my original proposal FOR-IN-expr, but proposed feature isn't
> rejected.
>

Sorry, I though so you and Tom hasn't a problem with syntax
FOR-IN-ARRAY (what is a Kevin Grittner's proposal). I though so
problematic is just my original proposal FOR-IN-expr, but proposed
feature isn't a problem.

My proposal isn't really genial - is true so first my motivation was
to replace unwished pattern "array_lower(var,1)..array_upper(var,1)".
It's relative simple in ADA, where statement FOR is defined over range
type, and relative impossible in PL/pgSQL, where range type doesn't
exists yet. Some special construct in PL/pgSQL can to solve iteration
over array significantly better and simpler then any other solution -
there must not be used the syntax FOR-IN-ARRAY - with any next test
and next code checking I am more sure:

why?:
* there is clean indicia so developer wants to process all items in
array, or almost all
* there isn't random access to array!!
* is possibility for a reuse varlena's value stored in array without a
temporal copy - with maybe some trick!!
* there is a very low overhead

I am sorry, so I didn't speaking about these advices early.

I though about other possible syntax - what do you think about "FOR
var OVER expr LOOP ... END LOOP" ? "OVER" is keyword now

Regards

Pavel
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 12:37:00
Message-ID: AANLkTinxpS7nYRL9Uv47-FMrjMff0KJA1Dt_FsfHWnCv@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 24, 2010 at 1:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> =?ISO-8859-1?Q?C=E9dric_Villemain?= <cedric(dot)villemain(dot)debian(at)gmail(dot)com> writes:
>> I think you (Robert) misunderstood dramatically what Pavel try to do.
>> Pavel did an excellent optimization work for a specific point. This
>> specific point looks crucial for me in the current behavior of
>> PostgreSQL[1]. AFAIS Pavel didn't want to implement a genious syntax,
>> but an optimization feature.
>
> As near as I can tell, Pavel is bullheadedly insisting on adding new
> syntax, not on the optimization aspect of it.  I already pointed out
> how he could get 100% of the performance benefit using the existing
> syntax, but he doesn't appear to be willing to pursue that route.

Right, that was my impression, too. But, I think this may be partly a
case of people talking past each other. My impression of this
conversation was a repetition of this sequence:

A: This syntax is bad.
B: But it's way faster!

...which makes no sense. However, what I now think is going on here
is that there are really two separate things that are wished for here
- a more compact syntax, and a performance improvement. And taken
separately, I agree with both of those desires. PL/pgsql is an
incredibly clunky language syntactically, and it's also slow. A patch
that improves either one of those things has value, whether or not it
also does the other one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 15:33:02
Message-ID: 18771.1290612782@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Right, that was my impression, too. But, I think this may be partly a
> case of people talking past each other. My impression of this
> conversation was a repetition of this sequence:

> A: This syntax is bad.
> B: But it's way faster!

> ...which makes no sense. However, what I now think is going on here
> is that there are really two separate things that are wished for here
> - a more compact syntax, and a performance improvement. And taken
> separately, I agree with both of those desires. PL/pgsql is an
> incredibly clunky language syntactically, and it's also slow. A patch
> that improves either one of those things has value, whether or not it
> also does the other one.

I understand the desire for nicer syntax, in the abstract. I'm just
unimpressed by this particular change, mainly because I'm afraid that
it will make syntax-error behaviors worse and foreclose future options
for other changes to FOR. If it were necessary to change the syntax
to get the performance benefit, I might think that on balance we should
do so; but it isn't.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 17:15:53
Message-ID: AANLkTike8r8mm2m9ny7CyU=_o2aZSVnGny63h0fwFQxS@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

2010/11/24 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Right, that was my impression, too.  But, I think this may be partly a
>> case of people talking past each other.  My impression of this
>> conversation was a repetition of this sequence:
>
>> A: This syntax is bad.
>> B: But it's way faster!
>
>> ...which makes no sense.  However, what I now think is going on here
>> is that there are really two separate things that are wished for here
>> - a more compact syntax, and a performance improvement.  And taken
>> separately, I agree with both of those desires.  PL/pgsql is an
>> incredibly clunky language syntactically, and it's also slow.  A patch
>> that improves either one of those things has value, whether or not it
>> also does the other one.
>
> I understand the desire for nicer syntax, in the abstract.  I'm just
> unimpressed by this particular change, mainly because I'm afraid that
> it will make syntax-error behaviors worse and foreclose future options
> for other changes to FOR.  If it were necessary to change the syntax
> to get the performance benefit, I might think that on balance we should
> do so; but it isn't.
>

I am for any readable syntax. It must not be FOR-IN-ARRAY. I
understand to problem with syntax-error checking. But I am not sure if
some >>different<< loop with control variable can be less ugliness in
language. Cannot we rewrite a parsing "for-clause" be more robust? I
agree with you, so there can be a other request in future. And if I
remember well, there was only few changes in other statements (on
parser level) and significant changes in FOR.

probably some hypothetical statement should be (my opinion)

FOR var [, vars]
<<UNKNOWN SYNTAX>>
LOOP
...
END LOOP;

PL/SQL uses a enhanced FOR

FOR var IN collection.first .. collection.last
LOOP
END LOOP;

From my view a introduction of new keyword should be a higher risk so
I don't would to do.

Regards

Pavel Stehule

>                        regards, tom lane
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: final patch - plpgsql: for-in-array
Date: 2010-11-24 18:07:03
Message-ID: AANLkTin8YoExML_98e0UHHArxrVjksXyA-s_eVX92GP0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 24, 2010 at 10:33 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Right, that was my impression, too.  But, I think this may be partly a
>> case of people talking past each other.  My impression of this
>> conversation was a repetition of this sequence:
>
>> A: This syntax is bad.
>> B: But it's way faster!
>
>> ...which makes no sense.  However, what I now think is going on here
>> is that there are really two separate things that are wished for here
>> - a more compact syntax, and a performance improvement.  And taken
>> separately, I agree with both of those desires.  PL/pgsql is an
>> incredibly clunky language syntactically, and it's also slow.  A patch
>> that improves either one of those things has value, whether or not it
>> also does the other one.
>
> I understand the desire for nicer syntax, in the abstract.  I'm just
> unimpressed by this particular change, mainly because I'm afraid that
> it will make syntax-error behaviors worse and foreclose future options
> for other changes to FOR.  If it were necessary to change the syntax
> to get the performance benefit, I might think that on balance we should
> do so; but it isn't.

It'd be nicer syntax if there were some way to have the keyword not
adjacent to the arbitrary expression.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company