Re: proposal: searching in array function - array_position

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: proposal: searching in array function - array_position
Date: 2015-01-16 09:39:03
Message-ID: CAFj8pRBYDvz-qFfSwGZqQoYLBXWapeOXpyh-d=bHZB575zHGmQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

I am proposing a simple function, that returns a position of element in
array.

FUNCTION array_position(anyarray, anyelement) RETURNS int

Implementation is simple (plpgsql code)

CREATE OR REPLACE FUNCTION array_position(anyarray, anyelement)
RETURNS int AS $$
DECLARE i int := 0;
BEGIN
FOREACH a IN ARRAY $1
LOOP
IF a = $1 THEN
RETURN i;
END IF;
i := i + 1;
END LOOP;
RETURN NULL;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;

A possible benefits:

1. speed in plpgsql applications
2. reduced length of SQL functions

Ideas, comments, notices?

Regards

Pavel


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-16 16:57:07
Message-ID: 54B942E3.9020101@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/16/15 3:39 AM, Pavel Stehule wrote:
> I am proposing a simple function, that returns a position of element in array.

Yes please!

> FUNCTION array_position(anyarray, anyelement) RETURNS int

That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice.

This wouldn't be so bad if we had an easier way to extract subsets of an array, but even that is really ugly because whatever you extract keeps the original number of dimensions.

> Implementation is simple (plpgsql code)

This would actually be written in C though, yes? Otherwise it's not really any better than just doing an extension...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-16 17:16:44
Message-ID: CAFj8pRDZQxpNdLAdGDbBzZgX6H5C+n23hkMiNE=osLC9cpV+JA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-16 17:57 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/16/15 3:39 AM, Pavel Stehule wrote:
>
>> I am proposing a simple function, that returns a position of element in
>> array.
>>
>
> Yes please!
>
> FUNCTION array_position(anyarray, anyelement) RETURNS int
>>
>
> That won't work on a multi-dimensional array. Ideally it needs to accept a
> slice or an element and return the specifier for the slice.
>

It is question, what is a result - probably, there can be a
multidimensional variant, where result will be a array

array_position([1,2,3],2) --> 2
array_position([[1,2],[2,3],[3,4]], [2,3]) --> 2 /* 2nd parameter should to
have N-1 dimension of first parameter */
array_position_md([1,2,3],2) --> [2]
array_position_md([[1,2],[2,3],[3,4]], 2) --> [2,1]

another question is how to solve more than one occurrence on one value -
probably two sets of functions - first returns first occurrence of value,
second returns set of occurrence

>
> This wouldn't be so bad if we had an easier way to extract subsets of an
> array, but even that is really ugly because whatever you extract keeps the
> original number of dimensions.
>
> Implementation is simple (plpgsql code)
>>
>
> This would actually be written in C though, yes? Otherwise it's not really
> any better than just doing an extension...
>

Sure, I expect a C implementation

Pavel

> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-16 17:37:14
Message-ID: 54B94C4A.3050607@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/16/15 11:16 AM, Pavel Stehule wrote:
>
>
> 2015-01-16 17:57 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>>:
>
> On 1/16/15 3:39 AM, Pavel Stehule wrote:
>
> I am proposing a simple function, that returns a position of element in array.
>
>
> Yes please!
>
> FUNCTION array_position(anyarray, anyelement) RETURNS int
>
>
> That won't work on a multi-dimensional array. Ideally it needs to accept a slice or an element and return the specifier for the slice.
>
>
> It is question, what is a result - probably, there can be a multidimensional variant, where result will be a array
>
> array_position([1,2,3],2) --> 2
> array_position([[1,2],[2,3],[3,4]], [2,3]) --> 2 /* 2nd parameter should to have N-1 dimension of first parameter */

The problem with that is you can't actually use '2' to get [2,3] back:

select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL;
?column?
----------
t
(1 row)

I think the bigger problem here is we need something better than slices for handling subsets of arrays. Even if the function returned [2:2] it's still going to behave differently than it will in the non-array case because you won't be getting the expected number of dimensions back. :(

> array_position_md([1,2,3],2) --> [2]
> array_position_md([[1,2],[2,3],[3,4]], 2) --> [2,1]
>
> another question is how to solve more than one occurrence on one value - probably two sets of functions - first returns first occurrence of value, second returns set of occurrence

Gee, if only way had some way to return multiple elements of something... ;P

In other words, I think all of these should actually return an array of positions. I think it's OK for someone that only cares about the first instance to just do [1].
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-16 18:12:29
Message-ID: CAFj8pRBiHmLue6ApX2JzicyOAVCoi4H+DU8VGfVFj5Uj0rGmFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-16 18:37 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/16/15 11:16 AM, Pavel Stehule wrote:
>
>>
>>
>> 2015-01-16 17:57 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com <mailto:
>> Jim(dot)Nasby(at)bluetreble(dot)com>>:
>>
>> On 1/16/15 3:39 AM, Pavel Stehule wrote:
>>
>> I am proposing a simple function, that returns a position of
>> element in array.
>>
>>
>> Yes please!
>>
>> FUNCTION array_position(anyarray, anyelement) RETURNS int
>>
>>
>> That won't work on a multi-dimensional array. Ideally it needs to
>> accept a slice or an element and return the specifier for the slice.
>>
>>
>> It is question, what is a result - probably, there can be a
>> multidimensional variant, where result will be a array
>>
>> array_position([1,2,3],2) --> 2
>> array_position([[1,2],[2,3],[3,4]], [2,3]) --> 2 /* 2nd parameter should
>> to have N-1 dimension of first parameter */
>>
>
> The problem with that is you can't actually use '2' to get [2,3] back:
>
> select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL;
> ?column?
> ----------
> t
> (1 row)
>

yes, but when you are searching a array in array you can use a full slice
selection:

postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a
constant every time in this case -- so it should not be returned
array
---------
{{1,2}}
(1 row)

>
> I think the bigger problem here is we need something better than slices
> for handling subsets of arrays. Even if the function returned [2:2] it's
> still going to behave differently than it will in the non-array case
> because you won't be getting the expected number of dimensions back. :(
>

you cannot to return a slice and I don't propose it, although we can return
a range type or array of range type - but still we cannot to use range for
a arrays.

>
> array_position_md([1,2,3],2) --> [2]
>> array_position_md([[1,2],[2,3],[3,4]], 2) --> [2,1]
>>
>> another question is how to solve more than one occurrence on one value -
>> probably two sets of functions - first returns first occurrence of value,
>> second returns set of occurrence
>>
>
> Gee, if only way had some way to return multiple elements of something...
> ;P
>
> In other words, I think all of these should actually return an array of
> positions. I think it's OK for someone that only cares about the first
> instance to just do [1].

there can be two functions - "position" - returns first and "positions"
returns all as a array

>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-17 18:56:54
Message-ID: CAFj8pRAXwxo_=HxSSww-DaADfThWb1NKkr4bnpVv-HHwJw7agw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-16 17:57 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/16/15 3:39 AM, Pavel Stehule wrote:
>
>> I am proposing a simple function, that returns a position of element in
>> array.
>>
>
> Yes please!
>
> FUNCTION array_position(anyarray, anyelement) RETURNS int
>>
>
> That won't work on a multi-dimensional array. Ideally it needs to accept a
> slice or an element and return the specifier for the slice.
>

theoretically you can use this function for md arrays too. This function
returns offset, and you can calculate a Nd possition

so maybe better name -- array_offset or some similar

Regards

Pavel

>
> This wouldn't be so bad if we had an easier way to extract subsets of an
> array, but even that is really ugly because whatever you extract keeps the
> original number of dimensions.
>
> Implementation is simple (plpgsql code)
>>
>
> This would actually be written in C though, yes? Otherwise it's not really
> any better than just doing an extension...
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-17 22:43:10
Message-ID: CAFj8pRBi6YRMyOSbftCoJ+ELsENF+S_NEO9_zhtqLZ1EyXQmnw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

here is a proof concept of array_offset function

possible question:

* used comparation "=" or "IS NOT DISTINCT FROM"

In this initial proof concept I used "IS NOT DISTINCT FROM" operator - but
my opinion is not strong in this question. Both has some advantages and
disadvantages.

Regards

Pavel

2015-01-16 19:12 GMT+01:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:

>
>
> 2015-01-16 18:37 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:
>
>> On 1/16/15 11:16 AM, Pavel Stehule wrote:
>>
>>>
>>>
>>> 2015-01-16 17:57 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com <mailto:
>>> Jim(dot)Nasby(at)bluetreble(dot)com>>:
>>>
>>> On 1/16/15 3:39 AM, Pavel Stehule wrote:
>>>
>>> I am proposing a simple function, that returns a position of
>>> element in array.
>>>
>>>
>>> Yes please!
>>>
>>> FUNCTION array_position(anyarray, anyelement) RETURNS int
>>>
>>>
>>> That won't work on a multi-dimensional array. Ideally it needs to
>>> accept a slice or an element and return the specifier for the slice.
>>>
>>>
>>> It is question, what is a result - probably, there can be a
>>> multidimensional variant, where result will be a array
>>>
>>> array_position([1,2,3],2) --> 2
>>> array_position([[1,2],[2,3],[3,4]], [2,3]) --> 2 /* 2nd parameter
>>> should to have N-1 dimension of first parameter */
>>>
>>
>> The problem with that is you can't actually use '2' to get [2,3] back:
>>
>> select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL;
>> ?column?
>> ----------
>> t
>> (1 row)
>>
>
> yes, but when you are searching a array in array you can use a full slice
> selection:
>
> postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a
> constant every time in this case -- so it should not be returned
> array
> ---------
> {{1,2}}
> (1 row)
>
>
>
>
>>
>> I think the bigger problem here is we need something better than slices
>> for handling subsets of arrays. Even if the function returned [2:2] it's
>> still going to behave differently than it will in the non-array case
>> because you won't be getting the expected number of dimensions back. :(
>>
>
> you cannot to return a slice and I don't propose it, although we can
> return a range type or array of range type - but still we cannot to use
> range for a arrays.
>
>>
>> array_position_md([1,2,3],2) --> [2]
>>> array_position_md([[1,2],[2,3],[3,4]], 2) --> [2,1]
>>>
>>> another question is how to solve more than one occurrence on one value -
>>> probably two sets of functions - first returns first occurrence of value,
>>> second returns set of occurrence
>>>
>>
>> Gee, if only way had some way to return multiple elements of something...
>> ;P
>>
>> In other words, I think all of these should actually return an array of
>> positions. I think it's OK for someone that only cares about the first
>> instance to just do [1].
>
>
> there can be two functions - "position" - returns first and "positions"
> returns all as a array
>
>
>>
>> --
>> Jim Nasby, Data Architect, Blue Treble Consulting
>> Data in Trouble? Get it in Treble! http://BlueTreble.com
>>
>
>

Attachment Content-Type Size
array_offset-01.patch text/x-patch 11.6 KB

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-20 17:12:32
Message-ID: CAFj8pRCC8tawi8STC4FAfnPMYEpiB1hi_T0q4vKPEtFqkqkakg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

I am sending updated version - it allow third optional argument that
specify where searching should to start. With it is possible repeatably
call this function.

Regards

Pavel

2015-01-17 23:43 GMT+01:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:

> Hi
>
> here is a proof concept of array_offset function
>
> possible question:
>
> * used comparation "=" or "IS NOT DISTINCT FROM"
>
> In this initial proof concept I used "IS NOT DISTINCT FROM" operator - but
> my opinion is not strong in this question. Both has some advantages and
> disadvantages.
>
> Regards
>
> Pavel
>
>
> 2015-01-16 19:12 GMT+01:00 Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>:
>
>>
>>
>> 2015-01-16 18:37 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:
>>
>>> On 1/16/15 11:16 AM, Pavel Stehule wrote:
>>>
>>>>
>>>>
>>>> 2015-01-16 17:57 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com <mailto:
>>>> Jim(dot)Nasby(at)bluetreble(dot)com>>:
>>>>
>>>> On 1/16/15 3:39 AM, Pavel Stehule wrote:
>>>>
>>>> I am proposing a simple function, that returns a position of
>>>> element in array.
>>>>
>>>>
>>>> Yes please!
>>>>
>>>> FUNCTION array_position(anyarray, anyelement) RETURNS int
>>>>
>>>>
>>>> That won't work on a multi-dimensional array. Ideally it needs to
>>>> accept a slice or an element and return the specifier for the slice.
>>>>
>>>>
>>>> It is question, what is a result - probably, there can be a
>>>> multidimensional variant, where result will be a array
>>>>
>>>> array_position([1,2,3],2) --> 2
>>>> array_position([[1,2],[2,3],[3,4]], [2,3]) --> 2 /* 2nd parameter
>>>> should to have N-1 dimension of first parameter */
>>>>
>>>
>>> The problem with that is you can't actually use '2' to get [2,3] back:
>>>
>>> select (array[[1,2,3],[4,5,6],[7,8,9]])[1] IS NULL;
>>> ?column?
>>> ----------
>>> t
>>> (1 row)
>>>
>>
>> yes, but when you are searching a array in array you can use a full slice
>> selection:
>>
>> postgres=# select (ARRAY[[1,2],[4,5]])[1][1:2]; -- [1:2] should be a
>> constant every time in this case -- so it should not be returned
>> array
>> ---------
>> {{1,2}}
>> (1 row)
>>
>>
>>
>>
>>>
>>> I think the bigger problem here is we need something better than slices
>>> for handling subsets of arrays. Even if the function returned [2:2] it's
>>> still going to behave differently than it will in the non-array case
>>> because you won't be getting the expected number of dimensions back. :(
>>>
>>
>> you cannot to return a slice and I don't propose it, although we can
>> return a range type or array of range type - but still we cannot to use
>> range for a arrays.
>>
>>>
>>> array_position_md([1,2,3],2) --> [2]
>>>> array_position_md([[1,2],[2,3],[3,4]], 2) --> [2,1]
>>>>
>>>> another question is how to solve more than one occurrence on one value
>>>> - probably two sets of functions - first returns first occurrence of value,
>>>> second returns set of occurrence
>>>>
>>>
>>> Gee, if only way had some way to return multiple elements of
>>> something... ;P
>>>
>>> In other words, I think all of these should actually return an array of
>>> positions. I think it's OK for someone that only cares about the first
>>> instance to just do [1].
>>
>>
>> there can be two functions - "position" - returns first and "positions"
>> returns all as a array
>>
>>
>>>
>>> --
>>> Jim Nasby, Data Architect, Blue Treble Consulting
>>> Data in Trouble? Get it in Treble! http://BlueTreble.com
>>>
>>
>>
>

Attachment Content-Type Size
array_offset-02.patch text/x-patch 14.3 KB

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-20 20:32:54
Message-ID: 54BEBB76.6090708@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/20/15 11:12 AM, Pavel Stehule wrote:
> I am sending updated version - it allow third optional argument that specify where searching should to start. With it is possible repeatably call this function.

What happened to returning an array of offsets? I think that would be both easier to use than this version as well as performing better.

I see you dropped multi-dimension support, but I think that's fine.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-21 06:31:12
Message-ID: CAFj8pRAt-q8ibcPwDQLhAE8-MPL-HtESacBL-EMzhSNcM0LF2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-20 21:32 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/20/15 11:12 AM, Pavel Stehule wrote:
>
>> I am sending updated version - it allow third optional argument that
>> specify where searching should to start. With it is possible repeatably
>> call this function.
>>
>
> What happened to returning an array of offsets? I think that would be both
> easier to use than this version as well as performing better.
>

I have still thinking about this idea. It needs a different function and I
didn't start with this.

Implementation a optional start parameter to array_offset is cheap - and I
am thinking so it can be useful for some use cases.

>
> I see you dropped multi-dimension support, but I think that's fine.

It can be implemented later. There is no any barriers to later
implementation.

> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-24 08:48:39
Message-ID: CAFj8pRALw26BR2dmLBpU-VKRprfQ6d9x8QkyXejD41DtSTdz2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

with array_offsets - returns a array of offsets

Regards

Pavel

2015-01-20 21:32 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/20/15 11:12 AM, Pavel Stehule wrote:
>
>> I am sending updated version - it allow third optional argument that
>> specify where searching should to start. With it is possible repeatably
>> call this function.
>>
>
> What happened to returning an array of offsets? I think that would be both
> easier to use than this version as well as performing better.
>
> I see you dropped multi-dimension support, but I think that's fine.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>

Attachment Content-Type Size
array_offset-03.patch text/x-patch 19.5 KB

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-26 22:01:52
Message-ID: 54C6B950.6020500@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/24/15 2:48 AM, Pavel Stehule wrote:
> with array_offsets - returns a array of offsets

+ <entry>returns a offset of first occurrence of some element in a array. It uses
should be
+ <entry>returns the offset of the first occurrence of some element in an array. It uses

+ <entry>returns a array of offset of all occurrences some element in a array. It uses
should be
+ <entry>returns an array of the offsets of all occurrences of some element in an array. It uses

Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code?

You should remove the array_length() from the last array_offsets test; I don't see that it buys anything.

I think there should be tests for what happens when you feed these functions a multi-dimensional array.

Other than that, looks good.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-26 22:17:48
Message-ID: CAFj8pRAY6SVOhbaeYoHZUAA73k0Ar2oM=zq9kRDZ7=_guAUYiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-26 23:01 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/24/15 2:48 AM, Pavel Stehule wrote:
>
>> with array_offsets - returns a array of offsets
>>
>
> + <entry>returns a offset of first occurrence of some element in a
> array. It uses
> should be
> + <entry>returns the offset of the first occurrence of some
> element in an array. It uses
>
> + <entry>returns a array of offset of all occurrences some element
> in a array. It uses
> should be
> + <entry>returns an array of the offsets of all occurrences of
> some element in an array. It uses
>
> Any way to reduce the code duplication between the array and non-array
> versions? Maybe factor out the operator caching code?
>
>
I though about it - but there is different checks, different result
processing, different result type.

I didn't find any readable and reduced form :(

> You should remove the array_length() from the last array_offsets test; I
> don't see that it buys anything.
>

ok

>
> I think there should be tests for what happens when you feed these
> functions a multi-dimensional array.
>

I can do it. Result should be expected - it searching row by row due MD
format

>
> Other than that, looks good.

Thank you

Pavel

>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-26 22:29:19
Message-ID: 54C6BFBF.8080107@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/26/15 4:17 PM, Pavel Stehule wrote:
> Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code?
>
>
> I though about it - but there is different checks, different result processing, different result type.
>
> I didn't find any readable and reduced form :(

Yeah, that's why I was thinking specifically of the operator caching code... isn't that identical? That would at least remove a dozen lines...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-27 10:36:49
Message-ID: CAFj8pRCq59NtOhDnX3wj5WdvC8UcKVZkA-CEcDF1ZK=AVVxUfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-26 23:29 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/26/15 4:17 PM, Pavel Stehule wrote:
>
>> Any way to reduce the code duplication between the array and
>> non-array versions? Maybe factor out the operator caching code?
>>
>>
>> I though about it - but there is different checks, different result
>> processing, different result type.
>>
>> I didn't find any readable and reduced form :(
>>
>
> Yeah, that's why I was thinking specifically of the operator caching
> code... isn't that identical? That would at least remove a dozen lines...

It is only partially identical - I would to use cache for array_offset, but
it is not necessary for array_offsets .. depends how we would to modify
current API to support externally cached data.

Regards

Pavel

>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-27 23:01:21
Message-ID: 54C818C1.1010508@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/27/15 4:36 AM, Pavel Stehule wrote:
> 2015-01-26 23:29 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>>:
>
> On 1/26/15 4:17 PM, Pavel Stehule wrote:
>
> Any way to reduce the code duplication between the array and non-array versions? Maybe factor out the operator caching code?
>
>
> I though about it - but there is different checks, different result processing, different result type.
>
> I didn't find any readable and reduced form :(
>
>
> Yeah, that's why I was thinking specifically of the operator caching code... isn't that identical? That would at least remove a dozen lines...
>
>
> It is only partially identical - I would to use cache for array_offset, but it is not necessary for array_offsets .. depends how we would to modify current API to support externally cached data.

Externally cached data?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-01-28 07:15:12
Message-ID: CAFj8pRC63RFJ-UhjrwkHekwtbVQ43A7FEQWMUCXoQAM=Vk0teQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-01-28 0:01 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 1/27/15 4:36 AM, Pavel Stehule wrote:
>
>> 2015-01-26 23:29 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com <mailto:
>> Jim(dot)Nasby(at)bluetreble(dot)com>>:
>>
>> On 1/26/15 4:17 PM, Pavel Stehule wrote:
>>
>> Any way to reduce the code duplication between the array and
>> non-array versions? Maybe factor out the operator caching code?
>>
>>
>> I though about it - but there is different checks, different
>> result processing, different result type.
>>
>> I didn't find any readable and reduced form :(
>>
>>
>> Yeah, that's why I was thinking specifically of the operator caching
>> code... isn't that identical? That would at least remove a dozen lines...
>>
>>
>> It is only partially identical - I would to use cache for array_offset,
>> but it is not necessary for array_offsets .. depends how we would to modify
>> current API to support externally cached data.
>>
>
> Externally cached data?

Some from these functions has own caches for minimize access to typcache
(array_map, array_cmp is example). And in first case, I am trying to push
these information from fn_extra, in second case I don't do it, because I
don't expect a repeated call (and I am expecting so type cache will be
enough).

I plan to do some benchmark to calculate if we have to do, or we can use
type cache only.

Pavel

>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-02-22 02:00:54
Message-ID: 54E93856.6070701@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 28/01/15 08:15, Pavel Stehule wrote:
>
>
> 2015-01-28 0:01 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com
> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>>:
>
> On 1/27/15 4:36 AM, Pavel Stehule wrote:
>
>
> It is only partially identical - I would to use cache for
> array_offset, but it is not necessary for array_offsets ..
> depends how we would to modify current API to support externally
> cached data.
>
>
> Externally cached data?
>
>
> Some from these functions has own caches for minimize access to typcache
> (array_map, array_cmp is example). And in first case, I am trying to
> push these information from fn_extra, in second case I don't do it,
> because I don't expect a repeated call (and I am expecting so type cache
> will be enough).
>

You actually do caching via fn_extra in both case and I think that's the
correct way, and yes that part can be moved common function.

I also see that the documentation does not say what is returned by
array_offset if nothing is found (it's documented in code but not in sgml).

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-02-22 11:19:39
Message-ID: CAFj8pRBy5fx9tjz0XWSy5EK3S+LpWs2KF+pe4qd8X7Vx+EmyLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-02-22 3:00 GMT+01:00 Petr Jelinek <petr(at)2ndquadrant(dot)com>:

> On 28/01/15 08:15, Pavel Stehule wrote:
>
>>
>>
>> 2015-01-28 0:01 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com
>> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>>:
>>
>> On 1/27/15 4:36 AM, Pavel Stehule wrote:
>>
>>
>> It is only partially identical - I would to use cache for
>> array_offset, but it is not necessary for array_offsets ..
>> depends how we would to modify current API to support externally
>> cached data.
>>
>>
>> Externally cached data?
>>
>>
>> Some from these functions has own caches for minimize access to typcache
>> (array_map, array_cmp is example). And in first case, I am trying to
>> push these information from fn_extra, in second case I don't do it,
>> because I don't expect a repeated call (and I am expecting so type cache
>> will be enough).
>>
>>
> You actually do caching via fn_extra in both case and I think that's the
> correct way, and yes that part can be moved common function.
>
> I also see that the documentation does not say what is returned by
> array_offset if nothing is found (it's documented in code but not in sgml).
>

rebased + fixed docs

Regards

Pavel

>
> --
> Petr Jelinek http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

Attachment Content-Type Size
array_offset-04.patch text/x-patch 19.5 KB

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-07 18:06:42
Message-ID: 54FB3E32.7030207@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22/02/15 12:19, Pavel Stehule wrote:
>
>
> 2015-02-22 3:00 GMT+01:00 Petr Jelinek <petr(at)2ndquadrant(dot)com
> <mailto:petr(at)2ndquadrant(dot)com>>:
>
> On 28/01/15 08:15, Pavel Stehule wrote:
>
>
>
> 2015-01-28 0:01 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com
> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>>>:
>
> On 1/27/15 4:36 AM, Pavel Stehule wrote:
>
>
> It is only partially identical - I would to use cache for
> array_offset, but it is not necessary for array_offsets ..
> depends how we would to modify current API to support
> externally
> cached data.
>
>
> Externally cached data?
>
>
> Some from these functions has own caches for minimize access to
> typcache
> (array_map, array_cmp is example). And in first case, I am trying to
> push these information from fn_extra, in second case I don't do it,
> because I don't expect a repeated call (and I am expecting so
> type cache
> will be enough).
>
>
> You actually do caching via fn_extra in both case and I think that's
> the correct way, and yes that part can be moved common function.
>
> I also see that the documentation does not say what is returned by
> array_offset if nothing is found (it's documented in code but not in
> sgml).
>
>
> rebased + fixed docs
>

You still duplicate the type cache code, but many other array functions
do that too so I will not hold that against you. (Maybe somebody should
write separate patch that would put all that duplicate code into common
function?)

I think this patch is ready for committer so I am going to mark it as such.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 14:30:30
Message-ID: CA+TgmoZ4rUYbKoeubYPwSmzm4BWZ3mP4MNZcqJPBDa_kJxjNDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
> You still duplicate the type cache code, but many other array functions do
> that too so I will not hold that against you. (Maybe somebody should write
> separate patch that would put all that duplicate code into common function?)
>
> I think this patch is ready for committer so I am going to mark it as such.

The documentation in this patch needs some improvements to the
English. Can anyone help with that?

The documentation should discuss what happens if the array is multi-dimensional.

The code for array_offset and for array_offset_start appear to be
byte-for-byte identical. There's no comment explaining why, but I bet
it's to make the opr_sanity test pass. How about adding a comment?

The comment for array_offset_common refers to array_offset_start as
array_offset_startpos.

I am sure in agreement with the idea that it would be good to factor
out the common typecache code (for setting up my_extra). Any chance
we get a preliminary patch that does that refactoring, and then rebase
the main patch on top of it? I agree that it's not really this
patch's job to solve that problem, but it would be nice.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 15:53:49
Message-ID: 54FF138D.4060702@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/10/15 9:30 AM, Robert Haas wrote:
> On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>> You still duplicate the type cache code, but many other array functions do
>> that too so I will not hold that against you. (Maybe somebody should write
>> separate patch that would put all that duplicate code into common function?)
>>
>> I think this patch is ready for committer so I am going to mark it as such.
>
> The documentation in this patch needs some improvements to the
> English. Can anyone help with that?

I'll take a look at it.

> The documentation should discuss what happens if the array is multi-dimensional.
>
> The code for array_offset and for array_offset_start appear to be
> byte-for-byte identical. There's no comment explaining why, but I bet
> it's to make the opr_sanity test pass. How about adding a comment?
>
> The comment for array_offset_common refers to array_offset_start as
> array_offset_startpos.
>
> I am sure in agreement with the idea that it would be good to factor
> out the common typecache code (for setting up my_extra). Any chance
> we get a preliminary patch that does that refactoring, and then rebase
> the main patch on top of it? I agree that it's not really this
> patch's job to solve that problem, but it would be nice.

Since this patch is here and ready to go I would prefer that we commit
it and refactor later. I can tackle that unless Pavel specifically wants to.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 16:07:29
Message-ID: CAFj8pRCn=mCNkdyhAiSEswjhrcxgX2b-5RgBA084kcLPwHEEow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-10 16:53 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 3/10/15 9:30 AM, Robert Haas wrote:
>
>> On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com>
>> wrote:
>>
>>> You still duplicate the type cache code, but many other array functions
>>> do
>>> that too so I will not hold that against you. (Maybe somebody should
>>> write
>>> separate patch that would put all that duplicate code into common
>>> function?)
>>>
>>> I think this patch is ready for committer so I am going to mark it as
>>> such.
>>>
>>
>> The documentation in this patch needs some improvements to the
>> English. Can anyone help with that?
>>
>
> I'll take a look at it.
>
> The documentation should discuss what happens if the array is
>> multi-dimensional.
>>
>> The code for array_offset and for array_offset_start appear to be
>> byte-for-byte identical. There's no comment explaining why, but I bet
>> it's to make the opr_sanity test pass. How about adding a comment?
>>
>> The comment for array_offset_common refers to array_offset_start as
>> array_offset_startpos.
>>
>> I am sure in agreement with the idea that it would be good to factor
>> out the common typecache code (for setting up my_extra). Any chance
>> we get a preliminary patch that does that refactoring, and then rebase
>> the main patch on top of it? I agree that it's not really this
>> patch's job to solve that problem, but it would be nice.
>>
>
> Since this patch is here and ready to go I would prefer that we commit it
> and refactor later. I can tackle that unless Pavel specifically wants to.

I'll look on this part this evening - but I don't have any idea how to find
some common pattern yet. So I am with Jim - we can do it later.

Regards

Pavel

>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 18:27:16
Message-ID: CAFj8pRATG-0fQoC7NxDqVws8ak1R0VL7vg51JAYTn7rAMm1xPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-10 15:30 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:

> On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
> > You still duplicate the type cache code, but many other array functions
> do
> > that too so I will not hold that against you. (Maybe somebody should
> write
> > separate patch that would put all that duplicate code into common
> function?)
> >
> > I think this patch is ready for committer so I am going to mark it as
> such.
>
> The documentation in this patch needs some improvements to the
> English. Can anyone help with that?
>
> The documentation should discuss what happens if the array is
> multi-dimensional.
>
> The code for array_offset and for array_offset_start appear to be
> byte-for-byte identical. There's no comment explaining why, but I bet
> it's to make the opr_sanity test pass. How about adding a comment?
>
> The comment for array_offset_common refers to array_offset_start as
> array_offset_startpos.
>

yes, it is a reason. I'll comment it.

>
> I am sure in agreement with the idea that it would be good to factor
> out the common typecache code (for setting up my_extra). Any chance
> we get a preliminary patch that does that refactoring, and then rebase
> the main patch on top of it? I agree that it's not really this
> patch's job to solve that problem, but it would be nice.
>

The common part is following code:

my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
if (my_extra == NULL)
{
fcinfo->flinfo->fn_extra =
MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,

sizeof(ArrayMetaState));
my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
my_extra->element_type = ~element_type;
}

and

if (my_extra->element_type != element_type)
{
get_typlenbyvalalign(element_type,
&my_extra->typlen,
&my_extra->typbyval,
&my_extra->typalign);

my_extra->element_type = element_type;
}

so we can design function like

(ArrayMetaState *)
GetCachedArrayMetaState(FunctionCallInfo fcinfo, Oid element_type, bool
*reused)
{
ArrayMetaState *state = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
if (state == NULL)
{
fcinfo->flinfo->fn_extra =
MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,

sizeof(ArrayMetaState));
state = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
state->element_type = ~element_type;
}
if (state->element_type != element_type)
{
get_typlenbyvalalign(element_type,
&my_extra->typlen,
&my_extra->typbyval,
&my_extra->typalign);

state->element_type = element_type;
*resused = false;
}
else
*reused = true;
}

Usage in code:

array_state = GetCachedArrayMetaState(fceinfo, element_type, &reused);
if (!reused)
{
// some other initialization
}

The content is relatively clean, but the most big problem is naming (as
usual)

Comments?

Regards

Pavel

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 18:50:19
Message-ID: 835.1426013419@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> 2015-03-10 15:30 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> I am sure in agreement with the idea that it would be good to factor
>> out the common typecache code (for setting up my_extra). Any chance
>> we get a preliminary patch that does that refactoring, and then rebase
>> the main patch on top of it? I agree that it's not really this
>> patch's job to solve that problem, but it would be nice.

> The common part is following code:

There is not all that much commonality; many functions don't bother to
populate all of the ArrayMetaState fields because they don't need all of
them. (The ones you quote don't, in fact.) You are either going to end
up with a subroutine that does extra syscache lookups to populate the
whole struct every time, or a confusing collection of ad-hoc subroutines.
I'm not convinced that there's a lot of mileage to be gained here.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 19:43:09
Message-ID: CAFj8pRCJc7YCLhidtgG0+-2+Zab5e3roP2cDRO+hYZohypj08A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-10 19:50 GMT+01:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> > 2015-03-10 15:30 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:
> >> I am sure in agreement with the idea that it would be good to factor
> >> out the common typecache code (for setting up my_extra). Any chance
> >> we get a preliminary patch that does that refactoring, and then rebase
> >> the main patch on top of it? I agree that it's not really this
> >> patch's job to solve that problem, but it would be nice.
>
> > The common part is following code:
>
> There is not all that much commonality; many functions don't bother to
> populate all of the ArrayMetaState fields because they don't need all of
> them. (The ones you quote don't, in fact.) You are either going to end
> up with a subroutine that does extra syscache lookups to populate the
> whole struct every time, or a confusing collection of ad-hoc subroutines.
> I'm not convinced that there's a lot of mileage to be gained here.
>

I have not good feeling about it too. If we would to enhance this are, we
probably need a specific flinfo field and flags to specify more precious
the context of cached informations. my_extra should be reserved for generic
usage. But still there is relative big space for settings some less common
fields like "proc".

With extra field in flinfo we can have interface like

bool set_flinfo_type_cache(fcinfo, type, flags);
and usage fcinfo->flinfo->typecache->typlen, ..

I agree with Robert, this can be nice, but it needs more time for design :(

Regards

Pavel

>
> regards, tom lane
>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 19:45:15
Message-ID: CA+TgmoZisDDta5sKNRNLP-yAtOvq926g9Frqd=JVtzwAKYvWqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 10, 2015 at 3:43 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> I have not good feeling about it too. If we would to enhance this are, we
> probably need a specific flinfo field and flags to specify more precious the
> context of cached informations. my_extra should be reserved for generic
> usage. But still there is relative big space for settings some less common
> fields like "proc".
>
> With extra field in flinfo we can have interface like
>
> bool set_flinfo_type_cache(fcinfo, type, flags);
> and usage fcinfo->flinfo->typecache->typlen, ..
>
> I agree with Robert, this can be nice, but it needs more time for design :(

OK. If I'm in the minority, I'll desist.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 21:18:12
Message-ID: 54FF5F94.4070806@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/10/15 10:53 AM, Jim Nasby wrote:
> On 3/10/15 9:30 AM, Robert Haas wrote:
>> On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com>
>> wrote:
>>> You still duplicate the type cache code, but many other array
>>> functions do
>>> that too so I will not hold that against you. (Maybe somebody should
>>> write
>>> separate patch that would put all that duplicate code into common
>>> function?)
>>>
>>> I think this patch is ready for committer so I am going to mark it as
>>> such.
>>
>> The documentation in this patch needs some improvements to the
>> English. Can anyone help with that?
>
> I'll take a look at it.
>
>> The documentation should discuss what happens if the array is
>> multi-dimensional.

This is the wording I have to describe what happens with a
multi-dimensional array. I'm not thrilled with it; anyone have better ideas?

Note: multi-dimensional arrays are squashed to one dimension before
searching.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 21:53:11
Message-ID: 54FF67C7.2050709@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2/22/15 5:19 AM, Pavel Stehule wrote:
>
>
> 2015-02-22 3:00 GMT+01:00 Petr Jelinek <petr(at)2ndquadrant(dot)com
> <mailto:petr(at)2ndquadrant(dot)com>>:
>
> On 28/01/15 08:15, Pavel Stehule wrote:
>
>
>
> 2015-01-28 0:01 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com
> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>
> <mailto:Jim(dot)Nasby(at)bluetreble(dot)__com
> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>>>:
>
> On 1/27/15 4:36 AM, Pavel Stehule wrote:
>
>
> It is only partially identical - I would to use cache for
> array_offset, but it is not necessary for array_offsets ..
> depends how we would to modify current API to support
> externally
> cached data.
>
>
> Externally cached data?
>
>
> Some from these functions has own caches for minimize access to
> typcache
> (array_map, array_cmp is example). And in first case, I am trying to
> push these information from fn_extra, in second case I don't do it,
> because I don't expect a repeated call (and I am expecting so
> type cache
> will be enough).
>
>
> You actually do caching via fn_extra in both case and I think that's
> the correct way, and yes that part can be moved common function.
>
> I also see that the documentation does not say what is returned by
> array_offset if nothing is found (it's documented in code but not in
> sgml).
>
>
> rebased + fixed docs

I don't think we need both array_offset and array_offset_start; can't
both SQL functions just call one C function?

It might be worth combining the array and non-array versions of this, by
having a _common function that accepts a boolean and then just run one
or the other of the while loops. Most of the code seems to be shared
between the two versions.

What is this comment supposed to mean? There is no 'width_array'...

* Biggest difference against width_array is unsorted input array.

I've attached my doc changes, both alone and with the code.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Attachment Content-Type Size
doc_and_code.patch text/plain 18.0 KB
doc_changes.patch text/plain 3.4 KB

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 22:20:20
Message-ID: 54FF6E24.2020808@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/10/15 2:43 PM, Pavel Stehule wrote:
>
> There is not all that much commonality; many functions don't bother to
> populate all of the ArrayMetaState fields because they don't need all of
> them. (The ones you quote don't, in fact.) You are either going to end
> up with a subroutine that does extra syscache lookups to populate the
> whole struct every time, or a confusing collection of ad-hoc
> subroutines.
> I'm not convinced that there's a lot of mileage to be gained here.
>
>
> I have not good feeling about it too. If we would to enhance this are,
> we probably need a specific flinfo field and flags to specify more
> precious the context of cached informations. my_extra should be reserved
> for generic usage. But still there is relative big space for settings
> some less common fields like "proc".
>
> With extra field in flinfo we can have interface like
>
> bool set_flinfo_type_cache(fcinfo, type, flags);
> and usage fcinfo->flinfo->typecache->typlen, ..

I'm not following what you intended there, but in any case I don't think
we'd need to change all that much, because there's only 3 cases:

1) Get only the base type info
2) Get base type info and equality operator
3) Get IO info for one IOFunc

Passing the function an enum (and perhaps keeping it in ArrayMetaState)
would be enough to distinguish between the 3 cases. You'd also need to
pass in IOFuncSelector.

That said, this pattern with fn_extra is repeated a lot, even just in
the backend (not counting contrib or extensions). It would be nice if
there was generic support for this.

decibel(at)decina:[17:15]~/pgsql/HEAD/src/backend (array_offset_v4
$)$pg_grep fn_extra|cut -d: -f1|uniq -c
14 access/gist/gistscan.c
7 executor/functions.c
1 executor/nodeWindowAgg.c
14 utils/adt/array_userfuncs.c
31 utils/adt/arrayfuncs.c
4 utils/adt/domains.c
2 utils/adt/enum.c
1 utils/adt/int.c
6 utils/adt/jsonfuncs.c
1 utils/adt/oid.c
4 utils/adt/orderedsetaggs.c
7 utils/adt/rangetypes.c
24 utils/adt/rowtypes.c
8 utils/adt/varlena.c
(utils/fmgr/* doesn't count)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-10 22:25:50
Message-ID: 6167.1426026350@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> writes:
> That said, this pattern with fn_extra is repeated a lot, even just in
> the backend (not counting contrib or extensions). It would be nice if
> there was generic support for this.

What do you mean by "generic support"? Most of those functions are doing
quite different things with fn_extra --- granted, nearly all of them are
caching something there, but I don't see very much that a "generic"
infrastructure could bring to the table.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-11 01:57:33
Message-ID: CA+TgmoZ3=8XzX8sC429XCaYvYcLw=f6z6yUogrfBbnK-VdV_hA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> I don't think we need both array_offset and array_offset_start; can't both
> SQL functions just call one C function?

Not if you want the opr_sanity tests to pass.

(But I'm seriously starting to wonder if that's actually a smart rule
for us to be enforcing. It seems to be something of a pain in the
neck, and I'm unclear as to whether it is preventing any real
problem.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-11 06:19:49
Message-ID: CAFj8pRBq-WmkpPbwXyM6i=zW_bk9TBfv+xKQgccv=YQSnpCyFg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-11 2:57 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com>:

> On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
> wrote:
> > I don't think we need both array_offset and array_offset_start; can't
> both
> > SQL functions just call one C function?
>
> Not if you want the opr_sanity tests to pass.
>
> (But I'm seriously starting to wonder if that's actually a smart rule
> for us to be enforcing. It seems to be something of a pain in the
> neck, and I'm unclear as to whether it is preventing any real
> problem.)
>

It is simple protection against some oversights. I am not against this
check - this rule cleans a interface between C and SQL. More, the
additional C code is usually very short and trivial.

But it should be commented well.

Pavel

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-11 06:29:51
Message-ID: CAFj8pRCkPWPxFytViWXOqHyayE=4k-AYma=Vdqe+PSPQsdrntw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-10 22:53 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 2/22/15 5:19 AM, Pavel Stehule wrote:
>
>>
>>
>> 2015-02-22 3:00 GMT+01:00 Petr Jelinek <petr(at)2ndquadrant(dot)com
>> <mailto:petr(at)2ndquadrant(dot)com>>:
>>
>> On 28/01/15 08:15, Pavel Stehule wrote:
>>
>>
>>
>> 2015-01-28 0:01 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com
>> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>
>> <mailto:Jim(dot)Nasby(at)bluetreble(dot)__com
>> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>>>:
>>
>> On 1/27/15 4:36 AM, Pavel Stehule wrote:
>>
>>
>> It is only partially identical - I would to use cache for
>> array_offset, but it is not necessary for array_offsets
>> ..
>> depends how we would to modify current API to support
>> externally
>> cached data.
>>
>>
>> Externally cached data?
>>
>>
>> Some from these functions has own caches for minimize access to
>> typcache
>> (array_map, array_cmp is example). And in first case, I am trying
>> to
>> push these information from fn_extra, in second case I don't do
>> it,
>> because I don't expect a repeated call (and I am expecting so
>> type cache
>> will be enough).
>>
>>
>> You actually do caching via fn_extra in both case and I think that's
>> the correct way, and yes that part can be moved common function.
>>
>> I also see that the documentation does not say what is returned by
>> array_offset if nothing is found (it's documented in code but not in
>> sgml).
>>
>>
>> rebased + fixed docs
>>
>
> I don't think we need both array_offset and array_offset_start; can't both
> SQL functions just call one C function?
>

There is a rule about unique mapping C functions to SQL space - and I don't
think so this rule is bad.

>
> It might be worth combining the array and non-array versions of this, by
> having a _common function that accepts a boolean and then just run one or
> the other of the while loops. Most of the code seems to be shared between
> the two versions.
>
> What is this comment supposed to mean? There is no 'width_array'...
>

It is typo (I am sorry) - should be width_bucket(, array)

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e80252d424278abf65b624669c8e6b3fe8587cac

The code is similar, but it expect large **sorted** input. array_offset
works on unsorted (alphabetical unsorted) data sets - like days of week ..

>
> * Biggest difference against width_array is unsorted input array.
>
> I've attached my doc changes, both alone and with the code.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-11 20:18:12
Message-ID: 5500A304.9000706@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/11/15 1:19 AM, Pavel Stehule wrote:
> 2015-03-11 2:57 GMT+01:00 Robert Haas <robertmhaas(at)gmail(dot)com
> <mailto:robertmhaas(at)gmail(dot)com>>:
>
> On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com
> <mailto:Jim(dot)Nasby(at)bluetreble(dot)com>> wrote:
> > I don't think we need both array_offset and array_offset_start; can't both
> > SQL functions just call one C function?
>
> Not if you want the opr_sanity tests to pass.
>
> (But I'm seriously starting to wonder if that's actually a smart rule
> for us to be enforcing. It seems to be something of a pain in the
> neck, and I'm unclear as to whether it is preventing any real
> problem.)
>
>
> It is simple protection against some oversights. I am not against this
> check - this rule cleans a interface between C and SQL. More, the
> additional C code is usually very short and trivial.
>
> But it should be commented well.

Ahh, ok, makes more sense now. If the separate C functions are serving a
purpose that's fine. I think the comment should mention it though, as
it's not exactly the most obvious thing.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-11 20:49:04
Message-ID: 5500AA40.3090805@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/10/15 5:25 PM, Tom Lane wrote:
> Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> writes:
>> That said, this pattern with fn_extra is repeated a lot, even just in
>> the backend (not counting contrib or extensions). It would be nice if
>> there was generic support for this.
>
> What do you mean by "generic support"? Most of those functions are doing
> quite different things with fn_extra --- granted, nearly all of them are
> caching something there, but I don't see very much that a "generic"
> infrastructure could bring to the table.

At a glance, almost all the use under src/backend is doing some
combination of 3 things: get_typlenbyvalalign(), get_type_io_data() or
getting some operator for a type. This is most notable for records,
arrays and ranges (though, records actually need an array of this info,
so maybe we're out of luck there). That pattern exists outside of
backend too; I think it's used in contrib, and I know at least one
extension does this.

So my thought was something that accepted fcinfo, IOFuncSelector, and
TypeCacheEntry.flags. If IOFuncSelector was set, use get_type_io_data;
else use get_typlenbyvalalign. If TypeCacheEntry.flags is set also look
up the operator.

Hmm... now that I look at it, the only use of get_type_io_data in
src/backend seems to be to support arrays. Ranges make use of it too,
but there's a comment in there that it returns more than what's needed.

Even if a generic support function doesn't make sense, there's a lot of
repeated fn_extra code for arrays and records. It would be good to at
least refactor that like what was done for rangetypes. That process
might provide a more obvious answer to how this could be done generically.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-11 21:14:20
Message-ID: 5500B02C.5070406@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/11/15 1:29 AM, Pavel Stehule wrote:
>
> What is this comment supposed to mean? There is no 'width_array'...
>
>
> It is typo (I am sorry) - should be width_bucket(, array)
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=e80252d424278abf65b624669c8e6b3fe8587cac
>
> The code is similar, but it expect large **sorted** input. array_offset
> works on unsorted (alphabetical unsorted) data sets - like days of week ..

The functions are serving rather different purposes, so I'm not sure
it's worth mentioning. If we do want to mention it, then something like
the following should be added to *both* functions:

* This code is similar to width_bucket()

and

* This code is similar to array_offset()

Incidentally, isn't it bad that we're doing all these static assignments
inside the loop in width_bucket? Or can we count on the compiler to
recognize this?

http://lnk.nu/github.com/1dvrr.c
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-11 21:37:30
Message-ID: CAFj8pRAyDyPQSb5ZdH0KzLB_e1xV6aQw3xXN1NCAdhN=J9gxxA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-11 22:14 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 3/11/15 1:29 AM, Pavel Stehule wrote:
>
>>
>> What is this comment supposed to mean? There is no 'width_array'...
>>
>>
>> It is typo (I am sorry) - should be width_bucket(, array)
>>
>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=
>> e80252d424278abf65b624669c8e6b3fe8587cac
>>
>> The code is similar, but it expect large **sorted** input. array_offset
>> works on unsorted (alphabetical unsorted) data sets - like days of week ..
>>
>
> The functions are serving rather different purposes, so I'm not sure it's
> worth mentioning. If we do want to mention it, then something like the
> following should be added to *both* functions:
>

ok, I removed this note.

I added comment about wrapping and I simplified a code there - this method
is used more time in pg for same purposes.

Merged Jim's changes in doc

Pavel

>
> * This code is similar to width_bucket()
>
> and
>
> * This code is similar to array_offset()
>
> Incidentally, isn't it bad that we're doing all these static assignments
> inside the loop in width_bucket? Or can we count on the compiler to
> recognize this?
>
> http://lnk.nu/github.com/1dvrr.c
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>

Attachment Content-Type Size
array_offset-05.patch text/x-patch 19.6 KB

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-11 21:50:09
Message-ID: 5500B891.8070706@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/11/15 4:37 PM, Pavel Stehule wrote:
+ /*
+ * array_offset - returns the offset of a value in an array
(array_offset and
+ * array_offset_start are wrappers for safe call (look on opr_sanity
test) a
+ * array_offset_common function.
+ *
+ * Returns NULL when value is not found. It uses a "NOT DISTINCT FROM"
operator
+ * for comparation to be safe against NULL.
+ */

would be better as...

+ /*
+ * array_offset - returns the offset of a value in an array.
array_offset and
+ * array_offset_start are wrappers for the sake of the opr_sanity test.
+ *
+ * Returns NULL when value is not found. It uses a "NOT DISTINCT FROM"
operator
+ * for comparation to be safe against NULL.
+ */
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-12 06:54:49
Message-ID: CAFj8pRDgLRQYF_8Dz4i4g3TKbfPdNUmCcCin5thR+PgT+_hd6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-11 22:50 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 3/11/15 4:37 PM, Pavel Stehule wrote:
> + /*
> + * array_offset - returns the offset of a value in an array
> (array_offset and
> + * array_offset_start are wrappers for safe call (look on opr_sanity
> test) a
> + * array_offset_common function.
> + *
> + * Returns NULL when value is not found. It uses a "NOT DISTINCT FROM"
> operator
> + * for comparation to be safe against NULL.
> + */
>
> would be better as...
>
> + /*
> + * array_offset - returns the offset of a value in an array.
> array_offset and
> + * array_offset_start are wrappers for the sake of the opr_sanity test.
> + *
> + * Returns NULL when value is not found. It uses a "NOT DISTINCT FROM"
> operator
> + * for comparation to be safe against NULL.
> + */

fixed

Regards

Pavel

>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>

Attachment Content-Type Size
array_offset-06.patch text/x-patch 19.5 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-18 01:06:27
Message-ID: 20150318010627.GR3636@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

My main question regarding this patch is whether the behavior with MD
arrays is useful at all. Suppose I give it this:

alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}', 3);
array_offset
--------------
3
(1 fila)

What can I do with the "3" value it returned? Certainly not use it as
an offset to get a slice of the original array. The only thing that
seems sensible to me here is to reject the whole thing with an error,
that is, only accept 1-D arrays here. We can later extend the function
by allowing higher dimensionality as long as the second argument is an
array one dimension less than the first argument. But if we allow the
case on its appearance, it's going to be difficult to change the
behavior later.

Has a case been made for the current behavior?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
array_offset-07.patch text/x-diff 19.8 KB

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-18 02:45:02
Message-ID: 5508E6AE.3080603@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/17/15 8:06 PM, Alvaro Herrera wrote:
> My main question regarding this patch is whether the behavior with MD
> arrays is useful at all. Suppose I give it this:
>
> alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}', 3);
> array_offset
> --------------
> 3
> (1 fila)
>
> What can I do with the "3" value it returned? Certainly not use it as
> an offset to get a slice of the original array. The only thing that
> seems sensible to me here is to reject the whole thing with an error,
> that is, only accept 1-D arrays here. We can later extend the function
> by allowing higher dimensionality as long as the second argument is an
> array one dimension less than the first argument. But if we allow the
> case on its appearance, it's going to be difficult to change the
> behavior later.

+1

> Has a case been made for the current behavior?

Not that I remember. There was discussion about how this should properly
support MD arrays.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-18 11:27:29
Message-ID: CAFj8pRC2U_RCNuOhP-VeGtNoDOsd6yZC0T6B+jjD2KYYc1HQ+g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-18 3:45 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 3/17/15 8:06 PM, Alvaro Herrera wrote:
>
>> My main question regarding this patch is whether the behavior with MD
>> arrays is useful at all. Suppose I give it this:
>>
>> alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}',
>> 3);
>> array_offset
>> --------------
>> 3
>> (1 fila)
>>
>> What can I do with the "3" value it returned? Certainly not use it as
>> an offset to get a slice of the original array. The only thing that
>> seems sensible to me here is to reject the whole thing with an error,
>> that is, only accept 1-D arrays here. We can later extend the function
>> by allowing higher dimensionality as long as the second argument is an
>> array one dimension less than the first argument. But if we allow the
>> case on its appearance, it's going to be difficult to change the
>> behavior later.
>>
>
This behave is consistent with "unnest" function, when all multidimensional
arrays are reduced to 1ND arrays.

Other argument for this behave is impossibility to design other behave.
array_offset function have to returns integer always. You cannot to returns
a array of integers, what is necessary for MD position. And one integer can
be only position in flatted 1ND array. I agree, so this is not user
friendly, but there is not any other possible solution - we have not
anyarray and anymdarray types. I designed this possibility (use ND arrays)
mainly for info, if some value exists or not.

I am thinking, so this behave is correct (there is no other possible), but
it is only corner case for this functionality - and if you are thinking, so
better to disallow it, I am not against. My main focus is 1ND array.

Regards

Pavel

>
> +1
>
> Has a case been made for the current behavior?
>>
>
> Not that I remember. There was discussion about how this should properly
> support MD arrays.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>


From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-18 11:41:51
Message-ID: 5509647F.7060405@joh.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/18/15 12:27 PM, Pavel Stehule wrote:
>> On 3/17/15 8:06 PM, Alvaro Herrera wrote:
>>
>>> My main question regarding this patch is whether the behavior with MD
>>> arrays is useful at all. Suppose I give it this:
>>>
>>> alvherre=# select array_offset('{{{1,2},{3,4},{5,6}},{{2,3},{4,5},{6,7}}}',
>>> 3);
>>> array_offset
>>> --------------
>>> 3
>>> (1 fila)
>>>
>>> What can I do with the "3" value it returned? Certainly not use it as
>>> an offset to get a slice of the original array. The only thing that
>>> seems sensible to me here is to reject the whole thing with an error,
>>> that is, only accept 1-D arrays here. We can later extend the function
>>> by allowing higher dimensionality as long as the second argument is an
>>> array one dimension less than the first argument. But if we allow the
>>> case on its appearance, it's going to be difficult to change the
>>> behavior later.
>>>
> I designed this possibility (use ND arrays)
> mainly for info, if some value exists or not.

Why not use =ANY() for that?

> I am thinking, so this behave is correct (there is no other possible), but
> it is only corner case for this functionality - and if you are thinking, so
> better to disallow it, I am not against. My main focus is 1ND array.

A nonsensical answer for multi-dimensional arrays is worse than no
answer at all. I think raising an exception is better.

.m


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Marko Tiikkaja <marko(at)joh(dot)to>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-18 11:51:52
Message-ID: CAFj8pRC8-vzy+LAd6b-LS9pSbjgHCgbfreNZHZeePJcd=c1u-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-18 12:41 GMT+01:00 Marko Tiikkaja <marko(at)joh(dot)to>:

> On 3/18/15 12:27 PM, Pavel Stehule wrote:
>
>> On 3/17/15 8:06 PM, Alvaro Herrera wrote:
>>>
>>> My main question regarding this patch is whether the behavior with MD
>>>> arrays is useful at all. Suppose I give it this:
>>>>
>>>> alvherre=# select array_offset('{{{1,2},{3,4},{
>>>> 5,6}},{{2,3},{4,5},{6,7}}}',
>>>> 3);
>>>> array_offset
>>>> --------------
>>>> 3
>>>> (1 fila)
>>>>
>>>> What can I do with the "3" value it returned? Certainly not use it as
>>>> an offset to get a slice of the original array. The only thing that
>>>> seems sensible to me here is to reject the whole thing with an error,
>>>> that is, only accept 1-D arrays here. We can later extend the function
>>>> by allowing higher dimensionality as long as the second argument is an
>>>> array one dimension less than the first argument. But if we allow the
>>>> case on its appearance, it's going to be difficult to change the
>>>> behavior later.
>>>>
>>>> I designed this possibility (use ND arrays)
>> mainly for info, if some value exists or not.
>>
>
> Why not use =ANY() for that?
>

It is only partial solution - array_offset use "IS NOT DISTINCT FROM"
operator.

>
> I am thinking, so this behave is correct (there is no other possible), but
>> it is only corner case for this functionality - and if you are thinking,
>> so
>> better to disallow it, I am not against. My main focus is 1ND array.
>>
>
> A nonsensical answer for multi-dimensional arrays is worse than no answer
> at all. I think raising an exception is better.
>

I have not problem with it.

Regards

Pavel

>
>
> .m
>


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-18 19:03:51
Message-ID: 20150318190351.GA3636@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule wrote:
> 2015-03-18 12:41 GMT+01:00 Marko Tiikkaja <marko(at)joh(dot)to>:

> >> I am thinking, so this behave is correct (there is no other
> >> possible), but it is only corner case for this functionality - and
> >> if you are thinking, so better to disallow it, I am not against. My
> >> main focus is 1ND array.
> >
> > A nonsensical answer for multi-dimensional arrays is worse than no answer
> > at all. I think raising an exception is better.
> >
>
> I have not problem with it.

Pushed after adding error checks there and fixing the docs to match.
Please verify.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-19 06:59:51
Message-ID: CAFj8pRCrxPU8fW_EGEPm5QOMjH1VKFNcvfdi-v4hzeXzzshaHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-18 20:03 GMT+01:00 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:

> Pavel Stehule wrote:
> > 2015-03-18 12:41 GMT+01:00 Marko Tiikkaja <marko(at)joh(dot)to>:
>
> > >> I am thinking, so this behave is correct (there is no other
> > >> possible), but it is only corner case for this functionality - and
> > >> if you are thinking, so better to disallow it, I am not against. My
> > >> main focus is 1ND array.
> > >
> > > A nonsensical answer for multi-dimensional arrays is worse than no
> answer
> > > at all. I think raising an exception is better.
> > >
> >
> > I have not problem with it.
>
> Pushed after adding error checks there and fixing the docs to match.
> Please verify.
>

it is looking well, thank you.

small issue - there is not documented, so multidimensional arrays are not
supported,

Regards

Pavel

>
> --
> Álvaro Herrera http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-19 12:16:07
Message-ID: 20150319121607.GC3636@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule wrote:
> 2015-03-18 20:03 GMT+01:00 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:

> > Pushed after adding error checks there and fixing the docs to match.
> > Please verify.
> >
>
> it is looking well, thank you.

Thanks for checking.

> small issue - there is not documented, so multidimensional arrays are not
> supported,

I added a parenthised comment in the table, "(array must be
one-dimensional)". I copied this from the entry for the array_remove
function IIRC; it seemed enough ...

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-20 16:49:29
Message-ID: CAEZATCXHb+tv8YYo4=XRoBzCOywTrM4cncqR57D4ZM7WdFomiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 18 March 2015 at 19:03, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> Pushed after adding error checks there and fixing the docs to match.
> Please verify.
>

There's an issue when the array's lower bound isn't 1:

select array_offset('[2:4]={1,2,3}'::int[], 1);
array_offset
--------------
1
(1 row)

whereas I would expect this to return 2. Similarly for
array_offsets(), so the offsets can be used as indexes into the
original array.

Regards,
Dean


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-20 17:16:25
Message-ID: CAFj8pRA5t81AgvmwmCPNzJLZKeahENvG3QXrCrf+qGX3E6mqPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-20 17:49 GMT+01:00 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:

> On 18 March 2015 at 19:03, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
> wrote:
> > Pushed after adding error checks there and fixing the docs to match.
> > Please verify.
> >
>
> There's an issue when the array's lower bound isn't 1:
>
> select array_offset('[2:4]={1,2,3}'::int[], 1);
> array_offset
> --------------
> 1
> (1 row)
>
> whereas I would expect this to return 2. Similarly for
> array_offsets(), so the offsets can be used as indexes into the
> original array.
>

I am thinking, so it is ok - it returns a offset, not position.

Regards

Pavel

>
> Regards,
> Dean
>


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-20 17:29:42
Message-ID: 20150320172942.GX3636@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule wrote:
> 2015-03-20 17:49 GMT+01:00 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:
>
> > There's an issue when the array's lower bound isn't 1:
> >
> > select array_offset('[2:4]={1,2,3}'::int[], 1);
> > array_offset
> > --------------
> > 1
> > (1 row)
> >
> > whereas I would expect this to return 2. Similarly for
> > array_offsets(), so the offsets can be used as indexes into the
> > original array.
> >
>
> I am thinking, so it is ok - it returns a offset, not position.

So you can't use it as a subscript? That sounds unfriendly. Almost
every function using this will be subtly broken.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-20 17:43:59
Message-ID: CAFj8pRAsDKQtPnMHW+fc-N+gbvxb_Ed1_fXSEAPRgSaY1-Q3hA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-20 18:29 GMT+01:00 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:

> Pavel Stehule wrote:
> > 2015-03-20 17:49 GMT+01:00 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:
> >
> > > There's an issue when the array's lower bound isn't 1:
> > >
> > > select array_offset('[2:4]={1,2,3}'::int[], 1);
> > > array_offset
> > > --------------
> > > 1
> > > (1 row)
> > >
> > > whereas I would expect this to return 2. Similarly for
> > > array_offsets(), so the offsets can be used as indexes into the
> > > original array.
> > >
> >
> > I am thinking, so it is ok - it returns a offset, not position.
>
> So you can't use it as a subscript? That sounds unfriendly. Almost
> every function using this will be subtly broken.
>

depends what you want. It means - it is on Nth position from start. So it
is useful when iterate over array, because it is safe against different
array start dimensions. it works, if you use it as "offset". It is named
"array_offset"

It can be changed and renamed to array_position - it is simple fix. But I
am not sure, if it is better.

Regards

Pavel

> --
> Álvaro Herrera http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-20 17:47:41
Message-ID: 14396.1426873661@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Pavel Stehule wrote:
>> I am thinking, so it is ok - it returns a offset, not position.

> So you can't use it as a subscript? That sounds unfriendly. Almost
> every function using this will be subtly broken.

I concur; perhaps "offset" was the design intention, but it's wrong.
The result should be a subscript.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-20 19:48:23
Message-ID: CAFj8pRAikQv9O72U35Z3KpgxFRRhgjYR2g0xu2DmaBX3tBukEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-20 18:47 GMT+01:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> > Pavel Stehule wrote:
> >> I am thinking, so it is ok - it returns a offset, not position.
>
> > So you can't use it as a subscript? That sounds unfriendly. Almost
> > every function using this will be subtly broken.
>
> I concur; perhaps "offset" was the design intention, but it's wrong.
> The result should be a subscript.
>

do you have any idea about name for this function? array_position is ok?

Regards

Pavel

>
> regards, tom lane
>


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-20 23:27:40
Message-ID: 550CACEC.8030609@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 3/20/15 2:48 PM, Pavel Stehule wrote:
>
>
> 2015-03-20 18:47 GMT+01:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>:
>
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com
> <mailto:alvherre(at)2ndquadrant(dot)com>> writes:
> > Pavel Stehule wrote:
> >> I am thinking, so it is ok - it returns a offset, not position.
>
> > So you can't use it as a subscript? That sounds unfriendly. Almost
> > every function using this will be subtly broken.
>
> I concur; perhaps "offset" was the design intention, but it's wrong.
> The result should be a subscript.
>
>
> do you have any idea about name for this function? array_position is ok?

+1 on array_position. It's possible at some point we'll actually want
array_offset that does what it claims.

On another note, you mentioned elsewhere that it's not possible to
return anything other than an integer. Why can't there be a variation of
this function that returns an array of ndims-1 that is the slice where a
value was found?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-21 06:06:55
Message-ID: CAFj8pRCYak+N7E83F2SR9dTi0QrXBbSkezqkLzo=b9-2phRqsg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-21 0:27 GMT+01:00 Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>:

> On 3/20/15 2:48 PM, Pavel Stehule wrote:
>
>>
>>
>> 2015-03-20 18:47 GMT+01:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us
>> <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>>:
>>
>> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com
>> <mailto:alvherre(at)2ndquadrant(dot)com>> writes:
>> > Pavel Stehule wrote:
>> >> I am thinking, so it is ok - it returns a offset, not position.
>>
>> > So you can't use it as a subscript? That sounds unfriendly. Almost
>> > every function using this will be subtly broken.
>>
>> I concur; perhaps "offset" was the design intention, but it's wrong.
>> The result should be a subscript.
>>
>>
>> do you have any idea about name for this function? array_position is ok?
>>
>
> +1 on array_position. It's possible at some point we'll actually want
> array_offset that does what it claims.
>

additional implementation of array_position needs few lines more

>
> On another note, you mentioned elsewhere that it's not possible to return
> anything other than an integer. Why can't there be a variation of this
> function that returns an array of ndims-1 that is the slice where a value
> was found?

We talked about it, when we talked about MD searching - and we moved it to
next stage.

I am thinking so array_postions can support MD arrays due returning a array

Regards

Pavel

>
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>

Attachment Content-Type Size
array_position.patch text/x-patch 7.0 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-21 07:59:07
Message-ID: CAEZATCVukO8LcYTjceH=NZzcmicPg-phg53kzuRiJS33_vhUDg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>>> do you have any idea about name for this function? array_position is ok?
>>
>> +1 on array_position. It's possible at some point we'll actually want
>> array_offset that does what it claims.
>

+1 for array_position.

-1 for keeping array_offset. I'm not convinced that there are
sufficient use cases for it. No other array functions deal in offsets
relative to the first element, and if you do want that, it is trivial
to achieve with array_position() and array_lower(). IMO having 2
functions for searching in an array will just increase the risk of
users picking the wrong one by accident.

Regards,
Dean


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-22 06:11:22
Message-ID: CAFj8pRBeBapPz5eSi7v2arQbt8txMaL88-fcCrdgjw20wwQitA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi

here is updated patch with array_position, array_positions implementation.

It is based on committed code - so please, revert commit
13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch

Regards

Pavel

2015-03-20 18:29 GMT+01:00 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:

> Pavel Stehule wrote:
> > 2015-03-20 17:49 GMT+01:00 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:
> >
> > > There's an issue when the array's lower bound isn't 1:
> > >
> > > select array_offset('[2:4]={1,2,3}'::int[], 1);
> > > array_offset
> > > --------------
> > > 1
> > > (1 row)
> > >
> > > whereas I would expect this to return 2. Similarly for
> > > array_offsets(), so the offsets can be used as indexes into the
> > > original array.
> > >
> >
> > I am thinking, so it is ok - it returns a offset, not position.
>
> So you can't use it as a subscript? That sounds unfriendly. Almost
> every function using this will be subtly broken.
>
> --
> Álvaro Herrera http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Attachment Content-Type Size
array_position-20150322.patch text/x-patch 23.4 KB

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-22 10:30:05
Message-ID: CAEZATCWBb29HiozMuM5KJp+9CqS=ZrTUTGL-H6QU-F+kT8HrDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22 March 2015 at 06:11, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> Hi
>
> here is updated patch with array_position, array_positions implementation.
>
> It is based on committed code - so please, revert commit
> 13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch
>

I checked this and the changes look good, except that you missed a
couple of places in src/include/utils/array.h that need updating.

In the public docs, you should s/position/subscript because that's the
term used throughout the docs for an index into an array. I still like
the name array_position() for the function though, because it's
consistent with the existing position() functions.

Regards,
Dean


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-22 10:57:42
Message-ID: CAFj8pRD=QhC36bMDA3xqRpKR3U7Y4QhYyF_KJn6RwGz3nTuNOQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-22 11:30 GMT+01:00 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:

> On 22 March 2015 at 06:11, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> > Hi
> >
> > here is updated patch with array_position, array_positions
> implementation.
> >
> > It is based on committed code - so please, revert commit
> > 13dbc7a824b3f905904cab51840d37f31a07a9ef and apply this patch
> >
>
> I checked this and the changes look good, except that you missed a
> couple of places in src/include/utils/array.h that need updating.
>
> In the public docs, you should s/position/subscript because that's the
> term used throughout the docs for an index into an array. I still like
> the name array_position() for the function though, because it's
> consistent with the existing position() functions.
>

updated patch

>
> Regards,
> Dean
>

Attachment Content-Type Size
array_position-20150322-2.patch text/x-patch 21.8 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-30 19:30:17
Message-ID: 20150330193017.GB3071@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule wrote:
> 2015-03-22 11:30 GMT+01:00 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:

> > In the public docs, you should s/position/subscript because that's the
> > term used throughout the docs for an index into an array. I still like
> > the name array_position() for the function though, because it's
> > consistent with the existing position() functions.
>
> updated patch

Thanks, pushed.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Marko Tiikkaja <marko(at)joh(dot)to>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: proposal: searching in array function - array_position
Date: 2015-03-30 19:58:50
Message-ID: CAFj8pRD1f4Eby6X-ruxEM0uCRHCemEzm2Ec6zXZqsqQCbtL=XA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2015-03-30 21:30 GMT+02:00 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:

> Pavel Stehule wrote:
> > 2015-03-22 11:30 GMT+01:00 Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>:
>
> > > In the public docs, you should s/position/subscript because that's the
> > > term used throughout the docs for an index into an array. I still like
> > > the name array_position() for the function though, because it's
> > > consistent with the existing position() functions.
> >
> > updated patch
>
> Thanks, pushed.
>
>
Thank you very much

Pavel

> --
> Álvaro Herrera http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>