Re: MULTISET and additional functions for ARRAY

From: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MULTISET and additional functions for ARRAY
Date: 2010-11-18 08:52:16
Message-ID: AANLkTinrRubdSSWvqO481sL0EyGz830=mFKAdK_knfgZ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 15, 2010 at 14:37, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> BTW, some of the intermediate products to implement those features might
>> be useful if exported. like array_sort() and array_unique(). If there is
>> demand, I'll add those functions, too.
>
> I has not a standard, so I can't to speak about conformance with
> standard, but I must to say, so these functionality should be
> extremely useful for plpgsql programming. You can see samples a some
> functions on the net for array sorting, for reduce a redundant values
> via SQL language. I am sure, so implementation in C will be much
> faster.

Here is a new WIP patch to support MULTISET functions. It includes
no docs yet, but I'll work on them if the design is accepted.

I added some non-standard functions:
- set() : same as DISTINCT for rows
- array_sort() : same as ORDER BY ASC NULLS LAST for rows
- array_flatten() : flatten an array into a one-dimensional array
I chose set() for the function name for compatibility with
Oracle. The name was also proposed in the working draft.
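
As a rough illustration of the intended semantics, the three helpers
can be modeled in Python as below. This is only a sketch, not the
patch's C code: nested lists stand in for multi-dimensional arrays,
None stands in for NULL, and the name set_ (with the trailing
underscore) plus the first-occurrence output order of set_ are
assumptions made for the example.

```python
# Illustrative Python model of the helper functions; not the patch's code.

def array_flatten(arr):
    """Flatten a nested list (stand-in for a multi-dimensional array)."""
    flat = []
    for item in arr:
        if isinstance(item, list):
            flat.extend(array_flatten(item))
        else:
            flat.append(item)
    return flat


def array_sort(arr):
    """Sort ascending with None (NULL) last, like ORDER BY ASC NULLS LAST."""
    non_null = sorted(x for x in arr if x is not None)
    return non_null + [None] * (len(arr) - len(non_null))


def set_(arr):
    """Drop duplicates, like DISTINCT; all NULLs collapse into one.

    Assumption: elements are kept in first-occurrence order. The real
    function may return them in a different order.
    """
    seen = set()
    out = []
    for x in arr:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out
```

For instance, array_sort([3, None, 1, 2]) gives [1, 2, 3, None] in
this model.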

SUBMULTISET OF now uses its own submultiset_of() function instead
of the <@ operator because the two behave quite differently.
For example,
- ARRAY[1, 1] <@ ARRAY[1] ==> true
- ARRAY[1, 1] SUBMULTISET OF ARRAY[1] ==> false
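
The difference can be sketched in Python: <@ only asks whether every
distinct value is present, while SUBMULTISET OF also compares how many
times each value occurs. The function names contained_by and
submultiset_of_model are hypothetical, and NULL handling is left out
of this sketch.

```python
from collections import Counter


def contained_by(a, b):
    """Model of a <@ b: every distinct value of a appears in b."""
    return set(a) <= set(b)


def submultiset_of_model(a, b):
    """Model of a SUBMULTISET OF b: no value occurs more often in a than in b."""
    need, have = Counter(a), Counter(b)
    return all(have[value] >= count for value, count in need.items())


print(contained_by([1, 1], [1]))          # duplicates are ignored
print(submultiset_of_model([1, 1], [1]))  # duplicates are counted
```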

I removed the element() function because it is not in the standard,
although it was in the working draft. I think the function is not
very useful because it is almost the same as "array[1]".

I need help with the syntax of MEMBER OF and SUBMULTISET OF.
The OF is optional in both according to the standard, but I cannot
resolve the shift/reduce conflicts that bison reports for that
syntax. I added a FIXME comment in gram.y. Suggestions welcome.

> Maybe can be useful to implement a searching on sorted array.
> You can hold a flag if multiset is sorted or not.

I've not added "is sorted" bits to array types yet. They might
improve performance, but that should come separately from this patch.
I don't think it is a simple task, because users might modify the
comparison operators after arrays with sorted bits have been stored
on disk; those arrays would still carry the sorted bit, but would no
longer actually be sorted under the latest operators.
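
The hazard can be shown with a small Python sketch. It is only an
analogy: a key function stands in for the type's comparison operator,
and sorted_search is a hypothetical helper that trusts the stored
order the way a search relying on a sorted bit would.

```python
from bisect import bisect_left


def sorted_search(arr, key, value):
    """Binary search that trusts arr to be sorted under the given key."""
    keys = [key(x) for x in arr]
    i = bisect_left(keys, key(value))
    return i < len(keys) and keys[i] == key(value)


# Array stored while the ordering was case-insensitive.
stored = sorted(["b", "A", "c", "D"], key=str.lower)

# The search works as long as the ordering is unchanged.
found_old = sorted_search(stored, str.lower, "D")

# After the ordering is redefined as case-sensitive, the stored order is
# stale, and a search that still trusts it misses an existing element.
found_new = sorted_search(stored, lambda s: s, "D")

print(stored, found_old, found_new)
```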

=== New features in ver.20101118 ===
- [FUNCTION] cardinality(anyarray) => integer
- [FUNCTION] trim_array(anyarray, nTrimmed integer) => anyarray
- [FUNCTION] array_flatten(anyarray) => anyarray
- [FUNCTION] array_sort(anyarray) => anyarray
- [FUNCTION] set(anyarray) => anyarray
- [SYNTAX] $1 IS [NOT] A SET => boolean
- [SYNTAX] $1 [NOT] MEMBER OF $2 => boolean
- [SYNTAX] $1 [NOT] SUBMULTISET OF $2 => boolean
- [SYNTAX] $1 MULTISET UNION [ALL | DISTINCT] $2 => anyarray
- [SYNTAX] $1 MULTISET INTERSECT [ALL | DISTINCT] $2 => anyarray
- [SYNTAX] $1 MULTISET EXCEPT [ALL | DISTINCT] $2 => anyarray
- [AGGREGATE] collect(anyelement) => anyarray
- [AGGREGATE] fusion(anyarray) => anyarray
- [AGGREGATE] intersection(anyarray) => anyarray
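
For readers unfamiliar with multiset operators, here is a Python
sketch of the ALL variants and of the fusion and intersection
aggregates, using collections.Counter. It assumes the usual multiset
definitions (union adds counts, intersect takes the minimum count,
except subtracts counts); the function names are hypothetical, element
order in the results is arbitrary, and the DISTINCT variants and NULLs
are left out.

```python
from collections import Counter
from functools import reduce


def multiset_union_all(a, b):
    """a MULTISET UNION ALL b: occurrence counts are added."""
    return list(a) + list(b)


def multiset_intersect_all(a, b):
    """a MULTISET INTERSECT ALL b: minimum occurrence count per value."""
    return list((Counter(a) & Counter(b)).elements())


def multiset_except_all(a, b):
    """a MULTISET EXCEPT ALL b: counts of b subtracted from a, floor at zero."""
    return list((Counter(a) - Counter(b)).elements())


def fusion(arrays):
    """Aggregate model: multiset union of all input arrays."""
    return reduce(multiset_union_all, arrays, [])


def intersection(arrays):
    """Aggregate model: multiset intersection of all input arrays (non-empty input)."""
    return reduce(multiset_intersect_all, arrays)
```

In this model, multiset_except_all([1, 1, 2], [1, 3]) leaves one 1 and
one 2, because only one occurrence of 1 is removed.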

=== New unreserved keywords ===
A, MEMBER, MULTISET, SUBMULTISET

--
Itagaki Takahiro

Attachment Content-Type Size
multiset-20101118.patch application/octet-stream 46.0 KB
