Workaround for custom aggregate which would need "internal" as statetype

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Postgresql-General <pgsql-general(at)postgresql(dot)org>
Subject: Workaround for custom aggregate which would need "internal" as statetype
Date: 2006-04-10 20:50:14
Message-ID: 443AC506.5060401@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi

I'm trying to write an aggrecate collect_distinct(int8) which puts all
distinct values into an array. My first try was defining an aggregate
"collect" using array_append, and doing "select collect(distinct <field>) ..",
but this is quite slow - probably because distinct sorts the values, instead
of using a hash to filter out duplicates.

Using perl, and a perl-hash was even slower, so I wrote my to c-functions
(actualy c++), which use a STL hash_set to filter out duplicates.

I initially defined my state-transaction function as
"collect_distinct(internal, int8) returns internal". The parameter marked internal was
a pointer to a STL hash_set. But using this to define an aggregate failed, because
internal seems to be forbidden as a state-type.

I now resorted to a crude hack to get this running - I changed "internal" to
"int4", and just cast my pointer to a int4 before returning it, and after receiving
it as an argument. This at least enabled me to do some benchmarking, and performance-wise
things look good...

Before using this on a production system, I need to get rid of that hack, but I don't
see how this could be done ATM... Maybe someone here could give me a hint how this could
work...

greetings, Florian Pflug

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2006-04-10 21:07:01 Re: pg 8.1.2 performance issue
Previous Message Dave Page 2006-04-10 20:29:32 Re: Debian package for freeradius_postgresql module