Re: Adding pipelining support to set returning functions

From:	Hans-Juergen Schoenig <postgres(at)cybertec(dot)at>
To:	Hannu Krosing <hannu(at)krosing(dot)net>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Adding pipelining support to set returning functions
Date:	2008-04-11 08:57:47
Message-ID:	47FF280B.70105@cybertec.at
Lists:	pgsql-hackers

Hannu Krosing wrote:
> A question to all pg hackers
>
> Is anybody working on adding pipelining to set returning functions.
>
> How much effort would it take ?
>
> Where should I start digging ?
>

i asked myself basically the same question some time ago.
pipelining seems close to impossible unless we ban joins on those
"plugins" completely.
i think this should be fine for your case (no need to join PL/proxy
partitions) - what we want here is to re-unify the data and send it through
centralized BI.

> BACKGROUND:
>
> AFAICS , currently set returning functions materialise their results
> before returning, as seen by this simple test:
>
> hannu=# select * from generate_series(1,10) limit 2;
> generate_series
> -----------------
> 1
> 2
> (2 rows)
>
> Time: 1.183 ms
>
>
> hannu=# select * from generate_series(1,10000000) limit 2;
> generate_series
> -----------------
> 1
> 2
> (2 rows)
>
> Time: 3795.032 ms
>
> being able to pipeline (generate results as needed) would enable several
> interesting techniques, especially if combined with pl/proxy or any
> other functions which stream external data.
>
> Applications and design patterns like http://telegraph.cs.berkeley.edu/
> or http://labs.google.com/papers/mapreduce.html would suddenly become
> very easy to implement.
>
> -----------------
> Hannu
>
>

currently things like nodeSeqscan call SeqNext and so on - one record at a
time is passed on to the next level.
why not have a nodePlugin or so doing the same?
or maybe some additional calling convention for streaming functions...
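
A rough sketch of what such a pull-based node could look like, written in Python rather than the executor's C only to keep it short. PluginNode, LimitNode, next_record and run are hypothetical names made up for this illustration; they are not PostgreSQL structures. The point is just that the consumer asks for one record at a time, so a limit of 2 over ten million potential rows generates only two of them.

```python
# Hypothetical sketch of pull-based (one record per call) plan nodes.
# Names are illustrative only; this is not how PostgreSQL's C executor is laid out.

class PluginNode:
    """Produces the integers start..stop lazily, one per next_record() call."""

    def __init__(self, start, stop):
        self.current = start
        self.stop = stop
        self.produced = 0  # number of records actually generated so far

    def next_record(self):
        if self.current > self.stop:
            return None  # end of data
        value = self.current
        self.current += 1
        self.produced += 1
        return value


class LimitNode:
    """Pulls from its child and stops asking after `limit` records."""

    def __init__(self, child, limit):
        self.child = child
        self.limit = limit
        self.returned = 0

    def next_record(self):
        if self.returned >= self.limit:
            return None  # limit reached, the child is not called again
        record = self.child.next_record()
        if record is not None:
            self.returned += 1
        return record


def run(node):
    """Drain a node into a list, the way a client would fetch all rows."""
    rows = []
    while (record := node.next_record()) is not None:
        rows.append(record)
    return rows


source = PluginNode(1, 10_000_000)
rows = run(LimitNode(source, 2))
print(rows, source.produced)
```

Here source.produced ends up at 2, because the limit node simply stops calling its child; nothing is materialised up front.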

e.g.:
CREATE STREAMING FUNCTION xy() RETURNS NEXT RECORD AS $$
return exactly one record to keep going
return NULL to mark "end of table"
$$ LANGUAGE 'any';

so - for those functions there would be no ...
WHILE ...
RETURN NEXT

but just one tuple per call ...
this would pretty much do it for this case.
i would not even call this a special case - whenever there is a LOT of
data, this could make sense.
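
To make the "one tuple per call, NULL marks the end" convention concrete, here is a minimal sketch in Python. make_xy and the sample records are assumptions for illustration only; the CREATE STREAMING FUNCTION syntax above is just a proposal and does not exist. The caller keeps invoking the function until it gets the end marker, which is what Python's built-in iter(callable, sentinel) does.

```python
def make_xy(records):
    """Build a hypothetical streaming function over an in-memory list.

    Each call to the returned xy() yields exactly one record;
    None plays the role of NULL and marks "end of table".
    """
    position = 0

    def xy():
        nonlocal position
        if position >= len(records):
            return None  # end of table
        record = records[position]
        position += 1
        return record

    return xy


xy = make_xy([("a", 1), ("b", 2), ("c", 3)])

# iter(callable, sentinel) calls xy() repeatedly and stops at the first None,
# so the caller never holds more than one record from xy() at a time.
streamed = []
for record in iter(xy, None):
    streamed.append(record)

print(streamed)
```

One consequence of this convention, at least in this sketch, is that the function cannot return a NULL record as data, since NULL is reserved as the end marker.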

best regards,

hans

--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de

In response to

Adding pipelining support to set returning functions at 2008-04-06 07:01:20 from Hannu Krosing

Responses

Re: Adding pipelining support to set returning functions at 2008-04-11 10:00:04 from Hannu Krosing
