Proposal: include PL/Proxy into core

Lists: pgsql-hackers
From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposal: include PL/Proxy into core
Date: 2007-03-30 10:36:02
Message-ID: e51f66da0703300336t4f221ddbjec3250aca0501a53@mail.gmail.com

PL/Proxy is a small PL whose goal is to allow creating
"proxy functions" that call actual functions in a remote database.
The basic design is:

The function body describes how to determine the target database. It is either

CONNECT 'connstr'; -- connect to exactly this db

or when partitioning is used:

-- partitions are described under that name
CLUSTER 'namestr';

-- calculate int4 based on function parameters
-- and use that to pick a partition
RUN ON hashtext(username);
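
A rough sketch of the partition-picking step, written in Python purely for illustration. It assumes the partition count is a power of two and that the integer is reduced with a bit mask; neither detail is stated in this message. `zlib.crc32` is only a stand-in for PostgreSQL's `hashtext`, and the connection strings and helper names are made up.

```python
import zlib

# Hypothetical partition list; in PL/Proxy this would come from the
# cluster configuration registered under the CLUSTER name.
PARTITIONS = [
    "dbname=part0 host=10.0.0.1",
    "dbname=part1 host=10.0.0.2",
    "dbname=part2 host=10.0.0.3",
    "dbname=part3 host=10.0.0.4",
]


def stand_in_hash(value: str) -> int:
    """Stand-in for hashtext(); NOT the same values PostgreSQL produces."""
    return zlib.crc32(value.encode("utf-8"))


def pick_partition(hash_value: int, partitions: list[str]) -> str:
    """Map an integer to one partition, assuming a power-of-two count."""
    count = len(partitions)
    if count == 0 or count & (count - 1) != 0:
        raise ValueError("partition count must be a power of two")
    # Masking keeps the low bits, so any integer lands in [0, count).
    return partitions[hash_value & (count - 1)]


if __name__ == "__main__":
    print(pick_partition(stand_in_hash("marko"), PARTITIONS))
```

The point of the sketch is that the same username always maps to the same partition, so the proxy function needs no lookup table beyond the partition list itself.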

The actual function call info (arguments, result fields) is deduced
from the proxy function's own signature.

So a function "foo(int4, text) returns setof text" will result in
the query "select * from foo($1::int4, $2::text)" being executed.
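
To make the signature-to-query mapping concrete, here is a minimal Python sketch that rebuilds the query text from a function name and its argument types. The helper name `build_remote_query` is hypothetical, and the sketch ignores identifier quoting and result-column handling; it is not how the C implementation is written.

```python
def build_remote_query(func_name: str, arg_types: list[str]) -> str:
    """Build the remote SELECT for a proxy function from its signature."""
    # Each argument becomes a numbered placeholder cast to its declared type.
    placeholders = ", ".join(
        f"${position}::{type_name}"
        for position, type_name in enumerate(arg_types, start=1)
    )
    return f"select * from {func_name}({placeholders})"


if __name__ == "__main__":
    # Prints: select * from foo($1::int4, $2::text)
    print(build_remote_query("foo", ["int4", "text"]))
```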

Announcement with more examples:

http://archives.postgresql.org/pgsql-announce/2007-03/msg00005.php

Documentation:

https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

Patch:

http://plproxy.projects.postgresql.org/plproxy_core.diff.gz

Now, why put it into core?

1) Much simpler replacement for various other clustering solutions
that try to cluster regular SQL.

2) Nicer replacement for dblink.

3) PLs need much more intimate knowledge of the PostgreSQL core
than regular modules. The API for PLs has been changing with every
major revision of PostgreSQL.

4) It promotes the db-access-through-functions design for databases, which
has proven to be a killer feature of PostgreSQL. In a sense it is
using PostgreSQL as an appserver which provides a fixed API via
functions for external users, but hides the internal layout from them,
so it can be changed invisibly to external users.

5) The language is ready feature-wise - there's no need for it to grow
into "Remote PLPGSQL", because all logic can be put into the remote function.

Some possible objections:

1) It is not a universal solves-everything tool for remote access/clustering.

But those solves-everything tools have a very hard time maturing,
and will not be exactly simple. It is much better to have a simple
tool that works well.

2) You can't use it for all the things you can use dblink for.

PL/Proxy is easier to use for simple result fetching. For complicated
access, using full-blown PLs (plperl, plpython) is better. From that
point of view dblink is replaced.

3) It is possible for a PL to live outside core. The pain is not that big.

Sure, it's possible. We just feel that its usefulness : lines-of-code ratio
is very high, so it is worthy of being built into PostgreSQL core,
thus also giving PostgreSQL the opportunity to boast of being
clusterable out of the box.

4) What about all the existing apps that don't access the database through functions?

Those are the target for a "solves-everything" tool...

5) It is too new a product.

We think this is offset by the small scope of the task it takes on,
and it already works well in that scope.

--
marko


From: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Marko Kreen" <markokr(at)gmail(dot)com>
Subject: Re: Proposal: include PL/Proxy into core
Date: 2007-03-30 10:47:23
Message-ID: 200703301247.30352.cedric.villemain@dalibo.com

On Friday 30 March 2007 12:36, Marko Kreen wrote:
> Patch:
>
> http://plproxy.projects.postgresql.org/plproxy_core.diff.gz
Note a possible oversight in your makefile:

+ #REGRESS_OPTS = --dbname=$(PL_TESTDB) --load-language=plpgsql --load-language=plproxy
+ REGRESS_OPTS = --dbname=regression --load-language=plpgsql --load-language=plproxy


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposal: include PL/Proxy into core
Date: 2007-03-30 11:06:36
Message-ID: e51f66da0703300406p50ed57d3te29f7d9ff89a0303@mail.gmail.com

On 3/30/07, Cédric Villemain <cedric(dot)villemain(at)dalibo(dot)com> wrote:
> On Friday 30 March 2007 12:36, Marko Kreen wrote:
> > Patch:
> >
> > http://plproxy.projects.postgresql.org/plproxy_core.diff.gz
> Note a possible oversight in your makefile:
>
> + #REGRESS_OPTS = --dbname=$(PL_TESTDB) --load-language=plpgsql --load-language=plproxy
> + REGRESS_OPTS = --dbname=regression --load-language=plpgsql --load-language=plproxy

Heh. The problem is that I had 'regression' hardwired into the
regtests, so I could not use $(PL_TESTDB).

If the proposal is accepted and we want to always run the
PL/Proxy regtests, there should be some dynamic way
of passing the main dbname and also the connstrings for partitions
into the regression tests.

ATM I thought it could stay as-is. (Actually I forgot about that change
after I had made it :)

--
marko


From: Hannu Krosing <hannu(at)skype(dot)net>
To: Marko Kreen <markokr(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: include PL/Proxy into core
Date: 2007-03-30 12:14:17
Message-ID: 1175256857.11864.13.camel@localhost.localdomain

On one fine day, Fri, 2007-03-30 at 13:36, Marko Kreen wrote:
> PL/Proxy is a small PL whose goal is to allow creating
> "proxy functions" that call actual functions in a remote database.
> The basic design is:
>
> The function body describes how to determine the target database. It is either
>
> CONNECT 'connstr'; -- connect to exactly this db
>
> or when partitioning is used:
>
> -- partitions are described under that name
> CLUSTER 'namestr';
>
> -- calculate int4 based on function parameters
> -- and use that to pick a partition
> RUN ON hashtext(username);
>
>
> The actual function call info (arguments, result fields) is deduced
> from the proxy function's own signature.
>
> So a function "foo(int4, text) returns setof text" will result in
> the query "select * from foo($1::int4, $2::text)" being executed.

>
> Announcement with more examples:
>
> http://archives.postgresql.org/pgsql-announce/2007-03/msg00005.php
>
> Documentation:
>
> https://developer.skype.com/SkypeGarage/DbProjects/PlProxy
>
> Patch:
>
> http://plproxy.projects.postgresql.org/plproxy_core.diff.gz
>
>
> Now, why put it into core?
>
> 1) Much simpler replacement for various other clustering solutions
> that try to cluster regular SQL.
>
> 2) Nicer replacement for dblink.
>
> 3) PLs need much more intimate knowledge of the PostgreSQL core
> than regular modules. The API for PLs has been changing with every
> major revision of PostgreSQL.
>
> 4) It promotes the db-access-through-functions design for databases, which
> has proven to be a killer feature of PostgreSQL. In a sense it is
> using PostgreSQL as an appserver which provides a fixed API via
> functions for external users, but hides the internal layout from them,
> so it can be changed invisibly to external users.
>
> 5) The language is ready feature-wise - there's no need for it to grow
> into "Remote PLPGSQL", because all logic can be put into the remote function.
>
>
> Some possible objections:
>
> 1) It is not a universal solves-everything tool for remote access/clustering.
>
> But those solves-everything tools have a very hard time maturing,
> and will not be exactly simple. It is much better to have a simple
> tool that works well.

The current PL/Proxy proposed here for inclusion is already an almost
complete redesign and rewrite based on our experience of using the
initial version in production databases, so you can expect version 2.x
robustness, maintainability and code cleanliness from it.

> 5) It is too new a product.
>
> We think this is offset by the small scope of the task it takes,
> and it already works well in that scope.

Also, it is actively used serving thousands of requests per second in a
24/7 live environment, which means that it should be reasonably well
tested.

Together with our lightweight connection pooler, PgBouncer
(https://developer.skype.com/SkypeGarage/DbProjects/PgBouncer), PL/Proxy
can be used to implement the vision of building a "DB-bus" over a
database farm of diverse PostgreSQL servers, as shown in slide 3 of
https://developer.skype.com/SkypeGarage/DbProjects/SkypePostgresqlWhitepaper .

The connection pooler is not strictly needed and can be left out for
smaller configurations with fewer than about 10 databases and/or
concurrent db connections.

(btw, the connection pooler's name, PgBouncer, comes from its initial focus
on "bouncing around" single-transaction db calls.)

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Marko Kreen" <markokr(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: include PL/Proxy into core
Date: 2007-03-30 17:50:16
Message-ID: 21697.1175277016@sss.pgh.pa.us

"Marko Kreen" <markokr(at)gmail(dot)com> writes:
> Now, why put it into core?

I don't think you have made a sufficient case for that. I think it
should stay as an outside project for awhile and see what sort of
userbase it attracts. If it becomes sufficiently popular I'd be
willing to consider adding it to core, but that remains to be seen.

We can barely keep up maintaining what's in core now --- we need to
be very strict about adding stuff that doesn't really have to be in
core, and this evidently doesn't, since you've got it working ...

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Marko Kreen <markokr(at)gmail(dot)com>
Subject: Re: Proposal: include PL/Proxy into core
Date: 2007-03-30 18:49:25
Message-ID: 200703301149.25308.josh@agliodbs.com

Hannu, Marko,

I, personally, think that it's worth talking about integrating these.
However, the old versions were definitely NOT ready for integration, and the
new versions went on the internet like a week ago. Heck, I haven't even
downloaded them yet.

Can we address these on the 8.4 timeline? That will give the rest of us in
the community time to download, try and debug the new SkyTools. I know I'm
planning on testing them and will know a lot more about your code/performance
in a few months. Is there a reason why getting PL/proxy into 8.3 is
critical?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: include PL/Proxy into core
Date: 2007-03-31 07:40:00
Message-ID: e51f66da0703310040o1f38c66eyad805b556f46d30e@mail.gmail.com

On 3/30/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Marko Kreen" <markokr(at)gmail(dot)com> writes:
> > Now, why put it into core?
>
> I don't think you have made a sufficient case for that. I think it
> should stay as an outside project for awhile and see what sort of
> userbase it attracts. If it becomes sufficiently popular I'd be
> willing to consider adding it to core, but that remains to be seen.

Fair enough.

--
marko


From: "Marko Kreen" <markokr(at)gmail(dot)com>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, "Hannu Krosing" <hannu(at)skype(dot)net>
Subject: Re: Proposal: include PL/Proxy into core
Date: 2007-03-31 08:19:44
Message-ID: e51f66da0703310119h36c5eaa0y70661ede01e9c361@mail.gmail.com

On 3/30/07, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> I, personally, think that it's worth talking about integrating these.
> However, the old versions were definitely NOT ready for integration, and the
> new versions went on the internet like a week ago. Heck, I haven't even
> downloaded them yet.

Yeah, the old version was a bit too complicated. That's why we
did a rewrite. And it turned out nice and simple.

> Can we address these on the 8.4 timeline? That will give the rest of us in
> the community time to download, try and debug the new SkyTools. I know I'm
> planning on testing them and will know a lot more about your code/performance
> in a few months. Is there a reason why getting PL/proxy into 8.3 is
> critical?

No hurry. Just the timing was too good...

Also, if there are any design/API issues that may hinder
merging, we'd like to solve them immediately.

--
marko