FDW-based dblink (WIP)

Lists: pgsql-hackers
From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: FDW-based dblink (WIP)
Date: 2009-08-19 08:07:13
Message-ID: 20090819155925.99E1.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here is a WIP patch for a foreign data wrapper based dblink.

It integrates dblink module into core and adds a new functionality,
automatic transaction management. The new interface of dblink is
exported by include/foreign/dblink.h. We can easily write a connector
module for another database because we can reuse transaction and
resource management parts in core.

Syntax to create FDW with connector is below:
CREATE FOREIGN DATA WRAPPER postgresql
VALIDATOR postgresql_fdw_validator
CONNECTOR postgresql_fdw_connector
OPTIONS (...);

contrib/dblink2 is a sample of postgres connector. It exports one function:
CREATE FUNCTION postgresql_fdw_connector(options internal)
RETURNS internal -- returns a connection object

Basic dblink functions are moved to postgres core:
Name | Result type | Argument data types
-------------------+--------------+----------------------
dblink | SETOF record | text, text
dblink_close | boolean | integer
dblink_connect | boolean | text
dblink_connect | boolean | text, text
dblink_disconnect | boolean | text
dblink_exec | bigint | text, text
dblink_fetch | SETOF record | integer, integer
dblink_open | integer | text, text

The new dblink can work together closely with local transactions. If a
local transaction is committed or rollbacked, remote transactions take
the same status with the local one. Please set max_prepared_transactions
to 1 or greater if you could test the patch.

I want pretty much the automatic transaction management. It is useful to
write applied modules like materialized-view-over-network. But it should
be able to be turned off if we don't want it. I'll work on those parts next.

-- connect
CREATE SERVER server_postgres FOREIGN DATA WRAPPER postgresql;
SELECT dblink_connect('conn_postgres', 'server_postgres');
-- commit both local and remote transactions.
BEGIN;
SELECT dblink_exec('conn_postgres', 'UPDATE ...');
COMMIT;
-- rollback both local and remote transactions.
BEGIN;
SELECT dblink_exec('conn_postgres', 'UPDATE ...');
ROLLBACK;
-- disconnect
SELECT dblink_disconnect('conn_postgres');

I've not ported all features in present dblink, but I'd like to hear
wheather the goal and the concepts are reasonable. Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
dblink2-20090819.patch application/octet-stream 70.4 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-08-19 15:16:49
Message-ID: 22303.1250695009@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Here is a WIP patch for a foreign data wrapper based dblink.
> It integrates dblink module into core and adds a new functionality,
> automatic transaction management.

I don't believe there is any consensus for integrating dblink into core,
and I for one will resist that strongly. Keep it in contrib.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-08-19 15:21:44
Message-ID: 162867790908190821i25b0ba35tdb9627a5318274e9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/8/19 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>> Here is a WIP patch for a foreign data wrapper based dblink.
>> It integrates dblink module into core and adds a new functionality,
>> automatic transaction management.
>
> I don't believe there is any consensus for integrating dblink into core,
> and I for one will resist that strongly.  Keep it in contrib.

if integration means, so I could to write query like

SELECT * FROM otherdatabase.schema.table ....
UPDATE otherdb.table SET ...

I am for integration.

regards
Pavel Stehule
>
>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-08-19 15:24:19
Message-ID: 4A8C1923.8050200@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Itagaki Takahiro wrote:
> It integrates dblink module into core and adds a new functionality,
> automatic transaction management.

Why does it need to be integrated to core? I'd prefer to keep dblink out
of core, unless there's a reason.

> I want pretty much the automatic transaction management. It is useful to
> write applied modules like materialized-view-over-network. But it should
> be able to be turned off if we don't want it. I'll work on those parts next.

The transaction management is very unsafe as it is, for the reasons I
mentioned earlier.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-08-19 15:29:24
Message-ID: 22571.1250695764@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> 2009/8/19 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> I don't believe there is any consensus for integrating dblink into core,
>> and I for one will resist that strongly. Keep it in contrib.

> if integration means, so I could to write query like
> SELECT * FROM otherdatabase.schema.table ....
> UPDATE otherdb.table SET ...
> I am for integration.

That is not what "integrating dblink" means --- what Itagaki-san is
talking about is moving the dblink_xxx functions into core. What
you are talking about is actual SQL/MED functionality, which we should
indeed try to get into core someday. But dblink is a dead end as far
as standards compliance goes. Between that and the potential security
issues, we should not put it in core.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-08-19 15:32:56
Message-ID: 162867790908190832w4076b331oad5353534b543e6a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/8/19 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> 2009/8/19 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> I don't believe there is any consensus for integrating dblink into core,
>>> and I for one will resist that strongly.  Keep it in contrib.
>
>> if integration means, so I could to write query like
>> SELECT * FROM otherdatabase.schema.table ....
>> UPDATE otherdb.table SET ...
>> I am for integration.
>
> That is not what "integrating dblink" means --- what Itagaki-san is
> talking about is moving the dblink_xxx functions into core.  What
> you are talking about is actual SQL/MED functionality, which we should
> indeed try to get into core someday.  But dblink is a dead end as far
> as standards compliance goes.  Between that and the potential security
> issues, we should not put it in core.
>

aha, - ok

regards
Pavel Stehule

>                        regards, tom lane
>


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-08-19 15:38:38
Message-ID: 4A8C1C7E.6040404@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> 2009/8/19 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> I don't believe there is any consensus for integrating dblink into core,
>>> and I for one will resist that strongly. Â Keep it in contrib.
>
>> if integration means, so I could to write query like
>> SELECT * FROM otherdatabase.schema.table ....
>> UPDATE otherdb.table SET ...
>> I am for integration.
>
> That is not what "integrating dblink" means --- what Itagaki-san is
> talking about is moving the dblink_xxx functions into core. What
> you are talking about is actual SQL/MED functionality, which we should
> indeed try to get into core someday. But dblink is a dead end as far
> as standards compliance goes. Between that and the potential security
> issues, we should not put it in core.

+1

Joe


From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-08-20 02:36:19
Message-ID: 20090820112444.998F.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I don't believe there is any consensus for integrating dblink into core,
> and I for one will resist that strongly. Keep it in contrib.

OK, our consensus is that dblink should be replaced with SQL/MED interface
and then we'll start to consider integrating into core.

However, automatic transaction management needs help by core. Is it
acceptable to have two-phase callbacks? Registered callbacks are
called with TWOPHASE_EVENT_PRE_COMMIT when a transaction is about
to be committed or prepared. The argument gxact is NULL if the
transaction is committed without 2PC.

typedef enum
{
TWOPHASE_EVENT_PRE_COMMIT,
TWOPHASE_EVENT_POST_COMMIT,
TWOPHASE_EVENT_POST_ABORT,
TWOPHASE_EVENT_RECOVER,
} TwoPhaseEvent;

typedef void (*TwoPhaseEventCallback) (
TwoPhaseEvent event, GlobalTransaction gxact, void *arg);

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
twophase_callbacks-20090820.patch application/octet-stream 4.8 KB

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-08-20 06:14:20
Message-ID: 4A8CE9BC.4080802@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Itagaki Takahiro wrote:
> However, automatic transaction management needs help by core. Is it
> acceptable to have two-phase callbacks?

Probably, but it's far too early to decide what the interface should
look like.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-09-15 10:52:31
Message-ID: 1253011951.18101.12.camel@fsopti579.F-Secure.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2009-08-19 at 17:07 +0900, Itagaki Takahiro wrote:
> Here is a WIP patch for a foreign data wrapper based dblink.
>
> It integrates dblink module into core and adds a new functionality,
> automatic transaction management. The new interface of dblink is
> exported by include/foreign/dblink.h. We can easily write a connector
> module for another database because we can reuse transaction and
> resource management parts in core.

This patch is listed in the commitfest, but I think the consensus was
that it needed some rework. I think the idea is that we will have
support for syntax like

> Syntax to create FDW with connector is below:
> CREATE FOREIGN DATA WRAPPER postgresql
> VALIDATOR postgresql_fdw_validator
> CONNECTOR postgresql_fdw_connector
> OPTIONS (...);

in core, but the actual implementation of postgresql_fdw_connector would
be a loadable module.

Personally, I'm undecided whether the single-function connector
implementation is the best. The other approach would be to use a
multiple-function interface based directly on the functions currently
provided by dblink.

More generally, what does this really buy us? It doesn't advance the
SQL/MED implementation, because you are not adding, say, some kind of
CREATE FOREIGN TABLE support. You are just changing the dblink
implementation to go through the FDW. I would argue that it should be
the other way around: The FDW should go through dblink.


From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-09-16 04:47:19
Message-ID: 20090916132434.8F37.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> This patch is listed in the commitfest, but I think the consensus was
> that it needed some rework.

No doubt, but SQL/MED will require a lot of works. Can we split the work
into small parts? I intended FDW-based dblink to be one of the split jobs.

Here are some random considerations:

* Split dblink to connector and connection management layers.
Present dblink has own name-based connection management and error
handling routines, but I think we should share them amoung connectors.

* CREATE FOREIGN TABLE supports only select query in SQL standard.
I thnk we will still need to have free-style SQL executor like dblink
even when we support SQL/MED It is not a waste to include dblink in core.

* Consider interface of extensible connecter to be able to connect
other DBMSs. Especially, there are many differences in usage of 2PC.

* Automatic 2PC is very useful if we supports non-select query in SQL/MED.
It would be better to have some infrastructure for it.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: FDW-based dblink (WIP)
Date: 2009-09-20 16:48:38
Message-ID: 1253465318.20098.4.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2009-09-16 at 13:47 +0900, Itagaki Takahiro wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
> > This patch is listed in the commitfest, but I think the consensus was
> > that it needed some rework.
>
> No doubt, but SQL/MED will require a lot of works. Can we split the work
> into small parts? I intended FDW-based dblink to be one of the split jobs.

Sure, but I think what you are doing here isn't on anyone's agenda. I
would imagine that the next step would be to implement foreign tables
using something like dblink's interface as underlying interface. What
you are proposing doesn't really move us closer to that, or I'm
misunderstanding what you are trying to achieve. So what is the purpose
of this patch anyway?