Quick Links

Re: [v9.3] writable foreign tables

Lists:	pgsql-hackers

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	PgHacker <pgsql-hackers(at)postgresql(dot)org>
Cc:	Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	[v9.3] writable foreign tables
Date:	2012-08-23 05:10:34
Message-ID:	CADyhKSXfwihRMdzqwykLeXUksGu-EUPQ8uB_Cm=TvYtXEzK8ow@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello,

The attached patch is just a proof-of-concept of writable foreign table
feature; thus, I don't expect it getting merged at the upcoming commit
fest. The purpose of this patch is to find out the best way to support
"write stuff" in FDW.

Basic idea of this patch is to utilize "ctid" field to track an identifier of
a particular foreign-row; to be updated or deleted. It shall be moved
to the modify-stage from the scan-stage as regular table doing.
Then, newly added methods being invoked at ExecUpdate or
ExecDelete with the "pseudo ctid", so FDW shall be able to modify
the target foreign-row.
It is a responsibility of FDW extension (and DBA) to ensure each
foreign-row has a unique identifier that has 48-bits width integer
data type in maximum. In case of pgsql_fdw, "ctid" of remote table
can perform as "pseudo ctid", as is. For other RDBMS, DBA will
need to indicate which column should perform.
INSERT is simple enough; all we need to do it to carry every field
of new tuple into the remote side.

This patch adds five new methods of FdwRoutine structure.
The BeginForeignModify and EndForeignModify give a chance
to initialize / destruct a private state that can be allocated on
ResultRelInfo. As literal, ExecForeignInsert, ExecForeignDelete
and ExecForeignUpdate are invoked for each new tuple, instead
of heap_*() for regular tables. If NULL was set on them, it means
this FDW does not support these operations.

I intend FDW to set up a prepared statement that modifies
a particular remote-row being identified with pseudo-ctid,
at the BeginForeignModify(). Then, ExecForeign*() kicks
the prepared statement with given pseudo-ctid.

The patched portion at contrib/file_fdw.c does not make sense
actually. It just prints messages for each invocation.
It is just a proof-of-concept to show possibility of implementation
based on real RDBMS.

In case when file_fdw performs behalf on "ftbl".
--------------------------------
postgres=# SELECT ctid, * FROM ftbl;
ctid | a | b
--------+-----+-----
(0,1) | 101 | aaa
(0,2) | 102 | bbb
(0,3) | 103 | ccc
(0,4) | 104 | ddd
(0,5) | 105 | eee
(0,6) | 106 | fff
(0,7) | 107 | ggg
(0,8) | 108 | hhh
(0,9) | 109 | iii
(0,10) | 110 | jjj
(10 rows)
==> ctid is used to carray identifier of row; line number in this example.

postgres=# UPDATE ftbl SET a = a + 1 WHERE a > 107;
INFO: ftbl is the target relation of UPDATE
INFO: fdw_file: BeginForeignModify method
INFO: fdw_file: UPDATE (lineno = 8)
INFO: fdw_file: UPDATE (lineno = 9)
INFO: fdw_file: UPDATE (lineno = 10)
INFO: fdw_file: EndForeignModify method
UPDATE 3
postgres=# DELETE FROM ftbl WHERE a BETWEEN 103 AND 106;
INFO: ftbl is the target relation of DELETE
INFO: fdw_file: BeginForeignModify method
INFO: fdw_file: DELETE (lineno = 3)
INFO: fdw_file: DELETE (lineno = 4)
INFO: fdw_file: DELETE (lineno = 5)
INFO: fdw_file: DELETE (lineno = 6)
INFO: fdw_file: EndForeignModify method
DELETE 4
--------------------------------

This patch does not care about transaction control anyway.
According to the discussion in developer meeting at Ottawa,
I didn't include such a complex stuff in the first step.
(Probably, we can implement using XactCallback...)

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v1.patch	application/octet-stream	15.6 KB

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-25 16:20:57
Message-ID:	CA+TgmoYStMSvn1HWL85ZKvFbObDLBLktuU4EekWu_vUqzM5uXw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
> It is a responsibility of FDW extension (and DBA) to ensure each
> foreign-row has a unique identifier that has 48-bits width integer
> data type in maximum.

It strikes me as incredibly short-sighted to decide that the row
identifier has to have the same format as what our existing heap AM
happens to have. I think we need to allow the row identifier to be of
any data type, and even compound. For example, the foreign side might
have no equivalent of CTID, and thus use primary key. And the primary
key might consist of an integer and a string, or some such.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-25 19:48:17
Message-ID:	CADyhKSUoEXvTXiWoqo83wFM+qvn0ugEAmEh4iiFpiiLz6jbT5g@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/25 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>> It is a responsibility of FDW extension (and DBA) to ensure each
>> foreign-row has a unique identifier that has 48-bits width integer
>> data type in maximum.
>
> It strikes me as incredibly short-sighted to decide that the row
> identifier has to have the same format as what our existing heap AM
> happens to have. I think we need to allow the row identifier to be of
> any data type, and even compound. For example, the foreign side might
> have no equivalent of CTID, and thus use primary key. And the primary
> key might consist of an integer and a string, or some such.
>
I assume it is a task of FDW extension to translate between the pseudo
ctid and the primary key in remote side.

For example, if primary key of the remote table is Text data type, an idea
is to use a hash table to track the text-formed primary being associated
with a particular 48-bits integer. The pseudo ctid shall be utilized to track
the tuple to be modified on the scan-stage, then FDW can reference the
hash table to pull-out the primary key to be provided on the prepared
statement.

Do we have some other reasonable ideas?

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc:	"PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-27 07:57:49
Message-ID:	D960CB61B694CF459DCFB4B0128514C2084EFAB5@exadv11.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
> 2012/8/25 Robert Haas <robertmhaas(at)gmail(dot)com>:
>> On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
wrote:
>>> It is a responsibility of FDW extension (and DBA) to ensure each
>>> foreign-row has a unique identifier that has 48-bits width integer
>>> data type in maximum.

>> It strikes me as incredibly short-sighted to decide that the row
>> identifier has to have the same format as what our existing heap AM
>> happens to have. I think we need to allow the row identifier to be
of
>> any data type, and even compound. For example, the foreign side
might
>> have no equivalent of CTID, and thus use primary key. And the
primary
>> key might consist of an integer and a string, or some such.

> I assume it is a task of FDW extension to translate between the pseudo
> ctid and the primary key in remote side.
>
> For example, if primary key of the remote table is Text data type, an
idea
> is to use a hash table to track the text-formed primary being
associated
> with a particular 48-bits integer. The pseudo ctid shall be utilized
to track
> the tuple to be modified on the scan-stage, then FDW can reference the
> hash table to pull-out the primary key to be provided on the prepared
> statement.

And what if there is a hash collision? Then you would not be able to
determine which row is meant.

I agree with Robert that this should be flexible enough to cater for
all kinds of row identifiers. Oracle, for example, uses ten byte
identifiers which would give me a headache with your suggested design.

> Do we have some other reasonable ideas?

Would it be too invasive to introduce a new pointer in TupleTableSlot
that is NULL for anything but virtual tuples from foreign tables?

Yours,
Laurenz Albe

From:	Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-27 10:30:04
Message-ID:	503B4C2C.7080206@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kaigai-san,

On Thu, Aug 23, 2012 at 2:10 PM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
> The patched portion at contrib/file_fdw.c does not make sense
> actually. It just prints messages for each invocation.
> It is just a proof-of-concept to show possibility of implementation
> based on real RDBMS.

Attached is a tar ball of pgsql_fdw. It's WIP and contains no
document, but it would be enough for your PoC purpose. Usage and
features are same as the last version posted for 9.2 cycle.
# I'll post finished patch in the CF-Sep.

Here are random comments for your PoC patch:

+ As Robert says, using CTID as virtual tuple identifier doesn't seem
nice when considering various FDWs for NoSQL or RDBMS. Having abstract
layer between FDWs and tuple sounds better, but implementing it by each
FDW seems useless effort. Do yo have any idea of generic mechanism for
tuple mapping?

+ Do you have any plan about deparsing local qualifiers into remote
query to avoid repeated query submission? This would improve
performance of big UPDATE, but its use case might be limited to
statements which consist of one foreign table. For this case, we can
consider pass-through mode as second way.

+ I have not read your patch closely yet, but I wonder how we can know
which column is actually updated. If we have only updated image of
tuple, we have to update all remote columns by "new" values?

--
Shigeru Hanada

Attachment	Content-Type	Size
pgsql_fdw_93.tar.gz	application/gzip	32.9 KB

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 08:37:47
Message-ID:	CADyhKSVvKz+YKVJ91uBBOZzT1QfAc-QrdtdrjwSfxuXZ0JMDCw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/27 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>> 2012/8/25 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
> wrote:
>>>> It is a responsibility of FDW extension (and DBA) to ensure each
>>>> foreign-row has a unique identifier that has 48-bits width integer
>>>> data type in maximum.
>
>>> It strikes me as incredibly short-sighted to decide that the row
>>> identifier has to have the same format as what our existing heap AM
>>> happens to have. I think we need to allow the row identifier to be
> of
>>> any data type, and even compound. For example, the foreign side
> might
>>> have no equivalent of CTID, and thus use primary key. And the
> primary
>>> key might consist of an integer and a string, or some such.
>
>> I assume it is a task of FDW extension to translate between the pseudo
>> ctid and the primary key in remote side.
>>
>> For example, if primary key of the remote table is Text data type, an
> idea
>> is to use a hash table to track the text-formed primary being
> associated
>> with a particular 48-bits integer. The pseudo ctid shall be utilized
> to track
>> the tuple to be modified on the scan-stage, then FDW can reference the
>> hash table to pull-out the primary key to be provided on the prepared
>> statement.
>
> And what if there is a hash collision? Then you would not be able to
> determine which row is meant.
>
Even if we had a hash collision, each hash entry can have the original
key itself to be compared. But anyway, I love the idea to support
an opaque pointer to track particular remote-row rather.

> I agree with Robert that this should be flexible enough to cater for
> all kinds of row identifiers. Oracle, for example, uses ten byte
> identifiers which would give me a headache with your suggested design.
>
>> Do we have some other reasonable ideas?
>
> Would it be too invasive to introduce a new pointer in TupleTableSlot
> that is NULL for anything but virtual tuples from foreign tables?
>
I'm not certain whether the duration of TupleTableSlot is enough to
carry a private datum between scan and modify stage.
For example, the TupleTableSlot shall be cleared at ExecNestLoop
prior to the slot being delivered to ExecModifyTuple.

postgres=# EXPLAIN UPDATE t1 SET b = 'abcd' WHERE a IN (SELECT x FROM
t2 WHERE x % 2 = 0);
QUERY PLAN
-------------------------------------------------------------------------------
Update on t1 (cost=0.00..54.13 rows=6 width=16)
-> Nested Loop (cost=0.00..54.13 rows=6 width=16)
-> Seq Scan on t2 (cost=0.00..28.45 rows=6 width=10)
Filter: ((x % 2) = 0)
-> Index Scan using t1_pkey on t1 (cost=0.00..4.27 rows=1 width=10)
Index Cond: (a = t2.x)
(6 rows)

Is it possible to utilize ctid field to move a private pointer?
TID data type is internally represented as a pointer to ItemPointerData,
so it has enough width to track an opaque formed remote-row identifier;
including string, int64 or others.

One disadvantage is "ctid" system column shows a nonsense value
when user explicitly references this system column. But it does not
seems to me a fundamental problem, because we didn't give any
special meaning on the "ctid" field of foreign table.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>
Cc:	PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 09:07:31
Message-ID:	CADyhKSXz2gCBC+64+t3qz07TQqvB-sAVcfDb4C8mmgWMrR_zJQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/27 Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>:
> Kaigai-san,
>
> On Thu, Aug 23, 2012 at 2:10 PM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>> The patched portion at contrib/file_fdw.c does not make sense
>> actually. It just prints messages for each invocation.
>> It is just a proof-of-concept to show possibility of implementation
>> based on real RDBMS.
>
> Attached is a tar ball of pgsql_fdw. It's WIP and contains no
> document, but it would be enough for your PoC purpose. Usage and
> features are same as the last version posted for 9.2 cycle.
> # I'll post finished patch in the CF-Sep.
>
Thanks, it is helpful to work on.

> Here are random comments for your PoC patch:
>
> + As Robert says, using CTID as virtual tuple identifier doesn't seem
> nice when considering various FDWs for NoSQL or RDBMS. Having abstract
> layer between FDWs and tuple sounds better, but implementing it by each
> FDW seems useless effort. Do yo have any idea of generic mechanism for
> tuple mapping?
>
As I wrote in the previous message, isn't it a reasonable idea to move
a private datum (instead of alternate key) on the "ctid" field which has
been internally represented as a pointer to indicate ItemPointerData?

> + Do you have any plan about deparsing local qualifiers into remote
> query to avoid repeated query submission? This would improve
> performance of big UPDATE, but its use case might be limited to
> statements which consist of one foreign table. For this case, we can
> consider pass-through mode as second way.
>
I think, FDW should run UPDATE or DELETE statement at the scan
stage on remote-side, then return a pseudo result to scanner, in case
of the statement is "enough simple", like no qualifier, no returning, etc...
The callback on ExecUpdate/ExecDelete will perform just a stub; that
does not actually work except for increment of affected rows.

> + I have not read your patch closely yet, but I wonder how we can know
> which column is actually updated. If we have only updated image of
> tuple, we have to update all remote columns by "new" values?
>
It seems to me TargetEntry of the parse tree can inform us which column
should be modified on UPDATE or INSERT. If it has just a Var element
that reference original table as-is, it means here is no change.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	"Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 09:16:05
Message-ID:	D960CB61B694CF459DCFB4B0128514C2084EFD83@exadv11.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
>>>>> It is a responsibility of FDW extension (and DBA) to ensure each
>>>>> foreign-row has a unique identifier that has 48-bits width integer
>>>>> data type in maximum.

>>> For example, if primary key of the remote table is Text data type,
>>> an idea is to use a hash table to track the text-formed primary
>>> being associated with a particular 48-bits integer.

> Even if we had a hash collision, each hash entry can have the original
> key itself to be compared. But anyway, I love the idea to support
> an opaque pointer to track particular remote-row rather.

Me too.

>>> Do we have some other reasonable ideas?

> I'm not certain whether the duration of TupleTableSlot is enough to
> carry a private datum between scan and modify stage.

> Is it possible to utilize ctid field to move a private pointer?
> TID data type is internally represented as a pointer to
ItemPointerData,
> so it has enough width to track an opaque formed remote-row
identifier;
> including string, int64 or others.
>
> One disadvantage is "ctid" system column shows a nonsense value
> when user explicitly references this system column. But it does not
> seems to me a fundamental problem, because we didn't give any
> special meaning on the "ctid" field of foreign table.

I can't say if (ab)using the field that way would cause other
problems, but I don't think that "nonsense values" are a problem.
The pointer would stay the same for the duration of the foreign
scan, which I think is as good a ctid for a foreign table as
anybody should reasonably ask.

BTW, I see the following comment in htup.h:

* t_self and t_tableOid should be valid if the HeapTupleData points to
* a disk buffer, or if it represents a copy of a tuple on disk. They
* should be explicitly set invalid in manufactured tuples.

I don't know if "invalid" means "zero" in that case.

Yours,
Laurenz Albe

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 09:40:09
Message-ID:	CADyhKSWgyp4tRu6tf3a3wnbVU-9Gm+Z-668Wo6mScp7RMJgo0A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/28 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>>>>>> It is a responsibility of FDW extension (and DBA) to ensure each
>>>>>> foreign-row has a unique identifier that has 48-bits width integer
>>>>>> data type in maximum.
>
>>>> For example, if primary key of the remote table is Text data type,
>>>> an idea is to use a hash table to track the text-formed primary
>>>> being associated with a particular 48-bits integer.
>
>> Even if we had a hash collision, each hash entry can have the original
>> key itself to be compared. But anyway, I love the idea to support
>> an opaque pointer to track particular remote-row rather.
>
> Me too.
>
>>>> Do we have some other reasonable ideas?
>
>> I'm not certain whether the duration of TupleTableSlot is enough to
>> carry a private datum between scan and modify stage.
>
>> Is it possible to utilize ctid field to move a private pointer?
>> TID data type is internally represented as a pointer to
> ItemPointerData,
>> so it has enough width to track an opaque formed remote-row
> identifier;
>> including string, int64 or others.
>>
>> One disadvantage is "ctid" system column shows a nonsense value
>> when user explicitly references this system column. But it does not
>> seems to me a fundamental problem, because we didn't give any
>> special meaning on the "ctid" field of foreign table.
>
> I can't say if (ab)using the field that way would cause other
> problems, but I don't think that "nonsense values" are a problem.
> The pointer would stay the same for the duration of the foreign
> scan, which I think is as good a ctid for a foreign table as
> anybody should reasonably ask.
>
> BTW, I see the following comment in htup.h:
>
> * t_self and t_tableOid should be valid if the HeapTupleData points to
> * a disk buffer, or if it represents a copy of a tuple on disk. They
> * should be explicitly set invalid in manufactured tuples.
>
> I don't know if "invalid" means "zero" in that case.
>
ItemPointerSetInvalid is declared as follows:

/*
* ItemPointerSetInvalid
* Sets a disk item pointer to be invalid.
*/
#define ItemPointerSetInvalid(pointer) \
( \
AssertMacro(PointerIsValid(pointer)), \
BlockIdSet(&((pointer)->ip_blkid), InvalidBlockNumber), \
(pointer)->ip_posid = InvalidOffsetNumber \
)

Since ItemPointerGetBlockNumber() and ItemPointerGetOffsetNumber()
checks whether the given ItemPointer is valid, FDWs may have to put
a dummy ItemPointerData on head of their private datum to avoid
the first 6-bytes having zero.

For example, the following data structure is safe to carry an opaque
datum without false-positive of invalid ctid.

typedef struct {
ItemPointerData dumm
char *pk_of_remote_table;
} my_pseudo_rowid;

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 14:47:28
Message-ID:	388.1346165248@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> Would it be too invasive to introduce a new pointer in TupleTableSlot
>> that is NULL for anything but virtual tuples from foreign tables?

> I'm not certain whether the duration of TupleTableSlot is enough to
> carry a private datum between scan and modify stage.

It's not.

> Is it possible to utilize ctid field to move a private pointer?

UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
TID from scan to modify --- in fact, most of the time what the modify
step is going to get is a "virtual" TupleTableSlot that hasn't even
*got* a physical CTID field.

Instead, the planner arranges for the TID to be carried up as an
explicit resjunk column named ctid. (Currently this is done in
rewriteTargetListUD(), but see also preptlist.c which does some related
things for SELECT FOR UPDATE.)

I'm inclined to think that what we need here is for FDWs to be able to
modify the details of that behavior, at least to the extent of being
able to specify a different data type than TID for the row
identification column.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 14:58:25
Message-ID:	638.1346165905@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
> It seems to me TargetEntry of the parse tree can inform us which column
> should be modified on UPDATE or INSERT. If it has just a Var element
> that reference original table as-is, it means here is no change.

Only if you're not going to support BEFORE triggers modifying the row...

regards, tom lane

From:	David Fetter <david(at)fetter(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 15:01:48
Message-ID:	20120828150148.GC17812@fetter.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
> > It seems to me TargetEntry of the parse tree can inform us which column
> > should be modified on UPDATE or INSERT. If it has just a Var element
> > that reference original table as-is, it means here is no change.
>
> Only if you're not going to support BEFORE triggers modifying the row...

+1 for supporting these.

Speaking of triggers on foreign tables, what's needed to support them
independent of support at the FDW level for writing on foreign tables,
or does that even make sense?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	David Fetter <david(at)fetter(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 15:18:34
Message-ID:	CADyhKSVA2427QuV7SsxMgWj+mvRhxDbZny2zs+b5O55Cx-3pcw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/28 David Fetter <david(at)fetter(dot)org>:
> On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> > It seems to me TargetEntry of the parse tree can inform us which column
>> > should be modified on UPDATE or INSERT. If it has just a Var element
>> > that reference original table as-is, it means here is no change.
>>
>> Only if you're not going to support BEFORE triggers modifying the row...
>
> +1 for supporting these.
>
> Speaking of triggers on foreign tables, what's needed to support them
> independent of support at the FDW level for writing on foreign tables,
> or does that even make sense?
>
I agree with trigger support on foreign tables is definitely useful feature,
even though it does not have capability to replace the writable foreign
table functionality.

In case when foreign-table definition does not contain a column mapped
with primary-key column in remote-side, the trigger function cannot
determine which row should be updated / deleted.
It is a situation that FDW driver should track a particular remote-row using
its identifier.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	David Fetter <david(at)fetter(dot)org>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 15:25:14
Message-ID:	20120828152514.GD17812@fetter.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Aug 28, 2012 at 05:18:34PM +0200, Kohei KaiGai wrote:
> 2012/8/28 David Fetter <david(at)fetter(dot)org>:
> > On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
> >> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
> >> > It seems to me TargetEntry of the parse tree can inform us
> >> > which column should be modified on UPDATE or INSERT. If it has
> >> > just a Var element that reference original table as-is, it
> >> > means here is no change.
> >>
> >> Only if you're not going to support BEFORE triggers modifying the
> >> row...
> >
> > +1 for supporting these.
> >
> > Speaking of triggers on foreign tables, what's needed to support
> > them independent of support at the FDW level for writing on
> > foreign tables, or does that even make sense?
> >
> I agree with trigger support on foreign tables is definitely useful
> feature, even though it does not have capability to replace the
> writable foreign table functionality.

With utmost respect, trigger support does make it possible to write to
foreign tables using a whole-row comparison with the effect that all
whole-row matches would be affected. This is how DBI-Link does it
currently.

> In case when foreign-table definition does not contain a column
> mapped with primary-key column in remote-side, the trigger function
> cannot determine which row should be updated / deleted. It is a
> situation that FDW driver should track a particular remote-row using
> its identifier.

Generated identifiers and whole-row matching are two ways to approach
this. There are likely others, especially in cases where people have
special knowledge of the remote source.

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 15:47:23
Message-ID:	CADyhKSWJYqmesTRps8LoGEzRG-En++tjtUdTBFZ_Rg4TnwxjJg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>>> Would it be too invasive to introduce a new pointer in TupleTableSlot
>>> that is NULL for anything but virtual tuples from foreign tables?
>
>> I'm not certain whether the duration of TupleTableSlot is enough to
>> carry a private datum between scan and modify stage.
>
> It's not.
>
>> Is it possible to utilize ctid field to move a private pointer?
>
> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
> TID from scan to modify --- in fact, most of the time what the modify
> step is going to get is a "virtual" TupleTableSlot that hasn't even
> *got* a physical CTID field.
>
> Instead, the planner arranges for the TID to be carried up as an
> explicit resjunk column named ctid. (Currently this is done in
> rewriteTargetListUD(), but see also preptlist.c which does some related
> things for SELECT FOR UPDATE.)
>
> I'm inclined to think that what we need here is for FDWs to be able to
> modify the details of that behavior, at least to the extent of being
> able to specify a different data type than TID for the row
> identification column.
>
Hmm. It seems to me a straight-forward solution rather than ab-use
of ctid system column. Probably, cstring data type is more suitable
to carry a private datum between scan and modify stage.

One problem I noticed is how FDW driver returns an extra field that
is in neither system nor regular column.
Number of columns and its data type are defined with TupleDesc of
the target foreign-table, so we also need a feature to extend it on
run-time. For example, FDW driver may have to be able to extend
a "virtual" column with cstring data type, even though the target
foreign table does not have such a column.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	David Fetter <david(at)fetter(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 16:08:59
Message-ID:	CADyhKSUKuGj=TAUYub2yShamJxMccViGPDYvhhpdQi3mdYENRA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/28 David Fetter <david(at)fetter(dot)org>:
> On Tue, Aug 28, 2012 at 05:18:34PM +0200, Kohei KaiGai wrote:
>> 2012/8/28 David Fetter <david(at)fetter(dot)org>:
>> > On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
>> >> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> >> > It seems to me TargetEntry of the parse tree can inform us
>> >> > which column should be modified on UPDATE or INSERT. If it has
>> >> > just a Var element that reference original table as-is, it
>> >> > means here is no change.
>> >>
>> >> Only if you're not going to support BEFORE triggers modifying the
>> >> row...
>> >
>> > +1 for supporting these.
>> >
>> > Speaking of triggers on foreign tables, what's needed to support
>> > them independent of support at the FDW level for writing on
>> > foreign tables, or does that even make sense?
>> >
>> I agree with trigger support on foreign tables is definitely useful
>> feature, even though it does not have capability to replace the
>> writable foreign table functionality.
>
> With utmost respect, trigger support does make it possible to write to
> foreign tables using a whole-row comparison with the effect that all
> whole-row matches would be affected. This is how DBI-Link does it
> currently.
>
>> In case when foreign-table definition does not contain a column
>> mapped with primary-key column in remote-side, the trigger function
>> cannot determine which row should be updated / deleted. It is a
>> situation that FDW driver should track a particular remote-row using
>> its identifier.
>
> Generated identifiers and whole-row matching are two ways to approach
> this. There are likely others, especially in cases where people have
> special knowledge of the remote source.
>
One major problem is how to carry the generated identifiers on run-time,
even though we have no slot except for system and regular columns
defined in TupleDesc of the target foreign tables.
It may need a feature to expand TupleDesc on demand.

Of course, I don't deny the benefit of trigger support on foreign-tables.
Both writable-feature and trigger-support can be supported simultaneously.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	David Fetter <david(at)fetter(dot)org>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-28 16:30:35
Message-ID:	20120828163035.GE17812@fetter.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Aug 28, 2012 at 06:08:59PM +0200, Kohei KaiGai wrote:
> 2012/8/28 David Fetter <david(at)fetter(dot)org>:
> > On Tue, Aug 28, 2012 at 05:18:34PM +0200, Kohei KaiGai wrote:
> >> 2012/8/28 David Fetter <david(at)fetter(dot)org>:
> >> > On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
> >> >> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
> >> >> > It seems to me TargetEntry of the parse tree can inform us
> >> >> > which column should be modified on UPDATE or INSERT. If it has
> >> >> > just a Var element that reference original table as-is, it
> >> >> > means here is no change.
> >> >>
> >> >> Only if you're not going to support BEFORE triggers modifying the
> >> >> row...
> >> >
> >> > +1 for supporting these.
> >
> > Generated identifiers and whole-row matching are two ways to approach
> > this. There are likely others, especially in cases where people have
> > special knowledge of the remote source.
> >
> One major problem is how to carry the generated identifiers on run-time,
> even though we have no slot except for system and regular columns
> defined in TupleDesc of the target foreign tables.
> It may need a feature to expand TupleDesc on demand.

Could be. You know a lot more about the implementation details than I do.

> Of course, I don't deny the benefit of trigger support on foreign-tables.
> Both writable-feature and trigger-support can be supported simultaneously.

Do you see these as independent features, or is there some essential
overlap?

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-29 09:13:21
Message-ID:	CADyhKSXKxET7A5sKKGCS3RO84srkzKMgD_jTQrtGnyEDfmdS_w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/28 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> 2012/8/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>>>> Would it be too invasive to introduce a new pointer in TupleTableSlot
>>>> that is NULL for anything but virtual tuples from foreign tables?
>>
>>> I'm not certain whether the duration of TupleTableSlot is enough to
>>> carry a private datum between scan and modify stage.
>>
>> It's not.
>>
>>> Is it possible to utilize ctid field to move a private pointer?
>>
>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
>> TID from scan to modify --- in fact, most of the time what the modify
>> step is going to get is a "virtual" TupleTableSlot that hasn't even
>> *got* a physical CTID field.
>>
>> Instead, the planner arranges for the TID to be carried up as an
>> explicit resjunk column named ctid. (Currently this is done in
>> rewriteTargetListUD(), but see also preptlist.c which does some related
>> things for SELECT FOR UPDATE.)
>>
>> I'm inclined to think that what we need here is for FDWs to be able to
>> modify the details of that behavior, at least to the extent of being
>> able to specify a different data type than TID for the row
>> identification column.
>>
> Hmm. It seems to me a straight-forward solution rather than ab-use
> of ctid system column. Probably, cstring data type is more suitable
> to carry a private datum between scan and modify stage.
>
> One problem I noticed is how FDW driver returns an extra field that
> is in neither system nor regular column.
> Number of columns and its data type are defined with TupleDesc of
> the target foreign-table, so we also need a feature to extend it on
> run-time. For example, FDW driver may have to be able to extend
> a "virtual" column with cstring data type, even though the target
> foreign table does not have such a column.
>
I tried to investigate the related routines.

TupleDesc of TupleTableSlot associated with ForeignScanState
is initialized at ExecInitForeignScan as literal.
ExecAssignScanType assigns TupleDesc of the target foreign-
table on tts_tupleDescriptor, "as-is".
It is the reason why IterateForeignScan cannot return a private
datum except for the columns being declared as regular ones.

Confrontation between ForeignScan and SubqueryScan tell us
the point to be extended. It assigns TupleDesc of the subplan
generated at run-time.
It seems to me ForeignScan will be able to adopt similar idea;
that allows to append "pseudo-column" onto TupleDesc to
carry identifier of remote-rows to be updated / deleted, if
FDW driver can control TupleDesc being set, instead of the
one come from relation's definition as-is.

Any comment please. Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	David Fetter <david(at)fetter(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-08-29 09:18:56
Message-ID:	CADyhKSVr-EEZ-2Wx1mWcUqk=50_-5Vn8jB_guP6D=WH7bUn8MQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/28 David Fetter <david(at)fetter(dot)org>:
> On Tue, Aug 28, 2012 at 06:08:59PM +0200, Kohei KaiGai wrote:
>> 2012/8/28 David Fetter <david(at)fetter(dot)org>:
>> > On Tue, Aug 28, 2012 at 05:18:34PM +0200, Kohei KaiGai wrote:
>> >> 2012/8/28 David Fetter <david(at)fetter(dot)org>:
>> >> > On Tue, Aug 28, 2012 at 10:58:25AM -0400, Tom Lane wrote:
>> >> >> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> >> >> > It seems to me TargetEntry of the parse tree can inform us
>> >> >> > which column should be modified on UPDATE or INSERT. If it has
>> >> >> > just a Var element that reference original table as-is, it
>> >> >> > means here is no change.
>> >> >>
>> >> >> Only if you're not going to support BEFORE triggers modifying the
>> >> >> row...
>> >> >
>> >> > +1 for supporting these.
>> >
>> > Generated identifiers and whole-row matching are two ways to approach
>> > this. There are likely others, especially in cases where people have
>> > special knowledge of the remote source.
>> >
>> One major problem is how to carry the generated identifiers on run-time,
>> even though we have no slot except for system and regular columns
>> defined in TupleDesc of the target foreign tables.
>> It may need a feature to expand TupleDesc on demand.
>
> Could be. You know a lot more about the implementation details than I do.
>
>> Of course, I don't deny the benefit of trigger support on foreign-tables.
>> Both writable-feature and trigger-support can be supported simultaneously.
>
> Do you see these as independent features, or is there some essential
> overlap?
>
If we stand on the viewpoint that foreign-tables should perform as if regular
tables, I don't think its writer feature should depend on trigger stuff.
They can work independently.

On the other hand, trigger feature gives users flexibility to control the data
to be written, as if regular tables. We shouldn't miss the point.
At least, I don't think we have some technical differences to support row-level
triggers on foreign tables.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Tom Lane EXTERN" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Kohei KaiGai" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	"Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-09-13 13:46:02
Message-ID:	D960CB61B694CF459DCFB4B0128514C2086C1360@exadv11.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:
> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> Laurenz Albe wrote:
>>> Would it be too invasive to introduce a new pointer in
TupleTableSlot
>>> that is NULL for anything but virtual tuples from foreign tables?

>> I'm not certain whether the duration of TupleTableSlot is enough to
>> carry a private datum between scan and modify stage.

> It's not.

>> Is it possible to utilize ctid field to move a private pointer?

> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
the
> TID from scan to modify --- in fact, most of the time what the modify
> step is going to get is a "virtual" TupleTableSlot that hasn't even
> *got* a physical CTID field.
>
> Instead, the planner arranges for the TID to be carried up as an
> explicit resjunk column named ctid. (Currently this is done in
> rewriteTargetListUD(), but see also preptlist.c which does some
related
> things for SELECT FOR UPDATE.)
>
> I'm inclined to think that what we need here is for FDWs to be able to
> modify the details of that behavior, at least to the extent of being
> able to specify a different data type than TID for the row
> identification column.

Would that imply inventing a new system attribute for
"foreign tid"?

Yours,
Laurenz Albe

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	"Tom Lane EXTERN" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-09-13 14:09:20
Message-ID:	CADyhKSXCAW6hWZVPvVL0njp1BzeyMv29aF0FoeQbsHHHEi4M9A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/9/13 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Tom Lane wrote:
>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>>> Laurenz Albe wrote:
>>>> Would it be too invasive to introduce a new pointer in
> TupleTableSlot
>>>> that is NULL for anything but virtual tuples from foreign tables?
>
>>> I'm not certain whether the duration of TupleTableSlot is enough to
>>> carry a private datum between scan and modify stage.
>
>> It's not.
>
>>> Is it possible to utilize ctid field to move a private pointer?
>
>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
> the
>> TID from scan to modify --- in fact, most of the time what the modify
>> step is going to get is a "virtual" TupleTableSlot that hasn't even
>> *got* a physical CTID field.
>>
>> Instead, the planner arranges for the TID to be carried up as an
>> explicit resjunk column named ctid. (Currently this is done in
>> rewriteTargetListUD(), but see also preptlist.c which does some
> related
>> things for SELECT FOR UPDATE.)
>>
>> I'm inclined to think that what we need here is for FDWs to be able to
>> modify the details of that behavior, at least to the extent of being
>> able to specify a different data type than TID for the row
>> identification column.
>
> Would that imply inventing a new system attribute for
> "foreign tid"?
>
It is an idea to implement this feature with minimum code side.

However, my preference is to support "pseudo-column" approach
rather than system columns, because it also can be utilized for
another interesting feature that enables to push-down target entry
onto remote side.
So, I'd like to try to support a feature that allows foreign-table to
return "pseudo-column" in addition to its table definition to move
row-id of remote tuples, as primary purpose of this.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	"Kohei KaiGai" <kaigai(at)kaigai(dot)gr(dot)jp>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-09-13 14:32:54
Message-ID:	26384.1347546774@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
> Tom Lane wrote:
>> Instead, the planner arranges for the TID to be carried up as an
>> explicit resjunk column named ctid. (Currently this is done in
>> rewriteTargetListUD(), but see also preptlist.c which does some
>> related things for SELECT FOR UPDATE.)
>>
>> I'm inclined to think that what we need here is for FDWs to be able to
>> modify the details of that behavior, at least to the extent of being
>> able to specify a different data type than TID for the row
>> identification column.

> Would that imply inventing a new system attribute for
> "foreign tid"?

No, I think you missed the point of what I wrote completely. The target
row ID is not treated as a system attribute during UPDATE/DELETE. It's
an ordinary data column that's silently added to what the user wrote.

regards, tom lane

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-09-23 06:25:44
Message-ID:	CADyhKSW9Y4tOPo7SSsKWUuHnVezrGOKDJXH5XQcNm1w+EFTqug@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/8/29 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> 2012/8/28 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> 2012/8/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>>>>> Would it be too invasive to introduce a new pointer in TupleTableSlot
>>>>> that is NULL for anything but virtual tuples from foreign tables?
>>>
>>>> I'm not certain whether the duration of TupleTableSlot is enough to
>>>> carry a private datum between scan and modify stage.
>>>
>>> It's not.
>>>
>>>> Is it possible to utilize ctid field to move a private pointer?
>>>
>>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
>>> TID from scan to modify --- in fact, most of the time what the modify
>>> step is going to get is a "virtual" TupleTableSlot that hasn't even
>>> *got* a physical CTID field.
>>>
>>> Instead, the planner arranges for the TID to be carried up as an
>>> explicit resjunk column named ctid. (Currently this is done in
>>> rewriteTargetListUD(), but see also preptlist.c which does some related
>>> things for SELECT FOR UPDATE.)
>>>
>>> I'm inclined to think that what we need here is for FDWs to be able to
>>> modify the details of that behavior, at least to the extent of being
>>> able to specify a different data type than TID for the row
>>> identification column.
>>>
>> Hmm. It seems to me a straight-forward solution rather than ab-use
>> of ctid system column. Probably, cstring data type is more suitable
>> to carry a private datum between scan and modify stage.
>>
>> One problem I noticed is how FDW driver returns an extra field that
>> is in neither system nor regular column.
>> Number of columns and its data type are defined with TupleDesc of
>> the target foreign-table, so we also need a feature to extend it on
>> run-time. For example, FDW driver may have to be able to extend
>> a "virtual" column with cstring data type, even though the target
>> foreign table does not have such a column.
>>
> I tried to investigate the related routines.
>
> TupleDesc of TupleTableSlot associated with ForeignScanState
> is initialized at ExecInitForeignScan as literal.
> ExecAssignScanType assigns TupleDesc of the target foreign-
> table on tts_tupleDescriptor, "as-is".
> It is the reason why IterateForeignScan cannot return a private
> datum except for the columns being declared as regular ones.
>
The attached patch improved its design according to the upthread
discussion. It now got away from ab-use of "ctid" field, and adopts
a concept of pseudo-column to hold row-id with opaque data type
instead.

Pseudo-column is Var reference towards attribute-number larger
than number of attributes on the target relation; thus, it is not
a substantial object. It is normally unavailable to reference such
a larger attribute number because TupleDesc of each ScanState
associated with a particular relation is initialized at ExecInitNode.

The patched ExecInitForeignScan was extended to generate its
own TupleDesc including pseudo-column definitions on the fly,
instead of relation's one, when scan-plan of foreign-table requires
to have pseudo-columns.

Right now, the only possible pseudo-column is "rowid" being
injected at rewriteTargetListUD(). It has no data format
restriction like "ctid" because of VOID data type.
FDW extension can set an appropriate value on the "rowid"
field in addition to contents of regular columns at
IterateForeignScan method, to track which remote row should
be updated or deleted.

Another possible usage of this pseudo-column is push-down
of target-list including complex calculation. It may enable to
move complex mathematical formula into remote devices
(such as GPU device?) instead of just a reference of Var node.

This patch adds a new interface: GetForeignRelInfo being invoked
from get_relation_info() to adjust width of RelOptInfo->attr_needed
according to the target-list which may contain "rowid" pseudo-column.
Some FDW extension may use this interface to push-down a part of
target list into remote side, even though I didn't implement this
feature on file_fdw.

RelOptInfo->max_attr is a good marker whether the plan shall have
pseudo-column reference. Then, ExecInitForeignScan determines
whether it should generate a TupleDesc, or not.

The "rowid" is fetched using ExecGetJunkAttribute as we are currently
doing on regular tables using "ctid", then it shall be delivered to
ExecUpdate or ExecDelete. We can never expect the fist argument of
them now, so "ItemPointer tupleid" redefined to "Datum rowid", and
argument of BR-trigger routines redefined also.

[kaigai(at)iwashi sepgsql]$ cat ~/testfile.csv
10 aaa
11 bbb
12 ccc
13 ddd
14 eee
15 fff
[kaigai(at)iwashi sepgsql]$ psql postgres
psql (9.3devel)
Type "help" for help.

postgres=# UPDATE ftbl SET b = md5(b) WHERE a > 12 RETURNING *;
INFO: ftbl is the target relation of UPDATE
INFO: fdw_file: BeginForeignModify method
INFO: fdw_file: UPDATE (lineno = 4)
INFO: fdw_file: UPDATE (lineno = 5)
INFO: fdw_file: UPDATE (lineno = 6)
INFO: fdw_file: EndForeignModify method
a | b
----+----------------------------------
13 | 77963b7a931377ad4ab5ad6a9cd718aa
14 | d2f2297d6e829cd3493aa7de4416a18f
15 | 343d9040a671c45832ee5381860e2996
(3 rows)

UPDATE 3
postgres=# DELETE FROM ftbl WHERE a % 2 = 1 RETURNING *;
INFO: ftbl is the target relation of DELETE
INFO: fdw_file: BeginForeignModify method
INFO: fdw_file: DELETE (lineno = 2)
INFO: fdw_file: DELETE (lineno = 4)
INFO: fdw_file: DELETE (lineno = 6)
INFO: fdw_file: EndForeignModify method
a | b
----+-----
11 | bbb
13 | ddd
15 | fff
(3 rows)

DELETE 3

In addition, there is a small improvement. ExecForeignInsert,
ExecForeignUpdate and ExecForeignDelete get being able
to return number of processed rows; that allows to push-down
whole the statement into remote-side, if it is enough simple
(e.g, delete statement without any condition).

Even though it does not make matter right now, pseudo-columns
should be adjusted when foreign-table is referenced with table
inheritance feature, because an attribute number being enough
large in parent table is not enough large in child table.
We need to fix up them until foreign table feature got inheritance
capability.

I didn't update the documentation stuff because I positioned
the state of this patch as proof-of-concept now. Please note that.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v2.patch	application/octet-stream	46.2 KB

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-09-24 08:49:16
Message-ID:	CADyhKSX1wzwpaMWxC-1t1O9BM8Hhgu6Re2e+=bw2FNcVOZWAtw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/9/23 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> 2012/8/29 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> 2012/8/28 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>>> 2012/8/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>>>>>> Would it be too invasive to introduce a new pointer in TupleTableSlot
>>>>>> that is NULL for anything but virtual tuples from foreign tables?
>>>>
>>>>> I'm not certain whether the duration of TupleTableSlot is enough to
>>>>> carry a private datum between scan and modify stage.
>>>>
>>>> It's not.
>>>>
>>>>> Is it possible to utilize ctid field to move a private pointer?
>>>>
>>>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
>>>> TID from scan to modify --- in fact, most of the time what the modify
>>>> step is going to get is a "virtual" TupleTableSlot that hasn't even
>>>> *got* a physical CTID field.
>>>>
>>>> Instead, the planner arranges for the TID to be carried up as an
>>>> explicit resjunk column named ctid. (Currently this is done in
>>>> rewriteTargetListUD(), but see also preptlist.c which does some related
>>>> things for SELECT FOR UPDATE.)
>>>>
>>>> I'm inclined to think that what we need here is for FDWs to be able to
>>>> modify the details of that behavior, at least to the extent of being
>>>> able to specify a different data type than TID for the row
>>>> identification column.
>>>>
>>> Hmm. It seems to me a straight-forward solution rather than ab-use
>>> of ctid system column. Probably, cstring data type is more suitable
>>> to carry a private datum between scan and modify stage.
>>>
>>> One problem I noticed is how FDW driver returns an extra field that
>>> is in neither system nor regular column.
>>> Number of columns and its data type are defined with TupleDesc of
>>> the target foreign-table, so we also need a feature to extend it on
>>> run-time. For example, FDW driver may have to be able to extend
>>> a "virtual" column with cstring data type, even though the target
>>> foreign table does not have such a column.
>>>
>> I tried to investigate the related routines.
>>
>> TupleDesc of TupleTableSlot associated with ForeignScanState
>> is initialized at ExecInitForeignScan as literal.
>> ExecAssignScanType assigns TupleDesc of the target foreign-
>> table on tts_tupleDescriptor, "as-is".
>> It is the reason why IterateForeignScan cannot return a private
>> datum except for the columns being declared as regular ones.
>>
> The attached patch improved its design according to the upthread
> discussion. It now got away from ab-use of "ctid" field, and adopts
> a concept of pseudo-column to hold row-id with opaque data type
> instead.
>
> Pseudo-column is Var reference towards attribute-number larger
> than number of attributes on the target relation; thus, it is not
> a substantial object. It is normally unavailable to reference such
> a larger attribute number because TupleDesc of each ScanState
> associated with a particular relation is initialized at ExecInitNode.
>
> The patched ExecInitForeignScan was extended to generate its
> own TupleDesc including pseudo-column definitions on the fly,
> instead of relation's one, when scan-plan of foreign-table requires
> to have pseudo-columns.
>
> Right now, the only possible pseudo-column is "rowid" being
> injected at rewriteTargetListUD(). It has no data format
> restriction like "ctid" because of VOID data type.
> FDW extension can set an appropriate value on the "rowid"
> field in addition to contents of regular columns at
> IterateForeignScan method, to track which remote row should
> be updated or deleted.
>
> Another possible usage of this pseudo-column is push-down
> of target-list including complex calculation. It may enable to
> move complex mathematical formula into remote devices
> (such as GPU device?) instead of just a reference of Var node.
>
> This patch adds a new interface: GetForeignRelInfo being invoked
> from get_relation_info() to adjust width of RelOptInfo->attr_needed
> according to the target-list which may contain "rowid" pseudo-column.
> Some FDW extension may use this interface to push-down a part of
> target list into remote side, even though I didn't implement this
> feature on file_fdw.
>
> RelOptInfo->max_attr is a good marker whether the plan shall have
> pseudo-column reference. Then, ExecInitForeignScan determines
> whether it should generate a TupleDesc, or not.
>
> The "rowid" is fetched using ExecGetJunkAttribute as we are currently
> doing on regular tables using "ctid", then it shall be delivered to
> ExecUpdate or ExecDelete. We can never expect the fist argument of
> them now, so "ItemPointer tupleid" redefined to "Datum rowid", and
> argument of BR-trigger routines redefined also.
>
> [kaigai(at)iwashi sepgsql]$ cat ~/testfile.csv
> 10 aaa
> 11 bbb
> 12 ccc
> 13 ddd
> 14 eee
> 15 fff
> [kaigai(at)iwashi sepgsql]$ psql postgres
> psql (9.3devel)
> Type "help" for help.
>
> postgres=# UPDATE ftbl SET b = md5(b) WHERE a > 12 RETURNING *;
> INFO: ftbl is the target relation of UPDATE
> INFO: fdw_file: BeginForeignModify method
> INFO: fdw_file: UPDATE (lineno = 4)
> INFO: fdw_file: UPDATE (lineno = 5)
> INFO: fdw_file: UPDATE (lineno = 6)
> INFO: fdw_file: EndForeignModify method
> a | b
> ----+----------------------------------
> 13 | 77963b7a931377ad4ab5ad6a9cd718aa
> 14 | d2f2297d6e829cd3493aa7de4416a18f
> 15 | 343d9040a671c45832ee5381860e2996
> (3 rows)
>
> UPDATE 3
> postgres=# DELETE FROM ftbl WHERE a % 2 = 1 RETURNING *;
> INFO: ftbl is the target relation of DELETE
> INFO: fdw_file: BeginForeignModify method
> INFO: fdw_file: DELETE (lineno = 2)
> INFO: fdw_file: DELETE (lineno = 4)
> INFO: fdw_file: DELETE (lineno = 6)
> INFO: fdw_file: EndForeignModify method
> a | b
> ----+-----
> 11 | bbb
> 13 | ddd
> 15 | fff
> (3 rows)
>
> DELETE 3
>
> In addition, there is a small improvement. ExecForeignInsert,
> ExecForeignUpdate and ExecForeignDelete get being able
> to return number of processed rows; that allows to push-down
> whole the statement into remote-side, if it is enough simple
> (e.g, delete statement without any condition).
>
> Even though it does not make matter right now, pseudo-columns
> should be adjusted when foreign-table is referenced with table
> inheritance feature, because an attribute number being enough
> large in parent table is not enough large in child table.
> We need to fix up them until foreign table feature got inheritance
> capability.
>
> I didn't update the documentation stuff because I positioned
> the state of this patch as proof-of-concept now. Please note that.
>
A tiny bit of this patch was updated. I noticed INTERNAL data type
is more suitable to move a private datum from scan-stage to
modify-stage, because its type-length is declared as SIZEOF_POINTER.

Also, I fixed up some obvious compiler warnings.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v3.patch	application/octet-stream	46.3 KB

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-02 21:07:39
Message-ID:	CAPpHfds=9aJK2OPjJJzER4XXa43YFDHckuAb3pmCdNpCVfZMZw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Sep 24, 2012 at 12:49 PM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:

> 2012/9/23 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> > 2012/8/29 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> >> 2012/8/28 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> >>> 2012/8/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> >>>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
> >>>>>> Would it be too invasive to introduce a new pointer in
> TupleTableSlot
> >>>>>> that is NULL for anything but virtual tuples from foreign tables?
> >>>>
> >>>>> I'm not certain whether the duration of TupleTableSlot is enough to
> >>>>> carry a private datum between scan and modify stage.
> >>>>
> >>>> It's not.
> >>>>
> >>>>> Is it possible to utilize ctid field to move a private pointer?
> >>>>
> >>>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
> the
> >>>> TID from scan to modify --- in fact, most of the time what the modify
> >>>> step is going to get is a "virtual" TupleTableSlot that hasn't even
> >>>> *got* a physical CTID field.
> >>>>
> >>>> Instead, the planner arranges for the TID to be carried up as an
> >>>> explicit resjunk column named ctid. (Currently this is done in
> >>>> rewriteTargetListUD(), but see also preptlist.c which does some
> related
> >>>> things for SELECT FOR UPDATE.)
> >>>>
> >>>> I'm inclined to think that what we need here is for FDWs to be able to
> >>>> modify the details of that behavior, at least to the extent of being
> >>>> able to specify a different data type than TID for the row
> >>>> identification column.
> >>>>
> >>> Hmm. It seems to me a straight-forward solution rather than ab-use
> >>> of ctid system column. Probably, cstring data type is more suitable
> >>> to carry a private datum between scan and modify stage.
> >>>
> >>> One problem I noticed is how FDW driver returns an extra field that
> >>> is in neither system nor regular column.
> >>> Number of columns and its data type are defined with TupleDesc of
> >>> the target foreign-table, so we also need a feature to extend it on
> >>> run-time. For example, FDW driver may have to be able to extend
> >>> a "virtual" column with cstring data type, even though the target
> >>> foreign table does not have such a column.
> >>>
> >> I tried to investigate the related routines.
> >>
> >> TupleDesc of TupleTableSlot associated with ForeignScanState
> >> is initialized at ExecInitForeignScan as literal.
> >> ExecAssignScanType assigns TupleDesc of the target foreign-
> >> table on tts_tupleDescriptor, "as-is".
> >> It is the reason why IterateForeignScan cannot return a private
> >> datum except for the columns being declared as regular ones.
> >>
> > The attached patch improved its design according to the upthread
> > discussion. It now got away from ab-use of "ctid" field, and adopts
> > a concept of pseudo-column to hold row-id with opaque data type
> > instead.
> >
> > Pseudo-column is Var reference towards attribute-number larger
> > than number of attributes on the target relation; thus, it is not
> > a substantial object. It is normally unavailable to reference such
> > a larger attribute number because TupleDesc of each ScanState
> > associated with a particular relation is initialized at ExecInitNode.
> >
> > The patched ExecInitForeignScan was extended to generate its
> > own TupleDesc including pseudo-column definitions on the fly,
> > instead of relation's one, when scan-plan of foreign-table requires
> > to have pseudo-columns.
> >
> > Right now, the only possible pseudo-column is "rowid" being
> > injected at rewriteTargetListUD(). It has no data format
> > restriction like "ctid" because of VOID data type.
> > FDW extension can set an appropriate value on the "rowid"
> > field in addition to contents of regular columns at
> > IterateForeignScan method, to track which remote row should
> > be updated or deleted.
> >
> > Another possible usage of this pseudo-column is push-down
> > of target-list including complex calculation. It may enable to
> > move complex mathematical formula into remote devices
> > (such as GPU device?) instead of just a reference of Var node.
> >
> > This patch adds a new interface: GetForeignRelInfo being invoked
> > from get_relation_info() to adjust width of RelOptInfo->attr_needed
> > according to the target-list which may contain "rowid" pseudo-column.
> > Some FDW extension may use this interface to push-down a part of
> > target list into remote side, even though I didn't implement this
> > feature on file_fdw.
> >
> > RelOptInfo->max_attr is a good marker whether the plan shall have
> > pseudo-column reference. Then, ExecInitForeignScan determines
> > whether it should generate a TupleDesc, or not.
> >
> > The "rowid" is fetched using ExecGetJunkAttribute as we are currently
> > doing on regular tables using "ctid", then it shall be delivered to
> > ExecUpdate or ExecDelete. We can never expect the fist argument of
> > them now, so "ItemPointer tupleid" redefined to "Datum rowid", and
> > argument of BR-trigger routines redefined also.
> >
> > [kaigai(at)iwashi sepgsql]$ cat ~/testfile.csv
> > 10 aaa
> > 11 bbb
> > 12 ccc
> > 13 ddd
> > 14 eee
> > 15 fff
> > [kaigai(at)iwashi sepgsql]$ psql postgres
> > psql (9.3devel)
> > Type "help" for help.
> >
> > postgres=# UPDATE ftbl SET b = md5(b) WHERE a > 12 RETURNING *;
> > INFO: ftbl is the target relation of UPDATE
> > INFO: fdw_file: BeginForeignModify method
> > INFO: fdw_file: UPDATE (lineno = 4)
> > INFO: fdw_file: UPDATE (lineno = 5)
> > INFO: fdw_file: UPDATE (lineno = 6)
> > INFO: fdw_file: EndForeignModify method
> > a | b
> > ----+----------------------------------
> > 13 | 77963b7a931377ad4ab5ad6a9cd718aa
> > 14 | d2f2297d6e829cd3493aa7de4416a18f
> > 15 | 343d9040a671c45832ee5381860e2996
> > (3 rows)
> >
> > UPDATE 3
> > postgres=# DELETE FROM ftbl WHERE a % 2 = 1 RETURNING *;
> > INFO: ftbl is the target relation of DELETE
> > INFO: fdw_file: BeginForeignModify method
> > INFO: fdw_file: DELETE (lineno = 2)
> > INFO: fdw_file: DELETE (lineno = 4)
> > INFO: fdw_file: DELETE (lineno = 6)
> > INFO: fdw_file: EndForeignModify method
> > a | b
> > ----+-----
> > 11 | bbb
> > 13 | ddd
> > 15 | fff
> > (3 rows)
> >
> > DELETE 3
> >
> > In addition, there is a small improvement. ExecForeignInsert,
> > ExecForeignUpdate and ExecForeignDelete get being able
> > to return number of processed rows; that allows to push-down
> > whole the statement into remote-side, if it is enough simple
> > (e.g, delete statement without any condition).
> >
> > Even though it does not make matter right now, pseudo-columns
> > should be adjusted when foreign-table is referenced with table
> > inheritance feature, because an attribute number being enough
> > large in parent table is not enough large in child table.
> > We need to fix up them until foreign table feature got inheritance
> > capability.
> >
> > I didn't update the documentation stuff because I positioned
> > the state of this patch as proof-of-concept now. Please note that.
> >
> A tiny bit of this patch was updated. I noticed INTERNAL data type
> is more suitable to move a private datum from scan-stage to
> modify-stage, because its type-length is declared as SIZEOF_POINTER.
>
> Also, I fixed up some obvious compiler warnings.

I've read previous discussion about this patch. It's generally concentrated
on the question how to identify foreign table row? Your last patch
introduce "rowid" pseudo-column for foreign table row identification. My
notes are following:
1) AFAICS your patch are designed to support arbitrary number of
pseudo-columns while only one is currently used. Do you see more use cases
of pseudo-columns?
2) You wrote that FDW can support or don't support write depending on
having corresponding functions. However it's likely some tables of same FDW
could be writable while another are not. I think we should have some
mechanism for FDW telling whether particular table is writable.
3) I have another point about identification of foreign rows which I didn't
meet in previous discussion. What if we restrict writable FDW table to have
set of column which are unique identifier of a row. Many replication
systems have this restriction: relicated tables should have a unique key.
In case of text or csv file I don't see why making line number column user
visible is bad.

------
With best regards,
Alexander Korotkov.

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-04 08:08:57
Message-ID:	CADyhKSX3nkbKXJ3An31Z1wY8q+ZWuakVs+z2LRgsnPJX391yPw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/11/2 Alexander Korotkov <aekorotkov(at)gmail(dot)com>:
> On Mon, Sep 24, 2012 at 12:49 PM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>>
>> 2012/9/23 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> > 2012/8/29 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> >> 2012/8/28 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> >>> 2012/8/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> >>>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> >>>>>> Would it be too invasive to introduce a new pointer in
>> >>>>>> TupleTableSlot
>> >>>>>> that is NULL for anything but virtual tuples from foreign tables?
>> >>>>
>> >>>>> I'm not certain whether the duration of TupleTableSlot is enough to
>> >>>>> carry a private datum between scan and modify stage.
>> >>>>
>> >>>> It's not.
>> >>>>
>> >>>>> Is it possible to utilize ctid field to move a private pointer?
>> >>>>
>> >>>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
>> >>>> the
>> >>>> TID from scan to modify --- in fact, most of the time what the modify
>> >>>> step is going to get is a "virtual" TupleTableSlot that hasn't even
>> >>>> *got* a physical CTID field.
>> >>>>
>> >>>> Instead, the planner arranges for the TID to be carried up as an
>> >>>> explicit resjunk column named ctid. (Currently this is done in
>> >>>> rewriteTargetListUD(), but see also preptlist.c which does some
>> >>>> related
>> >>>> things for SELECT FOR UPDATE.)
>> >>>>
>> >>>> I'm inclined to think that what we need here is for FDWs to be able
>> >>>> to
>> >>>> modify the details of that behavior, at least to the extent of being
>> >>>> able to specify a different data type than TID for the row
>> >>>> identification column.
>> >>>>
>> >>> Hmm. It seems to me a straight-forward solution rather than ab-use
>> >>> of ctid system column. Probably, cstring data type is more suitable
>> >>> to carry a private datum between scan and modify stage.
>> >>>
>> >>> One problem I noticed is how FDW driver returns an extra field that
>> >>> is in neither system nor regular column.
>> >>> Number of columns and its data type are defined with TupleDesc of
>> >>> the target foreign-table, so we also need a feature to extend it on
>> >>> run-time. For example, FDW driver may have to be able to extend
>> >>> a "virtual" column with cstring data type, even though the target
>> >>> foreign table does not have such a column.
>> >>>
>> >> I tried to investigate the related routines.
>> >>
>> >> TupleDesc of TupleTableSlot associated with ForeignScanState
>> >> is initialized at ExecInitForeignScan as literal.
>> >> ExecAssignScanType assigns TupleDesc of the target foreign-
>> >> table on tts_tupleDescriptor, "as-is".
>> >> It is the reason why IterateForeignScan cannot return a private
>> >> datum except for the columns being declared as regular ones.
>> >>
>> > The attached patch improved its design according to the upthread
>> > discussion. It now got away from ab-use of "ctid" field, and adopts
>> > a concept of pseudo-column to hold row-id with opaque data type
>> > instead.
>> >
>> > Pseudo-column is Var reference towards attribute-number larger
>> > than number of attributes on the target relation; thus, it is not
>> > a substantial object. It is normally unavailable to reference such
>> > a larger attribute number because TupleDesc of each ScanState
>> > associated with a particular relation is initialized at ExecInitNode.
>> >
>> > The patched ExecInitForeignScan was extended to generate its
>> > own TupleDesc including pseudo-column definitions on the fly,
>> > instead of relation's one, when scan-plan of foreign-table requires
>> > to have pseudo-columns.
>> >
>> > Right now, the only possible pseudo-column is "rowid" being
>> > injected at rewriteTargetListUD(). It has no data format
>> > restriction like "ctid" because of VOID data type.
>> > FDW extension can set an appropriate value on the "rowid"
>> > field in addition to contents of regular columns at
>> > IterateForeignScan method, to track which remote row should
>> > be updated or deleted.
>> >
>> > Another possible usage of this pseudo-column is push-down
>> > of target-list including complex calculation. It may enable to
>> > move complex mathematical formula into remote devices
>> > (such as GPU device?) instead of just a reference of Var node.
>> >
>> > This patch adds a new interface: GetForeignRelInfo being invoked
>> > from get_relation_info() to adjust width of RelOptInfo->attr_needed
>> > according to the target-list which may contain "rowid" pseudo-column.
>> > Some FDW extension may use this interface to push-down a part of
>> > target list into remote side, even though I didn't implement this
>> > feature on file_fdw.
>> >
>> > RelOptInfo->max_attr is a good marker whether the plan shall have
>> > pseudo-column reference. Then, ExecInitForeignScan determines
>> > whether it should generate a TupleDesc, or not.
>> >
>> > The "rowid" is fetched using ExecGetJunkAttribute as we are currently
>> > doing on regular tables using "ctid", then it shall be delivered to
>> > ExecUpdate or ExecDelete. We can never expect the fist argument of
>> > them now, so "ItemPointer tupleid" redefined to "Datum rowid", and
>> > argument of BR-trigger routines redefined also.
>> >
>> > [kaigai(at)iwashi sepgsql]$ cat ~/testfile.csv
>> > 10 aaa
>> > 11 bbb
>> > 12 ccc
>> > 13 ddd
>> > 14 eee
>> > 15 fff
>> > [kaigai(at)iwashi sepgsql]$ psql postgres
>> > psql (9.3devel)
>> > Type "help" for help.
>> >
>> > postgres=# UPDATE ftbl SET b = md5(b) WHERE a > 12 RETURNING *;
>> > INFO: ftbl is the target relation of UPDATE
>> > INFO: fdw_file: BeginForeignModify method
>> > INFO: fdw_file: UPDATE (lineno = 4)
>> > INFO: fdw_file: UPDATE (lineno = 5)
>> > INFO: fdw_file: UPDATE (lineno = 6)
>> > INFO: fdw_file: EndForeignModify method
>> > a | b
>> > ----+----------------------------------
>> > 13 | 77963b7a931377ad4ab5ad6a9cd718aa
>> > 14 | d2f2297d6e829cd3493aa7de4416a18f
>> > 15 | 343d9040a671c45832ee5381860e2996
>> > (3 rows)
>> >
>> > UPDATE 3
>> > postgres=# DELETE FROM ftbl WHERE a % 2 = 1 RETURNING *;
>> > INFO: ftbl is the target relation of DELETE
>> > INFO: fdw_file: BeginForeignModify method
>> > INFO: fdw_file: DELETE (lineno = 2)
>> > INFO: fdw_file: DELETE (lineno = 4)
>> > INFO: fdw_file: DELETE (lineno = 6)
>> > INFO: fdw_file: EndForeignModify method
>> > a | b
>> > ----+-----
>> > 11 | bbb
>> > 13 | ddd
>> > 15 | fff
>> > (3 rows)
>> >
>> > DELETE 3
>> >
>> > In addition, there is a small improvement. ExecForeignInsert,
>> > ExecForeignUpdate and ExecForeignDelete get being able
>> > to return number of processed rows; that allows to push-down
>> > whole the statement into remote-side, if it is enough simple
>> > (e.g, delete statement without any condition).
>> >
>> > Even though it does not make matter right now, pseudo-columns
>> > should be adjusted when foreign-table is referenced with table
>> > inheritance feature, because an attribute number being enough
>> > large in parent table is not enough large in child table.
>> > We need to fix up them until foreign table feature got inheritance
>> > capability.
>> >
>> > I didn't update the documentation stuff because I positioned
>> > the state of this patch as proof-of-concept now. Please note that.
>> >
>> A tiny bit of this patch was updated. I noticed INTERNAL data type
>> is more suitable to move a private datum from scan-stage to
>> modify-stage, because its type-length is declared as SIZEOF_POINTER.
>>
>> Also, I fixed up some obvious compiler warnings.
>

Thanks for your comments.

> I've read previous discussion about this patch. It's generally concentrated
> on the question how to identify foreign table row? Your last patch introduce
> "rowid" pseudo-column for foreign table row identification. My notes are
> following:
> 1) AFAICS your patch are designed to support arbitrary number of
> pseudo-columns while only one is currently used. Do you see more use cases
> of pseudo-columns?
>
Good question. Please see the p.39 of my slide at PGconf.EU 2012 (page of
"TargetList push-down")
(*) http://wiki.postgresql.org/wiki/PostgreSQL_Conference_Europe_Talks_2012

In case when targetList contains complex calculation such as mathematical
operation that takes long CPU cycles, this feature will allows to push down
burden of this calculation into foreign computing resource, not only
implementation
to move identifier of modified rows.
If we can move this calculation into remote RDBMS, all the local pgsql needs
to do is just reference a result value in spite of complex calculation.

> 2) You wrote that FDW can support or don't support write depending on having
> corresponding functions. However it's likely some tables of same FDW could
> be writable while another are not. I think we should have some mechanism for
> FDW telling whether particular table is writable.
>
Hmm. It might be a thing to be considered. My preference is, it should
be handled
at callbacks at planner stage. FDW can raise an error according to individual
foreign table configuration, such as existence of primary key and so on.

The resultRelation of Query can tell whether the scan plan is for
modification of
the table, or not.

> 3) I have another point about identification of foreign rows which I didn't
> meet in previous discussion. What if we restrict writable FDW table to have
> set of column which are unique identifier of a row. Many replication systems
> have this restriction: relicated tables should have a unique key. In case of
> text or csv file I don't see why making line number column user visible is
> bad.
>
The file_fdw portion of this patch is just a proof-of-concept, so I don't think
it is commitable enhancement right now. Thus, here is no special reason
why we don't expose the pseudo line number column for users, however,
it it not a fundamental essence of this patch.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, "Kohei KaiGai" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-08 08:05:01
Message-ID:	D960CB61B694CF459DCFB4B0128514C208A4EF7A@exadv11.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Alexander Korotkov wrote:
> 2) You wrote that FDW can support or don't support write depending on
having corresponding functions.
> However it's likely some tables of same FDW could be writable while
another are not. I think we should
> have some mechanism for FDW telling whether particular table is
writable.

I think that this would best be handled by a table option,
if necessary.
That allows maximum flexibility for the design of the FDW.
In many cases it might be enough if the foreign data source
raises an error on a write request.

Yours,
Laurenz Albe

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Alexander Korotkov EXTERN <aekorotkov(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-08 14:53:11
Message-ID:	577511D2-CFC8-4FE7-8066-4FE633904058@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 08-Nov-2012, at 13:35, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> Alexander Korotkov wrote:
>> 2) You wrote that FDW can support or don't support write depending on
> having corresponding functions.
>> However it's likely some tables of same FDW could be writable while
> another are not. I think we should
>> have some mechanism for FDW telling whether particular table is
> writable.
>
> I think that this would best be handled by a table option,
> if necessary.
> That allows maximum flexibility for the design of the FDW.
> In many cases it might be enough if the foreign data source
> raises an error on a write request.
>
> Yours,
> Laurenz Albe
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

I agree, we should have a system where if the foreign data source raises an error on write, FDW can raise corresponding error on PostgreSQL side.exposing this as a table option is IMHO a bit risky, and the user may not know whether the foreign data source will accept writes or not.

Atri

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-15 17:53:46
Message-ID:	CADyhKSWV6mOQ2oD0z1V=HTOY14UCmqqEm0RbDhOS0EWSg9d7bg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

The attached patch is just a refreshed version for clean applying to
the latest tree.

As previous version doing, it makes pseudo enhancement on file_fdw
to print something about the supplied tuple on INSERT, UPDATE and
DELETE statement.
Here is one other idea. My GPU acceleration module (PG-Strom)
implements column-oriented data store underlying foreign table.
It might make sense to cut out this portion for proof-of-concept of
writable foreign tables.

Any ideas?

2012/11/8 Atri Sharma <atri(dot)jiit(at)gmail(dot)com>:
>
>
> On 08-Nov-2012, at 13:35, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:
>
>> Alexander Korotkov wrote:
>>> 2) You wrote that FDW can support or don't support write depending on
>> having corresponding functions.
>>> However it's likely some tables of same FDW could be writable while
>> another are not. I think we should
>>> have some mechanism for FDW telling whether particular table is
>> writable.
>>
>> I think that this would best be handled by a table option,
>> if necessary.
>> That allows maximum flexibility for the design of the FDW.
>> In many cases it might be enough if the foreign data source
>> raises an error on a write request.
>>
>> Yours,
>> Laurenz Albe
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>
> +1
>
> I agree, we should have a system where if the foreign data source raises an error on write, FDW can raise corresponding error on PostgreSQL side.exposing this as a table option is IMHO a bit risky, and the user may not know whether the foreign data source will accept writes or not.
>
> Atri

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v4.patch	application/octet-stream	46.3 KB

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>, "Atri Sharma" <atri(dot)jiit(at)gmail(dot)com>
Cc:	"Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-16 14:50:33
Message-ID:	D960CB61B694CF459DCFB4B0128514C208B87CED@exadv11.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
> The attached patch is just a refreshed version for clean applying to
> the latest tree.
>
> As previous version doing, it makes pseudo enhancement on file_fdw
> to print something about the supplied tuple on INSERT, UPDATE and
> DELETE statement.

Basics:
-------

The patch applies cleanly, compiles without warnings and passes
regression tests.

I think that the functionality is highly desirable; judging from
the number of talks at pgConf EU about SQL/MED this is a hot
topic, and further development is welcome.

Testing the functionality:
--------------------------

I ran a few queries with the file_fdw and found this:

$ cat flatfile
1,Laurenz,1968-10-20
2,Renée,1975-09-03
3,Caroline,2009-01-26
4,Ray,2011-03-09
5,Stephan,2011-03-09

CREATE SERVER file FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE flat(
id integer not null,
name varchar(20) not null,
birthday date not null
) SERVER file
OPTIONS (filename 'flatfile', format 'csv');

UPDATE flat SET name='' FROM flat f WHERE f.id = flat.id and f.name like '%e';
ERROR: bitmapset is empty

About the code:
---------------

I am not so happy with GetForeignRelInfo:
- The name seems ill-chosen from the FDW API side.
I guess that you chose the name because the function
is called from get_relation_info, but I think the name
should be more descriptive for the FDW API.
Something like PlanForeignRelRowid.

- I guess that every FDW that only needs "rowid" will
do exactly the same as your fileGetForeignRelInfo.
Why can't that be done in core?
The function could pass an AttrNumber for the rowid
to the FDW, and will receive a boolean return code
depending on whether the FDW plans to use rowid or not.
That would be more convenient for FDW authors.

- I guess the order is dictated by planner steps, but
it would be "nice to have" if GetForeignRelInfo were
not the first function to be called during planning.
That would make it easier for a FDW to support both
9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
will probably have to be created in the first API
function.

I also wonder why you changed the signature of functions in
trigger.c. It changes an exposed API unnecessarily, unless
you plan to have trigger support for foreign tables.
That is currently not supported, and I think that such
changes should be part of the present patch.

Other than that, I like the patch.
I cannot give an educated opinion if the changes to planner
and executor are well done or not.

Other comments:
---------------

I hope I find time to implement the API in oracle_fdw to
be able to test it more thoroughly.

There is an interdependence with the "FDW for PostgreSQL" patch
in the commitfest. If both these patches get committed, the
FDW should be extended to support the new API. If nothing else,
this would greatly improve the ability to test the new API
and find out if it is well designed.

> Here is one other idea. My GPU acceleration module (PG-Strom)
> implements column-oriented data store underlying foreign table.
> It might make sense to cut out this portion for proof-of-concept of
> writable foreign tables.
>
> Any ideas?

The best would be to have a patch on top of the PostgreSQL FDW
to be able to test. It need not be perfect.

Yours,
Laurenz Albe

From:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Kohei KaiGai EXTERN <kaigai(at)kaigai(dot)gr(dot)jp>, Alexander Korotkov EXTERN <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-16 17:04:43
Message-ID:	D9313038-1C1B-4E34-BA1B-42BECC14268B@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Awesome.

I would love to implement this API in JDBC_FDW.

Atri

Sent from my iPad

On 16-Nov-2012, at 20:20, "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at> wrote:

> Kohei KaiGai wrote:
>> The attached patch is just a refreshed version for clean applying to
>> the latest tree.
>>
>> As previous version doing, it makes pseudo enhancement on file_fdw
>> to print something about the supplied tuple on INSERT, UPDATE and
>> DELETE statement.
>
> Basics:
> -------
>
> The patch applies cleanly, compiles without warnings and passes
> regression tests.
>
> I think that the functionality is highly desirable; judging from
> the number of talks at pgConf EU about SQL/MED this is a hot
> topic, and further development is welcome.
>
> Testing the functionality:
> --------------------------
>
> I ran a few queries with the file_fdw and found this:
>
> $ cat flatfile
> 1,Laurenz,1968-10-20
> 2,Renée,1975-09-03
> 3,Caroline,2009-01-26
> 4,Ray,2011-03-09
> 5,Stephan,2011-03-09
>
> CREATE SERVER file FOREIGN DATA WRAPPER file_fdw;
>
> CREATE FOREIGN TABLE flat(
> id integer not null,
> name varchar(20) not null,
> birthday date not null
> ) SERVER file
> OPTIONS (filename 'flatfile', format 'csv');
>
> UPDATE flat SET name='' FROM flat f WHERE f.id = flat.id and f.name like '%e';
> ERROR: bitmapset is empty
>
> About the code:
> ---------------
>
> I am not so happy with GetForeignRelInfo:
> - The name seems ill-chosen from the FDW API side.
> I guess that you chose the name because the function
> is called from get_relation_info, but I think the name
> should be more descriptive for the FDW API.
> Something like PlanForeignRelRowid.
>
> - I guess that every FDW that only needs "rowid" will
> do exactly the same as your fileGetForeignRelInfo.
> Why can't that be done in core?
> The function could pass an AttrNumber for the rowid
> to the FDW, and will receive a boolean return code
> depending on whether the FDW plans to use rowid or not.
> That would be more convenient for FDW authors.
>
> - I guess the order is dictated by planner steps, but
> it would be "nice to have" if GetForeignRelInfo were
> not the first function to be called during planning.
> That would make it easier for a FDW to support both
> 9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
> will probably have to be created in the first API
> function.
>
> I also wonder why you changed the signature of functions in
> trigger.c. It changes an exposed API unnecessarily, unless
> you plan to have trigger support for foreign tables.
> That is currently not supported, and I think that such
> changes should be part of the present patch.
>
> Other than that, I like the patch.
> I cannot give an educated opinion if the changes to planner
> and executor are well done or not.
>
> Other comments:
> ---------------
>
> I hope I find time to implement the API in oracle_fdw to
> be able to test it more thoroughly.
>
> There is an interdependence with the "FDW for PostgreSQL" patch
> in the commitfest. If both these patches get committed, the
> FDW should be extended to support the new API. If nothing else,
> this would greatly improve the ability to test the new API
> and find out if it is well designed.
>
>> Here is one other idea. My GPU acceleration module (PG-Strom)
>> implements column-oriented data store underlying foreign table.
>> It might make sense to cut out this portion for proof-of-concept of
>> writable foreign tables.
>>
>> Any ideas?
>
> The best would be to have a patch on top of the PostgreSQL FDW
> to be able to test. It need not be perfect.
>
> Yours,
> Laurenz Albe

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-16 20:09:05
Message-ID:	CADyhKSVPY8imSVo9frV8wm2v5gh+e_Ddq-zWpNd7Y1dPezH1LA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/11/16 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>> The attached patch is just a refreshed version for clean applying to
>> the latest tree.
>>
>> As previous version doing, it makes pseudo enhancement on file_fdw
>> to print something about the supplied tuple on INSERT, UPDATE and
>> DELETE statement.
>
> Basics:
> -------
>
> The patch applies cleanly, compiles without warnings and passes
> regression tests.
>
> I think that the functionality is highly desirable; judging from
> the number of talks at pgConf EU about SQL/MED this is a hot
> topic, and further development is welcome.
>
> Testing the functionality:
> --------------------------
>
> I ran a few queries with the file_fdw and found this:
>
> $ cat flatfile
> 1,Laurenz,1968-10-20
> 2,Renée,1975-09-03
> 3,Caroline,2009-01-26
> 4,Ray,2011-03-09
> 5,Stephan,2011-03-09
>
> CREATE SERVER file FOREIGN DATA WRAPPER file_fdw;
>
> CREATE FOREIGN TABLE flat(
> id integer not null,
> name varchar(20) not null,
> birthday date not null
> ) SERVER file
> OPTIONS (filename 'flatfile', format 'csv');
>
> UPDATE flat SET name='' FROM flat f WHERE f.id = flat.id and f.name like '%e';
> ERROR: bitmapset is empty
>
Hmm... I'll try to investigate the behavior.

> About the code:
> ---------------
>
> I am not so happy with GetForeignRelInfo:
> - The name seems ill-chosen from the FDW API side.
> I guess that you chose the name because the function
> is called from get_relation_info, but I think the name
> should be more descriptive for the FDW API.
> Something like PlanForeignRelRowid.
>
Indeed, GetForeignRelInfo might give misleading impression
as if this routine collects widespread information about
target relation. So, how about GetForeignRelWidth() instead?

> - I guess that every FDW that only needs "rowid" will
> do exactly the same as your fileGetForeignRelInfo.
> Why can't that be done in core?
> The function could pass an AttrNumber for the rowid
> to the FDW, and will receive a boolean return code
> depending on whether the FDW plans to use rowid or not.
> That would be more convenient for FDW authors.
>
This design tries to kill two-birds with one-stone.
It enables to add multiple number of pseudo-columns,
not only "rowid", and makes possible to push-down
complex calculation of target list into external computing
resource.

For example, when user gives the following query:

SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable

it contains a complex calculation in the target-list,
thus, it also takes CPU cycles of local process.

If we can replace the "((c1 - c2) * (c2 - c3))^2" by
a reference to a pseudo-column that also references
the calculation result on external node, it effectively
off-load CPU cycles.

In this case, all we need to do is (1) acquire a slot
for pseudo-column at GetForeignRelInfo (2) replace
TargetEntry::expr by Var node that reference this
pseudo-column.

It makes sense for performance optimization, so I don't
want to restrict this handler for "rowid" only.

> - I guess the order is dictated by planner steps, but
> it would be "nice to have" if GetForeignRelInfo were
> not the first function to be called during planning.
> That would make it easier for a FDW to support both
> 9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
> will probably have to be created in the first API
> function.
>
The baserel->fdw_private should be initialized to NULL,
so it can perform as a mark whether private data is already
constructed, or not.

> I also wonder why you changed the signature of functions in
> trigger.c. It changes an exposed API unnecessarily, unless
> you plan to have trigger support for foreign tables.
> That is currently not supported, and I think that such
> changes should be part of the present patch.
>
You are right. It might be unneeded change right now.
I'll revert it.

> Other than that, I like the patch.
> I cannot give an educated opinion if the changes to planner
> and executor are well done or not.
>
> Other comments:
> ---------------
>
> I hope I find time to implement the API in oracle_fdw to
> be able to test it more thoroughly.
>
> There is an interdependence with the "FDW for PostgreSQL" patch
> in the commitfest. If both these patches get committed, the
> FDW should be extended to support the new API. If nothing else,
> this would greatly improve the ability to test the new API
> and find out if it is well designed.
>
It is the reason why I'd like to volunteer to review Hanada-san's patch.
I'll spent my time for reviewing his patch on this weekend.

>> Here is one other idea. My GPU acceleration module (PG-Strom)
>> implements column-oriented data store underlying foreign table.
>> It might make sense to cut out this portion for proof-of-concept of
>> writable foreign tables.
>>
>> Any ideas?
>
> The best would be to have a patch on top of the PostgreSQL FDW
> to be able to test. It need not be perfect.
>
OK,

In addition, I noticed my patch didn't update documentation stuff.
I also add mention about new handlers.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	"Atri Sharma" <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-19 15:28:04
Message-ID:	D960CB61B694CF459DCFB4B0128514C208B87FFC@exadv11.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
>> I am not so happy with GetForeignRelInfo:
>> - The name seems ill-chosen from the FDW API side.
>> I guess that you chose the name because the function
>> is called from get_relation_info, but I think the name
>> should be more descriptive for the FDW API.
>> Something like PlanForeignRelRowid.
>
> Indeed, GetForeignRelInfo might give misleading impression
> as if this routine collects widespread information about
> target relation. So, how about GetForeignRelWidth() instead?

That would be better for the function as it is now.

>> - I guess that every FDW that only needs "rowid" will
>> do exactly the same as your fileGetForeignRelInfo.
>> Why can't that be done in core?
>> The function could pass an AttrNumber for the rowid
>> to the FDW, and will receive a boolean return code
>> depending on whether the FDW plans to use rowid or not.
>> That would be more convenient for FDW authors.
>
> This design tries to kill two-birds with one-stone.
> It enables to add multiple number of pseudo-columns,
> not only "rowid", and makes possible to push-down
> complex calculation of target list into external computing
> resource.
>
> For example, when user gives the following query:
>
> SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable
>
> it contains a complex calculation in the target-list,
> thus, it also takes CPU cycles of local process.
>
> If we can replace the "((c1 - c2) * (c2 - c3))^2" by
> a reference to a pseudo-column that also references
> the calculation result on external node, it effectively
> off-load CPU cycles.
>
> In this case, all we need to do is (1) acquire a slot
> for pseudo-column at GetForeignRelInfo (2) replace
> TargetEntry::expr by Var node that reference this
> pseudo-column.
>
> It makes sense for performance optimization, so I don't
> want to restrict this handler for "rowid" only.

I understand.

But I think that you still can do that with the change that
I suggest. I suggest that GetForeignRelInfo (or whatever the
name ends up being) gets the AttrNumber of the proposed "rowid"
column in addition to the parameters you need for what
you want to do.

Then nothing would keep you from defining those
pseudo-columns. But all the setup necessary for the "rowid"
column could be moved out of the FDW. So for the 99% of all
FDW which are only interested in the rowid, things would
get much simpler and they don't all have to implement the
same code.

Did I make clear what I mean?
Would that be difficult?

>> - I guess the order is dictated by planner steps, but
>> it would be "nice to have" if GetForeignRelInfo were
>> not the first function to be called during planning.
>> That would make it easier for a FDW to support both
>> 9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
>> will probably have to be created in the first API
>> function.
>
> The baserel->fdw_private should be initialized to NULL,
> so it can perform as a mark whether private data is already
> constructed, or not.

Right, if that pointer is pre-initialized to NULL, that
should work. Forget my quibble.

> In addition, I noticed my patch didn't update documentation stuff.
> I also add mention about new handlers.

I didn't get into documentation, comment and spelling issues since
the patch was still called POC, but yes, eventually that would
be necessary.

Yours,
Laurenz Albe

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-19 17:07:30
Message-ID:	CADyhKSXdG7jPP_umGZjv1SD_Und9Xmn+6wCy=LphFKoHi=VW0g@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/11/19 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>>> I am not so happy with GetForeignRelInfo:
>>> - The name seems ill-chosen from the FDW API side.
>>> I guess that you chose the name because the function
>>> is called from get_relation_info, but I think the name
>>> should be more descriptive for the FDW API.
>>> Something like PlanForeignRelRowid.
>>
>> Indeed, GetForeignRelInfo might give misleading impression
>> as if this routine collects widespread information about
>> target relation. So, how about GetForeignRelWidth() instead?
>
> That would be better for the function as it is now.
>
>>> - I guess that every FDW that only needs "rowid" will
>>> do exactly the same as your fileGetForeignRelInfo.
>>> Why can't that be done in core?
>>> The function could pass an AttrNumber for the rowid
>>> to the FDW, and will receive a boolean return code
>>> depending on whether the FDW plans to use rowid or not.
>>> That would be more convenient for FDW authors.
>>
>> This design tries to kill two-birds with one-stone.
>> It enables to add multiple number of pseudo-columns,
>> not only "rowid", and makes possible to push-down
>> complex calculation of target list into external computing
>> resource.
>>
>> For example, when user gives the following query:
>>
>> SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable
>>
>> it contains a complex calculation in the target-list,
>> thus, it also takes CPU cycles of local process.
>>
>> If we can replace the "((c1 - c2) * (c2 - c3))^2" by
>> a reference to a pseudo-column that also references
>> the calculation result on external node, it effectively
>> off-load CPU cycles.
>>
>> In this case, all we need to do is (1) acquire a slot
>> for pseudo-column at GetForeignRelInfo (2) replace
>> TargetEntry::expr by Var node that reference this
>> pseudo-column.
>>
>> It makes sense for performance optimization, so I don't
>> want to restrict this handler for "rowid" only.
>
> I understand.
>
> But I think that you still can do that with the change that
> I suggest. I suggest that GetForeignRelInfo (or whatever the
> name ends up being) gets the AttrNumber of the proposed "rowid"
> column in addition to the parameters you need for what
> you want to do.
>
> Then nothing would keep you from defining those
> pseudo-columns. But all the setup necessary for the "rowid"
> column could be moved out of the FDW. So for the 99% of all
> FDW which are only interested in the rowid, things would
> get much simpler and they don't all have to implement the
> same code.
>
> Did I make clear what I mean?
> Would that be difficult?
>
All we have to do at get_relation_info() to deal with pseudo-
columns (including alternatives of complex calculation, not
only "rowid") is just expansion of rel->max_attr.
So, if FDW is not interested in something except for "rowid",
it can just inform the caller "Yeah, we need just one slot for
a pseudo-column of rowid". Otherwise, it can return another
value to acquire the slot for arbitrary pseudo-column.
I don't think it is a problematic design.

However, I'm skeptical 99% of FDWs don't support target-list
push-down. At least, it was very desired feature when I had
a talk at PGconf.EU last month. :-)

So, if we rename it to GetForeignRelWidth, is it defined as
follows?

extern AttrNumber
GetForeignRelWidth(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent,
List *targetList);

Right now, inhparent makes no sense because foreign table
does not support table inheritance, but it seems to me we
shall have this functionality near future.

>>> - I guess the order is dictated by planner steps, but
>>> it would be "nice to have" if GetForeignRelInfo were
>>> not the first function to be called during planning.
>>> That would make it easier for a FDW to support both
>>> 9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
>>> will probably have to be created in the first API
>>> function.
>>
>> The baserel->fdw_private should be initialized to NULL,
>> so it can perform as a mark whether private data is already
>> constructed, or not.
>
> Right, if that pointer is pre-initialized to NULL, that
> should work. Forget my quibble.
>
>> In addition, I noticed my patch didn't update documentation stuff.
>> I also add mention about new handlers.
>
> I didn't get into documentation, comment and spelling issues since
> the patch was still called POC, but yes, eventually that would
> be necessary.
>
> Yours,
> Laurenz Albe

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	"Atri Sharma" <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-20 08:53:11
Message-ID:	D960CB61B694CF459DCFB4B0128514C208B880DE@exadv11.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
>>> This design tries to kill two-birds with one-stone.
>>> It enables to add multiple number of pseudo-columns,
>>> not only "rowid", and makes possible to push-down
>>> complex calculation of target list into external computing
>>> resource.
>>>
>>> For example, when user gives the following query:
>>>
>>> SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable
>>>
>>> it contains a complex calculation in the target-list,
>>> thus, it also takes CPU cycles of local process.
>>>
>>> If we can replace the "((c1 - c2) * (c2 - c3))^2" by
>>> a reference to a pseudo-column that also references
>>> the calculation result on external node, it effectively
>>> off-load CPU cycles.
>>>
>>> In this case, all we need to do is (1) acquire a slot
>>> for pseudo-column at GetForeignRelInfo (2) replace
>>> TargetEntry::expr by Var node that reference this
>>> pseudo-column.
>>>
>>> It makes sense for performance optimization, so I don't
>>> want to restrict this handler for "rowid" only.

>> I understand.
>>
>> But I think that you still can do that with the change that
>> I suggest. I suggest that GetForeignRelInfo (or whatever the
>> name ends up being) gets the AttrNumber of the proposed "rowid"
>> column in addition to the parameters you need for what
>> you want to do.
>>
>> Then nothing would keep you from defining those
>> pseudo-columns. But all the setup necessary for the "rowid"
>> column could be moved out of the FDW. So for the 99% of all
>> FDW which are only interested in the rowid, things would
>> get much simpler and they don't all have to implement the
>> same code.

> All we have to do at get_relation_info() to deal with pseudo-
> columns (including alternatives of complex calculation, not
> only "rowid") is just expansion of rel->max_attr.
> So, if FDW is not interested in something except for "rowid",
> it can just inform the caller "Yeah, we need just one slot for
> a pseudo-column of rowid". Otherwise, it can return another
> value to acquire the slot for arbitrary pseudo-column.
> I don't think it is a problematic design.
>
> However, I'm skeptical 99% of FDWs don't support target-list
> push-down. At least, it was very desired feature when I had
> a talk at PGconf.EU last month. :-)

I agree that PostgreSQL should make this technique possible.

My idea should not make this any more difficult.

> So, if we rename it to GetForeignRelWidth, is it defined as
> follows?
>
> extern AttrNumber
> GetForeignRelWidth(PlannerInfo *root,
> RelOptInfo *baserel,
> Oid foreigntableid,
> bool inhparent,
> List *targetList);
>
> Right now, inhparent makes no sense because foreign table
> does not support table inheritance, but it seems to me we
> shall have this functionality near future.

I am thinking of this declaration:

extern bool
GetForeignRelWidth(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent,
List *targetList,
AttrNumber rowidAttr);

Let me illustrate my idea with some code.

Here's your fileGetForeignRelInfo:

static void
fileGetForeignRelInfo(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent, List *targetList)
{
FileFdwPlanState *fdw_private;
ListCell *cell;

fdw_private = (FileFdwPlanState *)
palloc0(sizeof(FileFdwPlanState));

foreach(cell, targetList)
{
TargetEntry *tle = lfirst(cell);

if (tle->resjunk &&
tle->resname && strcmp(tle->resname, "rowid")==0)
{
Bitmapset *temp = NULL;
AttrNumber anum_rowid;
DefElem *defel;

pull_varattnos((Node *)tle, baserel->relid, &temp);
anum_rowid = bms_singleton_member(temp)
+ FirstLowInvalidHeapAttributeNumber;
/* adjust attr_needed of baserel */
if (anum_rowid > baserel->max_attr)
baserel->max_attr = anum_rowid;
defel = makeDefElem("anum_rowid",
(Node *)makeInteger(anum_rowid));
fdw_private->anum_rowid = defel;
}
}
baserel->fdw_private = fdw_private;
}

I hope that this can be reduced to:

static bool
fileGetForeignRelRowid(PlannerInfo *root,
RelOptInfo *baserel,
Oid foreigntableid,
bool inhparent,
List *targetList,
AttrNumber rowidAttr)
{
FileFdwPlanState *fdw_private;
fdw_private = (FileFdwPlanState *)
palloc0(sizeof(FileFdwPlanState));

defel = makeDefElem("anum_rowid",
(Node *)makeInteger(rowidAttr));
fdw_private->anum_rowid = defel;

baserel->fdw_private = fdw_private;

return true; /* we'll use rowid, so please extend baserel->max_attr
*/
}

That wouldn't mean that the FDW cannot define any other
pseudo-columns in this function, just the case of rowid
would be simplified.

Does that make any sense?

Yours,
Laurenz Albe

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-20 14:05:21
Message-ID:	CADyhKSV68j_4SiQpM0wW4w+Vf64iBofEj1oD-autSuesHx6DQg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/11/20 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>>>> This design tries to kill two-birds with one-stone.
>>>> It enables to add multiple number of pseudo-columns,
>>>> not only "rowid", and makes possible to push-down
>>>> complex calculation of target list into external computing
>>>> resource.
>>>>
>>>> For example, when user gives the following query:
>>>>
>>>> SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable
>>>>
>>>> it contains a complex calculation in the target-list,
>>>> thus, it also takes CPU cycles of local process.
>>>>
>>>> If we can replace the "((c1 - c2) * (c2 - c3))^2" by
>>>> a reference to a pseudo-column that also references
>>>> the calculation result on external node, it effectively
>>>> off-load CPU cycles.
>>>>
>>>> In this case, all we need to do is (1) acquire a slot
>>>> for pseudo-column at GetForeignRelInfo (2) replace
>>>> TargetEntry::expr by Var node that reference this
>>>> pseudo-column.
>>>>
>>>> It makes sense for performance optimization, so I don't
>>>> want to restrict this handler for "rowid" only.
>
>>> I understand.
>>>
>>> But I think that you still can do that with the change that
>>> I suggest. I suggest that GetForeignRelInfo (or whatever the
>>> name ends up being) gets the AttrNumber of the proposed "rowid"
>>> column in addition to the parameters you need for what
>>> you want to do.
>>>
>>> Then nothing would keep you from defining those
>>> pseudo-columns. But all the setup necessary for the "rowid"
>>> column could be moved out of the FDW. So for the 99% of all
>>> FDW which are only interested in the rowid, things would
>>> get much simpler and they don't all have to implement the
>>> same code.
>
>> All we have to do at get_relation_info() to deal with pseudo-
>> columns (including alternatives of complex calculation, not
>> only "rowid") is just expansion of rel->max_attr.
>> So, if FDW is not interested in something except for "rowid",
>> it can just inform the caller "Yeah, we need just one slot for
>> a pseudo-column of rowid". Otherwise, it can return another
>> value to acquire the slot for arbitrary pseudo-column.
>> I don't think it is a problematic design.
>>
>> However, I'm skeptical 99% of FDWs don't support target-list
>> push-down. At least, it was very desired feature when I had
>> a talk at PGconf.EU last month. :-)
>
> I agree that PostgreSQL should make this technique possible.
>
> My idea should not make this any more difficult.
>
>> So, if we rename it to GetForeignRelWidth, is it defined as
>> follows?
>>
>> extern AttrNumber
>> GetForeignRelWidth(PlannerInfo *root,
>> RelOptInfo *baserel,
>> Oid foreigntableid,
>> bool inhparent,
>> List *targetList);
>>
>> Right now, inhparent makes no sense because foreign table
>> does not support table inheritance, but it seems to me we
>> shall have this functionality near future.
>
> I am thinking of this declaration:
>
> extern bool
> GetForeignRelWidth(PlannerInfo *root,
> RelOptInfo *baserel,
> Oid foreigntableid,
> bool inhparent,
> List *targetList,
> AttrNumber rowidAttr);
>
> Let me illustrate my idea with some code.
>
> Here's your fileGetForeignRelInfo:
>
>
> static void
> fileGetForeignRelInfo(PlannerInfo *root,
> RelOptInfo *baserel,
> Oid foreigntableid,
> bool inhparent, List *targetList)
> {
> FileFdwPlanState *fdw_private;
> ListCell *cell;
>
> fdw_private = (FileFdwPlanState *)
> palloc0(sizeof(FileFdwPlanState));
>
> foreach(cell, targetList)
> {
> TargetEntry *tle = lfirst(cell);
>
> if (tle->resjunk &&
> tle->resname && strcmp(tle->resname, "rowid")==0)
> {
> Bitmapset *temp = NULL;
> AttrNumber anum_rowid;
> DefElem *defel;
>
> pull_varattnos((Node *)tle, baserel->relid, &temp);
> anum_rowid = bms_singleton_member(temp)
> + FirstLowInvalidHeapAttributeNumber;
> /* adjust attr_needed of baserel */
> if (anum_rowid > baserel->max_attr)
> baserel->max_attr = anum_rowid;
> defel = makeDefElem("anum_rowid",
> (Node *)makeInteger(anum_rowid));
> fdw_private->anum_rowid = defel;
> }
> }
> baserel->fdw_private = fdw_private;
> }
>
>
> I hope that this can be reduced to:
>
>
> static bool
> fileGetForeignRelRowid(PlannerInfo *root,
> RelOptInfo *baserel,
> Oid foreigntableid,
> bool inhparent,
> List *targetList,
> AttrNumber rowidAttr)
> {
> FileFdwPlanState *fdw_private;
> fdw_private = (FileFdwPlanState *)
> palloc0(sizeof(FileFdwPlanState));
>
> defel = makeDefElem("anum_rowid",
> (Node *)makeInteger(rowidAttr));
> fdw_private->anum_rowid = defel;
>
> baserel->fdw_private = fdw_private;
>
> return true; /* we'll use rowid, so please extend baserel->max_attr
> */
> }
>
>
> That wouldn't mean that the FDW cannot define any other
> pseudo-columns in this function, just the case of rowid
> would be simplified.
>
> Does that make any sense?
>
I think "bool" is not suitable data type to inform how many pseudo-columns
are needed for this scan on foreign-table.

Probably, it is helpful to provide a helper function that fetches an attribute-
number of pseudo "rowid" column from the supplied targetlist.
If we have GetPseudoRowidColumn() at foreign/foreign.c, the avove
routine can be rewritten as:

static AttrNumber
fileGetForeignRelWidth(PlannerInfo *root,
RelOptInfo *baserel,
Relation foreignrel,
bool inhparent, List *targetList)
{
FileFdwPlanState *fdw_private;
AttrNumber nattrs = RelationGetNumberOfAttributes(foreignrel);
AttrNumber anum_rowid;

fdw_private = palloc0(sizeof(FileFdwPlanState));
anum_rowid = GetPseudoRowidColumn(..., targetList);
if (anum_rowid > 0)
{
Assert(anum_rowid > nattrs);
fdw_private->anum_rowid
= makeDefElem("anum_rowid", (Node *)makeInteger(anum_rowid));
nattrs = anum_rowid;
}
baserel->fdw_private = fdw_private;

return nattrs;
}

In case when FDW drive wants to push-down other target entry into foreign-
side, thus, it needs multiple pseudo-columns, it is decision of the extension.
In addition, it does not take API change in the future, if some more additional
pseudo-column is required by some other new features.

How about your opinion?

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	"Atri Sharma" <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-11-20 15:11:28
Message-ID:	D960CB61B694CF459DCFB4B0128514C208B882AF@exadv11.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
> Probably, it is helpful to provide a helper function that fetches an
attribute-
> number of pseudo "rowid" column from the supplied targetlist.
> If we have GetPseudoRowidColumn() at foreign/foreign.c, the avove
> routine can be rewritten as:
>
> static AttrNumber
> fileGetForeignRelWidth(PlannerInfo *root,
> RelOptInfo *baserel,
> Relation foreignrel,
> bool inhparent, List
*targetList)
> {
> FileFdwPlanState *fdw_private;
> AttrNumber nattrs = RelationGetNumberOfAttributes(foreignrel);
> AttrNumber anum_rowid;
>
> fdw_private = palloc0(sizeof(FileFdwPlanState));
> anum_rowid = GetPseudoRowidColumn(..., targetList);
> if (anum_rowid > 0)
> {
> Assert(anum_rowid > nattrs);
> fdw_private->anum_rowid
> = makeDefElem("anum_rowid", (Node
*)makeInteger(anum_rowid));
> nattrs = anum_rowid;
> }
> baserel->fdw_private = fdw_private;
>
> return nattrs;
> }
>
> In case when FDW drive wants to push-down other target entry into
foreign-
> side, thus, it needs multiple pseudo-columns, it is decision of the
extension.
> In addition, it does not take API change in the future, if some more
additional
> pseudo-column is required by some other new features.
>
> How about your opinion?

I think that this is better than what I suggested.

Yours,
Laurenz Albe

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-03 12:39:21
Message-ID:	CADyhKSVrNV_c5rrXym776wTxi8Ff4NA-MNQ9R3A3WpRekMrJiA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

The attached patch is revised version.

One most difference from the previous version is, it constructed
PoC features on Hanada-san's latest postgres-fdw.v5.patch.
Yesh, it looks to me working fine on RDBMS backend also.

Even though the filename of this patch contains "poc" phrase,
I think it may be time to consider adoption of the core regarding
to the interface portion.
(Of course, postgres_fdw is still works in progress.)

Here is a few operation examples.

postgres=# CREATE FOREIGN TABLE tbl (a int, b text, c date) SERVER loopback;
CREATE FOREIGN TABLE
postgres=# SELECT * FROM tbl;
a | b | c
---+-----+------------
1 | aaa | 2012-12-01
2 | bbb | 2012-12-02
3 | ccc | 2012-12-03
4 | ddd | 2012-12-04
5 | eee | 2012-12-05
6 | fff | 2012-12-06
(6 rows)

postgres=# UPDATE tbl SET b = b || b WHERE a % 2 = 1;
UPDATE 3
postgres=# SELECT * FROM tbl ORDER BY a;
a | b | c
---+--------+------------
1 | aaaaaa | 2012-12-01
2 | bbb | 2012-12-02
3 | cccccc | 2012-12-03
4 | ddd | 2012-12-04
5 | eeeeee | 2012-12-05
6 | fff | 2012-12-06
(6 rows)

postgres=# INSERT INTO tbl VALUES (7,'ggg'),(8,'hhh');
INSERT 0 2
postgres=# SELECT * FROM tbl ORDER BY a;
a | b | c
---+--------+------------
1 | aaaaaa | 2012-12-01
2 | bbb | 2012-12-02
3 | cccccc | 2012-12-03
4 | ddd | 2012-12-04
5 | eeeeee | 2012-12-05
6 | fff | 2012-12-06
7 | ggg |
8 | hhh |
(8 rows)

postgres=# DELETE FROM tbl WHERE a % 2 = 0;
DELETE 4
postgres=# SELECT * FROM tbl ORDER BY a;
a | b | c
---+--------+------------
1 | aaaaaa | 2012-12-01
3 | cccccc | 2012-12-03
5 | eeeeee | 2012-12-05
7 | ggg |
(4 rows)

Even though it still has restriction of transaction control on remote side,
I believe Hanada-san will provide more graceful implementation next to
the first read-only version getting committed.

So, let's back to the main topic of this patch.
According to the upthread discussion, I renamed the interface to inform
expected width of result set as follows:

+typedef AttrNumber (*GetForeignRelWidth_function) (PlannerInfo *root,
+ RelOptInfo *baserel,
+ Relation foreignrel,
+ bool inhparent,
+ List *targetList);

It informs the core how many slots for regular and pseudo columns shall
be acquired. If it is identical with number of attributed in foreign table
definition, it also means this scan does not use any pseudo columns.
A typical use case of pseudo column is "rowid" to move an identifier of
remote row from scan stage to modify stage. It is responsibility of FDW
driver to ensure "rowid" has uniqueness on the remote side; my
enhancement on postgres_fdw uses ctid.

get_pseudo_rowid_column() is a utility function that picks up an attribute
number of pseudo "rowid" column if query rewriter injected on previous
stage. If FDW does not support any other pseudo column features, the
value to be returned is just return-value of this function.

Other relevant APIs are as follows:

+typedef List *(*PlanForeignModify_function) (PlannerInfo *root,
+ ModifyTable *plan,
+ Index resultRelation,
+ Plan *subplan);
+
I newly added this handler on construction of ModifyTable structure.
Because INSERT command does not have scan stage directly connected
with table modification, FDW driver has no chance to construct its private
stuff relevant to table modification. (In case postgres_fdw, it constructs
the second query to modify remote table with/without given ctid.)
Its returned List * value is moved to BeginForeignModify handler as
third argument.

+typedef void (*BeginForeignModify_function) (ModifyTableState *mtstate,
+ ResultRelInfo *resultRelInfo,
+ List *fdw_private,
+ Plan *subplan,
+ int eflags);
I adjusted some arguments to reference fdw_private being constructed
on query plan stage. The role of this handler is not changed. FDW driver
should have all the initialization stuff on this handler, like we are doing at
BeginForeignScan.

+typedef int (*ExecForeignInsert_function) (ResultRelInfo *resultRelInfo,
+ HeapTuple tuple);
+typedef int (*ExecForeignDelete_function) (ResultRelInfo *resultRelInfo,
+ Datum rowid);
+typedef int (*ExecForeignUpdate_function) (ResultRelInfo *resultRelInfo,
+ Datum rowid,
+ HeapTuple tuple);
These are not changed.

+typedef void (*EndForeignModify_function) (ResultRelInfo *resultRelInfo);

Also, it is not changed.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v5.patch	application/octet-stream	158.7 KB

From:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Alexander Korotkov EXTERN <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-07 11:24:56
Message-ID:	A737B7A37273E048B164557ADEF4A58B05785FFE@ntex2010i.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
> The attached patch is revised version.
>
> One most difference from the previous version is, it constructed
> PoC features on Hanada-san's latest postgres-fdw.v5.patch.
> Yesh, it looks to me working fine on RDBMS backend also.

Cool.

> Even though the filename of this patch contains "poc" phrase,
> I think it may be time to consider adoption of the core regarding
> to the interface portion.
> (Of course, postgres_fdw is still works in progress.)

Ok, I'll try to review it with regard to that.

> Here is a few operation examples.
[...]
> postgres=# INSERT INTO tbl VALUES (7,'ggg'),(8,'hhh');
> INSERT 0 2

Weird, that fails for me.

CREATE TABLE test(
id integer PRIMARY KEY,
val text NOT NULL
);

CREATE FOREIGN TABLE rtest(
id integer not null,
val text not null
) SERVER loopback OPTIONS (nspname 'laurenz', relname 'test');

test=> INSERT INTO test(id, val) VALUES (1, 'one');
INSERT 0 1
test=> INSERT INTO rtest(id, val) VALUES (2, 'two');
The connection to the server was lost. Attempting reset: Failed.
!>

Here is the stack trace:
317 RangeTblEntry *rte = root->simple_rte_array[rtindex];
#0 0x00231cb9 in deparseInsertSql (buf=0xbfffafb0, root=0x9be3614, rtindex=1) at deparse.c:317
#1 0x0023018c in postgresPlanForeignModify (root=0x9be3614, plan=0x9be3278, resultRelaion=1, subplan=0x9be3bec)
at postgres_fdw.c:1538
#2 0x082a3ac2 in make_modifytable (root=0x9be3614, operation=CMD_INSERT, canSetTag=1 '\001', resultRelations=0x9be3c7c,
subplans=0x9be3c4c, returningLists=0x0, rowMarks=0x0, epqParam=0) at createplan.c:4787
#3 0x082a7ada in subquery_planner (glob=0x9be357c, parse=0x9be3304, parent_root=0x0, hasRecursion=0 '\0', tuple_fraction=0,
subroot=0xbfffb0ec) at planner.c:574
#4 0x082a71dc in standard_planner (parse=0x9be3304, cursorOptions=0, boundParams=0x0) at planner.c:200
#5 0x082a707b in planner (parse=0x9be3304, cursorOptions=0, boundParams=0x0) at planner.c:129
#6 0x0832c14c in pg_plan_query (querytree=0x9be3304, cursorOptions=0, boundParams=0x0) at postgres.c:753
#7 0x0832c1ec in pg_plan_queries (querytrees=0x9be3a98, cursorOptions=0, boundParams=0x0) at postgres.c:812
#8 0x0832c46e in exec_simple_query (query_string=0x9be267c "INSERT INTO rtest(id, val) VALUES (2, 'two');") at postgres.c:977

(gdb) print root->simple_rte_array
$1 = (RangeTblEntry **) 0x0

The bug I reported in my last review does not seem to be fixed, either.
The symptoms are different now (with the definitions from above):

test=> UPDATE rtest SET val='new' FROM rtest t2 WHERE rtest.id=t2.id AND t2.val LIKE '%e';
TRAP: FailedAssertion("!(baserel->relid == var->varno)", File: "foreign.c", Line: 601)
The connection to the server was lost. Attempting reset: Failed.
!>

The same happens for DELETE ... USING.

A different failure happens if I join with a local table:

test=> UPDATE rtest SET val='new' FROM test t2 WHERE rtest.id=t2.id AND t2.val = 'nonexist';
TRAP: FailedAssertion("!(((((const Node*)(fscan))->type) == T_ForeignScan))", File: "postgres_fdw.c", Line: 1526)
The connection to the server was lost. Attempting reset: Failed.
!>

I gave up testing the functionality after that.

> So, let's back to the main topic of this patch.
> According to the upthread discussion, I renamed the interface to inform
> expected width of result set as follows:
>
> +typedef AttrNumber (*GetForeignRelWidth_function) (PlannerInfo *root,
> + RelOptInfo *baserel,
> + Relation foreignrel,
> + bool inhparent,
> + List *targetList);
>
> It informs the core how many slots for regular and pseudo columns shall
> be acquired. If it is identical with number of attributed in foreign table
> definition, it also means this scan does not use any pseudo columns.
> A typical use case of pseudo column is "rowid" to move an identifier of
> remote row from scan stage to modify stage. It is responsibility of FDW
> driver to ensure "rowid" has uniqueness on the remote side; my
> enhancement on postgres_fdw uses ctid.
>
> get_pseudo_rowid_column() is a utility function that picks up an attribute
> number of pseudo "rowid" column if query rewriter injected on previous
> stage. If FDW does not support any other pseudo column features, the
> value to be returned is just return-value of this function.

Thanks, I think this will make the FDW writer's work easier.

> Other relevant APIs are as follows:
>
> +typedef List *(*PlanForeignModify_function) (PlannerInfo *root,
> + ModifyTable *plan,
> + Index resultRelation,
> + Plan *subplan);
> +
> I newly added this handler on construction of ModifyTable structure.
> Because INSERT command does not have scan stage directly connected
> with table modification, FDW driver has no chance to construct its private
> stuff relevant to table modification. (In case postgres_fdw, it constructs
> the second query to modify remote table with/without given ctid.)
> Its returned List * value is moved to BeginForeignModify handler as
> third argument.

So, in the case of an INSERT, GetForeignPlan and friends are not
called? Then such a functionality is indeed desirable.

> +typedef void (*BeginForeignModify_function) (ModifyTableState *mtstate,
> + ResultRelInfo *resultRelInfo,
> + List *fdw_private,
> + Plan *subplan,
> + int eflags);
> I adjusted some arguments to reference fdw_private being constructed
> on query plan stage. The role of this handler is not changed. FDW driver
> should have all the initialization stuff on this handler, like we are doing at
> BeginForeignScan.

I wonder about the PlanForeignModify/BeginForeignModify pair.
It is quite different from the "Scan" functions.

Would it make sense to invent a ForeignModifyTableState that has
the fdw_private information included, similar to ForeignScanState?
It might make the API more homogenous.
But maybe that's overkill.

I took a brief look at the documentation; that will need some more
work.

Yours,
Laurenz Albe

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-08 20:40:37
Message-ID:	CADyhKSUGX527ORWmcVCzrAy9mVZ6ViPiZ1-7-TsXVQ48rt9Fxg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/12/7 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>> The attached patch is revised version.
>>
>> One most difference from the previous version is, it constructed
>> PoC features on Hanada-san's latest postgres-fdw.v5.patch.
>> Yesh, it looks to me working fine on RDBMS backend also.
>
> Cool.
>
>> Even though the filename of this patch contains "poc" phrase,
>> I think it may be time to consider adoption of the core regarding
>> to the interface portion.
>> (Of course, postgres_fdw is still works in progress.)
>
> Ok, I'll try to review it with regard to that.
>
>> Here is a few operation examples.
> [...]
>> postgres=# INSERT INTO tbl VALUES (7,'ggg'),(8,'hhh');
>> INSERT 0 2
>
> Weird, that fails for me.
>
> CREATE TABLE test(
> id integer PRIMARY KEY,
> val text NOT NULL
> );
>
> CREATE FOREIGN TABLE rtest(
> id integer not null,
> val text not null
> ) SERVER loopback OPTIONS (nspname 'laurenz', relname 'test');
>
> test=> INSERT INTO test(id, val) VALUES (1, 'one');
> INSERT 0 1
> test=> INSERT INTO rtest(id, val) VALUES (2, 'two');
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> Here is the stack trace:
> 317 RangeTblEntry *rte = root->simple_rte_array[rtindex];
> #0 0x00231cb9 in deparseInsertSql (buf=0xbfffafb0, root=0x9be3614, rtindex=1) at deparse.c:317
> #1 0x0023018c in postgresPlanForeignModify (root=0x9be3614, plan=0x9be3278, resultRelaion=1, subplan=0x9be3bec)
> at postgres_fdw.c:1538
> #2 0x082a3ac2 in make_modifytable (root=0x9be3614, operation=CMD_INSERT, canSetTag=1 '\001', resultRelations=0x9be3c7c,
> subplans=0x9be3c4c, returningLists=0x0, rowMarks=0x0, epqParam=0) at createplan.c:4787
> #3 0x082a7ada in subquery_planner (glob=0x9be357c, parse=0x9be3304, parent_root=0x0, hasRecursion=0 '\0', tuple_fraction=0,
> subroot=0xbfffb0ec) at planner.c:574
> #4 0x082a71dc in standard_planner (parse=0x9be3304, cursorOptions=0, boundParams=0x0) at planner.c:200
> #5 0x082a707b in planner (parse=0x9be3304, cursorOptions=0, boundParams=0x0) at planner.c:129
> #6 0x0832c14c in pg_plan_query (querytree=0x9be3304, cursorOptions=0, boundParams=0x0) at postgres.c:753
> #7 0x0832c1ec in pg_plan_queries (querytrees=0x9be3a98, cursorOptions=0, boundParams=0x0) at postgres.c:812
> #8 0x0832c46e in exec_simple_query (query_string=0x9be267c "INSERT INTO rtest(id, val) VALUES (2, 'two');") at postgres.c:977
>
> (gdb) print root->simple_rte_array
> $1 = (RangeTblEntry **) 0x0
>
The reason why simple_rte_array was not initialized is, invocation of
setup_simple_rel_arrays() was skipped in case of simple INSERT INTO
... VALUES(). So, I put setup_simple_rel_arrays() just after short-cut
code path at query_planner().

> The bug I reported in my last review does not seem to be fixed, either.
> The symptoms are different now (with the definitions from above):
>
> test=> UPDATE rtest SET val='new' FROM rtest t2 WHERE rtest.id=t2.id AND t2.val LIKE '%e';
> TRAP: FailedAssertion("!(baserel->relid == var->varno)", File: "foreign.c", Line: 601)
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> The same happens for DELETE ... USING.
>
> A different failure happens if I join with a local table:
>
> test=> UPDATE rtest SET val='new' FROM test t2 WHERE rtest.id=t2.id AND t2.val = 'nonexist';
> TRAP: FailedAssertion("!(((((const Node*)(fscan))->type) == T_ForeignScan))", File: "postgres_fdw.c", Line: 1526)
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> I gave up testing the functionality after that.
>
The reason of this assertion trap is, I incorrectly assumed subplan of
ModifyTable is always ForeignScan. However, it should be picked up
it from the subplan tree using recursive node walking.
I newly added lookup_foreign_scan_plan() not to reinvent same logic
for each extension. It returns ForeignScan node that has supplied
rtindex, even if the top of subplan tree is not ForeignScan.

Both of the troubles you reported was fixed with attached patch.
I also added relevant test cases into regression test, please check it.

>> So, let's back to the main topic of this patch.
>> According to the upthread discussion, I renamed the interface to inform
>> expected width of result set as follows:
>>
>> +typedef AttrNumber (*GetForeignRelWidth_function) (PlannerInfo *root,
>> + RelOptInfo *baserel,
>> + Relation foreignrel,
>> + bool inhparent,
>> + List *targetList);
>>
>> It informs the core how many slots for regular and pseudo columns shall
>> be acquired. If it is identical with number of attributed in foreign table
>> definition, it also means this scan does not use any pseudo columns.
>> A typical use case of pseudo column is "rowid" to move an identifier of
>> remote row from scan stage to modify stage. It is responsibility of FDW
>> driver to ensure "rowid" has uniqueness on the remote side; my
>> enhancement on postgres_fdw uses ctid.
>>
>> get_pseudo_rowid_column() is a utility function that picks up an attribute
>> number of pseudo "rowid" column if query rewriter injected on previous
>> stage. If FDW does not support any other pseudo column features, the
>> value to be returned is just return-value of this function.
>
> Thanks, I think this will make the FDW writer's work easier.
>
I adjusted the code a bit to work this routine correctly, even if a query
contains two or more foreign table references.

>> Other relevant APIs are as follows:
>>
>> +typedef List *(*PlanForeignModify_function) (PlannerInfo *root,
>> + ModifyTable *plan,
>> + Index resultRelation,
>> + Plan *subplan);
>> +
>> I newly added this handler on construction of ModifyTable structure.
>> Because INSERT command does not have scan stage directly connected
>> with table modification, FDW driver has no chance to construct its private
>> stuff relevant to table modification. (In case postgres_fdw, it constructs
>> the second query to modify remote table with/without given ctid.)
>> Its returned List * value is moved to BeginForeignModify handler as
>> third argument.
>
> So, in the case of an INSERT, GetForeignPlan and friends are not
> called? Then such a functionality is indeed desirable.
>
Yes. In case of INSERT, its resultRelation is not appeared in the fromList,
thus, no chance to create relevant scan plan towards the table to be inserted.

>> +typedef void (*BeginForeignModify_function) (ModifyTableState *mtstate,
>> + ResultRelInfo *resultRelInfo,
>> + List *fdw_private,
>> + Plan *subplan,
>> + int eflags);
>> I adjusted some arguments to reference fdw_private being constructed
>> on query plan stage. The role of this handler is not changed. FDW driver
>> should have all the initialization stuff on this handler, like we are doing at
>> BeginForeignScan.
>
> I wonder about the PlanForeignModify/BeginForeignModify pair.
> It is quite different from the "Scan" functions.
>
> Would it make sense to invent a ForeignModifyTableState that has
> the fdw_private information included, similar to ForeignScanState?
> It might make the API more homogenous.
> But maybe that's overkill.
>
The origin of this asymmetry is ModifyTableState may affect multiple
tables if result relation has inheritance, even though foreign table does
not support inheritance feature right now.
Please see the entrypoint to invoke BeginForeignModify handler.
It is in a loop block that initializes executor state being associated with
thie ModifyTableState.

Even if we have symmetric definition, the variable "i" need to be delivered
to identify which executor state is under initialization at least.
The relevant resultRelInfo, fdwprivate, and subplan can be picked up
from the ModifyTableState using this index. But it enforces extensions
to implement same logic with individual code. My preference is to pick
up necessary information at the code side.

> I took a brief look at the documentation; that will need some more
> work.
>
Yep, I believe it should be revised prior to this patch being committed.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v6.patch	application/octet-stream	154.1 KB

From:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Alexander Korotkov EXTERN <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-11 15:38:48
Message-ID:	A737B7A37273E048B164557ADEF4A58B05788237@ntex2010i.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
>> Weird, that fails for me.

> Both of the troubles you reported was fixed with attached patch.
> I also added relevant test cases into regression test, please check it.

It passes the regression tests, and solves the problems I found.

I came up with one more query that causes a problem:

CREATE TABLE test(
id integer PRIMARY KEY,
val text NOT NULL
);

INSERT INTO test(id, val) VALUES (1, 'one');

CREATE FOREIGN TABLE rtest(
id integer not null,
val text not null
) SERVER loopback OPTIONS (relname 'test');

/* loopback points to the same database */

WITH ch AS (
UPDATE test
SET val='changed'
RETURNING id
) UPDATE rtest
SET val='new'
FROM ch
WHERE rtest.id = ch.id;

This causes a deadlock, but one that is not detected;
the query just keeps hanging.

The UPDATE in the CTE has the rows locked, so the
SELECT ... FOR UPDATE issued via the FDW connection will hang
indefinitely.

I wonder if that's just a pathological corner case that we shouldn't
worry about. Loopback connections for FDWs themselves might not
be so rare, for example as a substitute for autonomous subtransactions.

I guess it is not easily possible to detect such a situation or
to do something reasonable about it.

>> I took a brief look at the documentation; that will need some more
>> work.
>
> Yep, I believe it should be revised prior to this patch being committed.

I tried to overhaul the documentation, see the attached patch.

There was one thing that I was not certain of:
You say that for writable foreign tables, BeginForeignModify
and EndForeignModify *must* be implemented.
I thought that these were optional, and if you can do your work
with just, say, ExecForeignDelete, you could do that.

I left that paragraph roughly as it is, please change it if
appropriate.

I also changed the misspelled "resultRelaion" and updated a
few comments.

May I suggest to split the patch in two parts, one for
all the parts that affect postgres_fdw and one for the rest?
That might make the committer's work easier, since
postgres_fdw is not applied (yet).

Yours,
Laurenz Albe

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v7.patch	application/octet-stream	170.1 KB

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-12 13:45:04
Message-ID:	CADyhKSWWerCNpPfqSxwY6XykAmWA9FnJcut8tadD8roO-0f9OQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2012/12/11 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>>> Weird, that fails for me.
>
>> Both of the troubles you reported was fixed with attached patch.
>> I also added relevant test cases into regression test, please check it.
>
> It passes the regression tests, and solves the problems I found.
>
> I came up with one more query that causes a problem:
>
> CREATE TABLE test(
> id integer PRIMARY KEY,
> val text NOT NULL
> );
>
> INSERT INTO test(id, val) VALUES (1, 'one');
>
> CREATE FOREIGN TABLE rtest(
> id integer not null,
> val text not null
> ) SERVER loopback OPTIONS (relname 'test');
>
> /* loopback points to the same database */
>
> WITH ch AS (
> UPDATE test
> SET val='changed'
> RETURNING id
> ) UPDATE rtest
> SET val='new'
> FROM ch
> WHERE rtest.id = ch.id;
>
> This causes a deadlock, but one that is not detected;
> the query just keeps hanging.
>
> The UPDATE in the CTE has the rows locked, so the
> SELECT ... FOR UPDATE issued via the FDW connection will hang
> indefinitely.
>
> I wonder if that's just a pathological corner case that we shouldn't
> worry about. Loopback connections for FDWs themselves might not
> be so rare, for example as a substitute for autonomous subtransactions.
>
> I guess it is not easily possible to detect such a situation or
> to do something reasonable about it.
>
It is not avoidable problem due to the nature of distributed database system,
not only loopback scenario.

In my personal opinion, I'd like to wait for someone implements distributed
lock/transaction manager on top of the background worker framework being
recently committed, to intermediate lock request.
However, it will take massive amount of efforts to existing lock/transaction
management layer, not only enhancement of FDW APIs. It is obviously out
of scope in this patch.

So, I'd like to suggest authors of FDW that support writable features to put
mention about possible deadlock scenario in their documentation.
At least, above writable CTE example is a situation that two different sessions
concurrently update the "test" relation, thus, one shall be blocked.

>>> I took a brief look at the documentation; that will need some more
>>> work.
>>
>> Yep, I believe it should be revised prior to this patch being committed.
>
> I tried to overhaul the documentation, see the attached patch.
>
> There was one thing that I was not certain of:
> You say that for writable foreign tables, BeginForeignModify
> and EndForeignModify *must* be implemented.
> I thought that these were optional, and if you can do your work
> with just, say, ExecForeignDelete, you could do that.
>
Yes, that's right. What I wrote was incorrect.
If FDW driver does not require any state during modification of
foreign tables, indeed, these are not mandatory handler.

> I left that paragraph roughly as it is, please change it if
> appropriate.
>
> I also changed the misspelled "resultRelaion" and updated a
> few comments.
>
> May I suggest to split the patch in two parts, one for
> all the parts that affect postgres_fdw and one for the rest?
> That might make the committer's work easier, since
> postgres_fdw is not applied (yet).
>
OK. I split the patch into two portion, part-1 is the APIs relevant
patch, part-2 is relevant to postgres_fdw patch.

In addition, I adjusted the APIs a bit.
Even though the rowid pseudo-column is constructed as cstring
data type, the API delivered it as Datum type. We have no reason
why not to case prior to invoke Update or Delete handler.
At the beginning, I constructed the rowid pseudo-column using
INTERNALOID, but it made troubles when fetched tuples are
materialized prior to table modification.
So, the "rowid" argument of them are re-defined as "const char *".

Also, I found a possible bug when EXPLAIN command prints plan
tree with qualifiers that reference pseudo-columns. In this case,
it raise an error with "invalid attnum %d ...".
So, I adjusted the logic to reference eref alias when get_attname
returns NULL towards foreign-tables. It also required FDW driver
also need to set up rte->eref alias for each pseudo-column.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v8.part-2.patch	application/octet-stream	109.6 KB
pgsql-v9.3-writable-fdw-poc.v8.part-1.patch	application/octet-stream	51.1 KB

From:	"Erik Rijkers" <er(at)xs4all(dot)nl>
To:	"Kohei KaiGai" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	"Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "Atri Sharma" <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "PgHacker" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-12 14:44:35
Message-ID:	c54efc622f20ef0e6f3e4a79d8caa0cc.squirrel@webmail.xs4all.nl
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, December 12, 2012 14:45, Kohei KaiGai wrote:

> pgsql-v9.3-writable-fdw-poc.v8.part-2.patch 151 k
> pgsql-v9.3-writable-fdw-poc.v8.part-1.patch 70 k

I wanted to have a look at this, and tried to apply part 1, en then part 2 on top of that (that's
the idea, right?)

part 1 applies and then part 2 does not.

Erik Rijkers

From:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	Erik Rijkers EXTERN <er(at)xs4all(dot)nl>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Alexander Korotkov EXTERN <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-12 14:54:46
Message-ID:	A737B7A37273E048B164557ADEF4A58B05788B3B@ntex2010i.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Erik Rijkers wrote:
>> pgsql-v9.3-writable-fdw-poc.v8.part-2.patch 151 k
>> pgsql-v9.3-writable-fdw-poc.v8.part-1.patch 70 k
>
> I wanted to have a look at this, and tried to apply part 1, en then part 2 on top of that (that's
> the idea, right?)
>
> part 1 applies and then part 2 does not.

Part 2 needs this prerequisite:
http://archives.postgresql.org/message-id/CAEZqfEcQtxn1JSjhC5usqhL4_n+Zck3mqo=RzEDFpz+DAWfF_g@mail.gmail.com

Yours,
Laurenz Albe

From:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Alexander Korotkov EXTERN <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-14 14:31:07
Message-ID:	A737B7A37273E048B164557ADEF4A58B057895F7@ntex2010i.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai wrote:
>> I came up with one more query that causes a problem:
[...]
>> This causes a deadlock, but one that is not detected;
>> the query just keeps hanging.
>>
>> The UPDATE in the CTE has the rows locked, so the
>> SELECT ... FOR UPDATE issued via the FDW connection will hang
>> indefinitely.
>>
>> I wonder if that's just a pathological corner case that we shouldn't
>> worry about. Loopback connections for FDWs themselves might not
>> be so rare, for example as a substitute for autonomous subtransactions.
>>
>> I guess it is not easily possible to detect such a situation or
>> to do something reasonable about it.
>
> It is not avoidable problem due to the nature of distributed database system,
> not only loopback scenario.
>
> In my personal opinion, I'd like to wait for someone implements distributed
> lock/transaction manager on top of the background worker framework being
> recently committed, to intermediate lock request.
> However, it will take massive amount of efforts to existing lock/transaction
> management layer, not only enhancement of FDW APIs. It is obviously out
> of scope in this patch.
>
> So, I'd like to suggest authors of FDW that support writable features to put
> mention about possible deadlock scenario in their documentation.
> At least, above writable CTE example is a situation that two different sessions
> concurrently update the "test" relation, thus, one shall be blocked.

Fair enough.

>> I tried to overhaul the documentation, see the attached patch.
>>
>> There was one thing that I was not certain of:
>> You say that for writable foreign tables, BeginForeignModify
>> and EndForeignModify *must* be implemented.
>> I thought that these were optional, and if you can do your work
> with just, say, ExecForeignDelete, you could do that.
>
> Yes, that's right. What I wrote was incorrect.
> If FDW driver does not require any state during modification of
> foreign tables, indeed, these are not mandatory handler.

I have updated the documentation, that was all I changed in the
attached patches.

> OK. I split the patch into two portion, part-1 is the APIs relevant
> patch, part-2 is relevant to postgres_fdw patch.

Great.

I'll mark the patch as "ready for committer".

Yours,
Laurenz Albe

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v9.part-2.patch	application/octet-stream	109.2 KB
pgsql-v9.3-writable-fdw-poc.v9.part-1.patch	application/octet-stream	49.5 KB

From:	Ronan Dunklau <rdunklau(at)gmail(dot)com>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	"Kohei KaiGai EXTERN" <kaigai(at)kaigai(dot)gr(dot)jp>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-18 16:28:17
Message-ID:	CAJWq4=Y6a5g0tif2xLq7V429QYbcRJ9opxQh8BeeqnYg0mokDw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello.

I've tried to implement this API for our Multicorn FDW, and I have a few
questions about the API.

First, I don't understand what are the requirements on the "rowid"
pseudo-column values.

In particular, this sentence from a previous mail makes it ambiguous to me:

> At the beginning, I constructed the rowid pseudo-column using
> INTERNALOID, but it made troubles when fetched tuples are
> materialized prior to table modification.
> So, the "rowid" argument of them are re-defined as "const char *".

Does that mean that one should only store a cstring in the rowid
pseudo-column ? Or can one store an arbitrary pointer ? Currently, I'm
building a text value using cstring_to_text_with_len. Could there be a
problem with that ?

Secondly, how does one use a returning clause ?
I've tried to look at the postgres_fdw code, but it does not seem to handle
that.

For what its worth, knowing that the postgres_fdw is still in WIP status,
here is a couple result of my experimentation with it:

- Insert into a foreign table mapped to a table with a "before" trigger,
using a returning clause, will return the values as they were before being
modified by the trigger.
- Same thing, but if the trigger prevents insertion (returning NULL), the
"would-have-been" inserted row is still returned, although the insert
statement reports zero rows.
- Inserting into a table with default values does not work as intended,
since missing values are replaced by a null value in the remote statement.

What can be done to make the behaviour more consistent ?

I'm very excited about this feature, thank you for making this possible.

Regards,
--
Ronan Dunklau

2012/12/14 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>

> Kohei KaiGai wrote:
> >> I came up with one more query that causes a problem:
> [...]
> >> This causes a deadlock, but one that is not detected;
> >> the query just keeps hanging.
> >>
> >> The UPDATE in the CTE has the rows locked, so the
> >> SELECT ... FOR UPDATE issued via the FDW connection will hang
> >> indefinitely.
> >>
> >> I wonder if that's just a pathological corner case that we shouldn't
> >> worry about. Loopback connections for FDWs themselves might not
> >> be so rare, for example as a substitute for autonomous subtransactions.
> >>
> >> I guess it is not easily possible to detect such a situation or
> >> to do something reasonable about it.
> >
> > It is not avoidable problem due to the nature of distributed database
> system,
> > not only loopback scenario.
> >
> > In my personal opinion, I'd like to wait for someone implements
> distributed
> > lock/transaction manager on top of the background worker framework being
> > recently committed, to intermediate lock request.
> > However, it will take massive amount of efforts to existing
> lock/transaction
> > management layer, not only enhancement of FDW APIs. It is obviously out
> > of scope in this patch.
> >
> > So, I'd like to suggest authors of FDW that support writable features to
> put
> > mention about possible deadlock scenario in their documentation.
> > At least, above writable CTE example is a situation that two different
> sessions
> > concurrently update the "test" relation, thus, one shall be blocked.
>
> Fair enough.
>
> >> I tried to overhaul the documentation, see the attached patch.
> >>
> >> There was one thing that I was not certain of:
> >> You say that for writable foreign tables, BeginForeignModify
> >> and EndForeignModify *must* be implemented.
> >> I thought that these were optional, and if you can do your work
> > with just, say, ExecForeignDelete, you could do that.
> >
> > Yes, that's right. What I wrote was incorrect.
> > If FDW driver does not require any state during modification of
> > foreign tables, indeed, these are not mandatory handler.
>
> I have updated the documentation, that was all I changed in the
> attached patches.
>
> > OK. I split the patch into two portion, part-1 is the APIs relevant
> > patch, part-2 is relevant to postgres_fdw patch.
>
> Great.
>
> I'll mark the patch as "ready for committer".
>
> Yours,
> Laurenz Albe
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Ronan Dunklau <rdunklau(at)gmail(dot)com>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-18 21:39:55
Message-ID:	CADyhKSW1VFDZA_1Cs=_+DwT5+qgTd4YbJjBBAUxdA=bSdkTCVg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

2012/12/18 Ronan Dunklau <rdunklau(at)gmail(dot)com>:
> Hello.
>
> I've tried to implement this API for our Multicorn FDW, and I have a few
> questions about the API.
>
> First, I don't understand what are the requirements on the "rowid"
> pseudo-column values.
>
> In particular, this sentence from a previous mail makes it ambiguous to me:
>
>
>> At the beginning, I constructed the rowid pseudo-column using
>> INTERNALOID, but it made troubles when fetched tuples are
>> materialized prior to table modification.
>> So, the "rowid" argument of them are re-defined as "const char *".
>
> Does that mean that one should only store a cstring in the rowid
> pseudo-column ? Or can one store an arbitrary pointer ? Currently, I'm
> building a text value using cstring_to_text_with_len. Could there be a
> problem with that ?
>
All what we require on the rowid pseudo-column is it has capability to
identify a particular row on the remote side. In case of postgres_fdw,
ctid of the relevant table is sufficient for the purpose.
I don't recommend to set the rowid field an arbitrary pointer, because
scanned tuple may be materialized between scanning and modifying
foreign table, thus, the arbitrary pointer shall be dealt under the
assumption of cstring data type.

> Secondly, how does one use a returning clause ?
> I've tried to look at the postgres_fdw code, but it does not seem to handle
> that.
>
> For what its worth, knowing that the postgres_fdw is still in WIP status,
> here is a couple result of my experimentation with it:
>
> - Insert into a foreign table mapped to a table with a "before" trigger,
> using a returning clause, will return the values as they were before being
> modified by the trigger.
> - Same thing, but if the trigger prevents insertion (returning NULL), the
> "would-have-been" inserted row is still returned, although the insert
> statement reports zero rows.
>
Hmm. Do you want to see the "real" final contents of the rows being inserted,
don't you. Yep, the proposed interface does not have capability to modify
the supplied tuple on ExecForeignInsert or other methods.

Probably, it needs to adjust interface to allow FDW drivers to return
a modified HeapTuple or TupleTableSlot for RETURNING calculation.
(As trigger doing, it can return the given one as-is if no change)

Let me investigate the code around this topic.

> - Inserting into a table with default values does not work as intended,
> since missing values are replaced by a null value in the remote statement.
>
It might be a bug of my proof-of-concept portion at postgres_fdw.
The prepared INSERT statement should list up columns being actually
used only, not all ones unconditionally.

Thanks,

> What can be done to make the behaviour more consistent ?
>
> I'm very excited about this feature, thank you for making this possible.
>
> Regards,
> --
> Ronan Dunklau
>
> 2012/12/14 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
>>
>> Kohei KaiGai wrote:
>> >> I came up with one more query that causes a problem:
>> [...]
>> >> This causes a deadlock, but one that is not detected;
>> >> the query just keeps hanging.
>> >>
>> >> The UPDATE in the CTE has the rows locked, so the
>> >> SELECT ... FOR UPDATE issued via the FDW connection will hang
>> >> indefinitely.
>> >>
>> >> I wonder if that's just a pathological corner case that we shouldn't
>> >> worry about. Loopback connections for FDWs themselves might not
>> >> be so rare, for example as a substitute for autonomous subtransactions.
>> >>
>> >> I guess it is not easily possible to detect such a situation or
>> >> to do something reasonable about it.
>> >
>> > It is not avoidable problem due to the nature of distributed database
>> > system,
>> > not only loopback scenario.
>> >
>> > In my personal opinion, I'd like to wait for someone implements
>> > distributed
>> > lock/transaction manager on top of the background worker framework being
>> > recently committed, to intermediate lock request.
>> > However, it will take massive amount of efforts to existing
>> > lock/transaction
>> > management layer, not only enhancement of FDW APIs. It is obviously out
>> > of scope in this patch.
>> >
>> > So, I'd like to suggest authors of FDW that support writable features to
>> > put
>> > mention about possible deadlock scenario in their documentation.
>> > At least, above writable CTE example is a situation that two different
>> > sessions
>> > concurrently update the "test" relation, thus, one shall be blocked.
>>
>> Fair enough.
>>
>> >> I tried to overhaul the documentation, see the attached patch.
>> >>
>> >> There was one thing that I was not certain of:
>> >> You say that for writable foreign tables, BeginForeignModify
>> >> and EndForeignModify *must* be implemented.
>> >> I thought that these were optional, and if you can do your work
>> > with just, say, ExecForeignDelete, you could do that.
>> >
>> > Yes, that's right. What I wrote was incorrect.
>> > If FDW driver does not require any state during modification of
>> > foreign tables, indeed, these are not mandatory handler.
>>
>> I have updated the documentation, that was all I changed in the
>> attached patches.
>>
>> > OK. I split the patch into two portion, part-1 is the APIs relevant
>> > patch, part-2 is relevant to postgres_fdw patch.
>>
>> Great.
>>
>> I'll mark the patch as "ready for committer".
>>
>> Yours,
>> Laurenz Albe
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Ronan Dunklau <rdunklau(at)gmail(dot)com>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-23 11:30:59
Message-ID:	CADyhKSXnbm1kvn9H7SxVxckuJn+fLPcY9SVZJWjimk4zaHPYkg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

The attached patches are revised version for before-row triggers or
default setting.

I adjusted APIs to allow FDW drivers to modify the supplied slot
according to the row
being actually inserted / updated on the remote side.
Now, relevant handlers are declared as follows:

TupleTableSlot *
ExecForeignInsert (ResultRelInfo *resultRelInfo,
TupleTableSlot *slot);

TupleTableSlot *
ExecForeignUpdate (ResultRelInfo *resultRelInfo,
const char *rowid,
TupleTableSlot *slot);

bool
ExecForeignDelete (ResultRelInfo *resultRelInfo,
const char *rowid);

If nothing were modified, insert or update handler usually returns the supplied
slot as is. Elsewhere, it can return a modified image of tuple according to the
result of remote query execution, including default setting or
before-row trigger.

Let's see the following examples with enhanced postgres_fdw.

postgres=# CREATE FOREIGN TABLE ft1 (a int, b text, c date) SERVER
loopback OPTIONS(relname 't1');
CREATE FOREIGN TABLE
postgres=# SELECT * FROM ft1;
a | b | c
---+-----+------------
1 | aaa | 2012-12-10
2 | bbb | 2012-12-15
3 | ccc | 2012-12-20
(3 rows)

==> ft1 is a foreign table behalf of t1 on loopback server.

postgres=# ALTER TABLE t1 ALTER c SET DEFAULT current_date;
ALTER TABLE
postgres=# CREATE OR REPLACE FUNCTION t1_trig_func() RETURNS trigger AS $$
postgres$# BEGIN
postgres$# NEW.b = NEW.b || '_trig_update';
postgres$# RETURN NEW;
postgres$# END;
postgres$# $$ LANGUAGE plpgsql;
CREATE FUNCTION
postgres=# CREATE TRIGGER t1_br_trig BEFORE INSERT OR UPDATE ON t1 FOR
EACH ROW EXECUTE PROCEDURE t1_trig_func();
CREATE TRIGGER

==> The "t1" table has default setting and before-row triggers.

postgres=# INSERT INTO ft1 VALUES (4,'ddd','2012-12-05'),
(5,'eee',null) RETURNING *;
a | b | c
---+-----------------+------------
4 | ddd_trig_update | 2012-12-05
5 | eee_trig_update |
(2 rows)

INSERT 0 2
postgres=# INSERT INTO ft1 VALUES (6, 'fff') RETURNING *;
a | b | c
---+-----------------+------------
6 | fff_trig_update | 2012-12-23
(1 row)

INSERT 0 1

==> RETURNING clause can back modified image on the remote side.
Please note that the results are affected by default-setting and
before-row procedure on remote side.

postgres=# UPDATE ft1 SET a = a + 100 RETURNING *;
a | b | c
-----+-----------------------------+------------
101 | aaa_trig_update | 2012-12-10
102 | bbb_trig_update | 2012-12-15
103 | ccc_trig_update | 2012-12-20
104 | ddd_trig_update_trig_update | 2012-12-05
105 | eee_trig_update_trig_update |
106 | fff_trig_update_trig_update | 2012-12-23
(6 rows)

UPDATE 6

==> update command also

The modified API lost a feature to return the number of rows being affected
that might be useful in case when underlying query multiple rows on the
remote side. However, I rethought it does not really make sense.
If FDW driver support a feature to push down whole UPDATE or DELETE
command towards simple queries, it is not so difficult to cache the result
of the query that affect multiple rows, then the relevant routine can pop
the cached result for each invocation without remote query execution on
demand.

Thanks,

2012/12/18 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> Hi,
>
> 2012/12/18 Ronan Dunklau <rdunklau(at)gmail(dot)com>:
>> Hello.
>>
>> I've tried to implement this API for our Multicorn FDW, and I have a few
>> questions about the API.
>>
>> First, I don't understand what are the requirements on the "rowid"
>> pseudo-column values.
>>
>> In particular, this sentence from a previous mail makes it ambiguous to me:
>>
>>
>>> At the beginning, I constructed the rowid pseudo-column using
>>> INTERNALOID, but it made troubles when fetched tuples are
>>> materialized prior to table modification.
>>> So, the "rowid" argument of them are re-defined as "const char *".
>>
>> Does that mean that one should only store a cstring in the rowid
>> pseudo-column ? Or can one store an arbitrary pointer ? Currently, I'm
>> building a text value using cstring_to_text_with_len. Could there be a
>> problem with that ?
>>
> All what we require on the rowid pseudo-column is it has capability to
> identify a particular row on the remote side. In case of postgres_fdw,
> ctid of the relevant table is sufficient for the purpose.
> I don't recommend to set the rowid field an arbitrary pointer, because
> scanned tuple may be materialized between scanning and modifying
> foreign table, thus, the arbitrary pointer shall be dealt under the
> assumption of cstring data type.
>
>> Secondly, how does one use a returning clause ?
>> I've tried to look at the postgres_fdw code, but it does not seem to handle
>> that.
>>
>> For what its worth, knowing that the postgres_fdw is still in WIP status,
>> here is a couple result of my experimentation with it:
>>
>> - Insert into a foreign table mapped to a table with a "before" trigger,
>> using a returning clause, will return the values as they were before being
>> modified by the trigger.
>> - Same thing, but if the trigger prevents insertion (returning NULL), the
>> "would-have-been" inserted row is still returned, although the insert
>> statement reports zero rows.
>>
> Hmm. Do you want to see the "real" final contents of the rows being inserted,
> don't you. Yep, the proposed interface does not have capability to modify
> the supplied tuple on ExecForeignInsert or other methods.
>
> Probably, it needs to adjust interface to allow FDW drivers to return
> a modified HeapTuple or TupleTableSlot for RETURNING calculation.
> (As trigger doing, it can return the given one as-is if no change)
>
> Let me investigate the code around this topic.
>
>> - Inserting into a table with default values does not work as intended,
>> since missing values are replaced by a null value in the remote statement.
>>
> It might be a bug of my proof-of-concept portion at postgres_fdw.
> The prepared INSERT statement should list up columns being actually
> used only, not all ones unconditionally.
>
> Thanks,
>
>> What can be done to make the behaviour more consistent ?
>>
>> I'm very excited about this feature, thank you for making this possible.
>>
>> Regards,
>> --
>> Ronan Dunklau
>>
>> 2012/12/14 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
>>>
>>> Kohei KaiGai wrote:
>>> >> I came up with one more query that causes a problem:
>>> [...]
>>> >> This causes a deadlock, but one that is not detected;
>>> >> the query just keeps hanging.
>>> >>
>>> >> The UPDATE in the CTE has the rows locked, so the
>>> >> SELECT ... FOR UPDATE issued via the FDW connection will hang
>>> >> indefinitely.
>>> >>
>>> >> I wonder if that's just a pathological corner case that we shouldn't
>>> >> worry about. Loopback connections for FDWs themselves might not
>>> >> be so rare, for example as a substitute for autonomous subtransactions.
>>> >>
>>> >> I guess it is not easily possible to detect such a situation or
>>> >> to do something reasonable about it.
>>> >
>>> > It is not avoidable problem due to the nature of distributed database
>>> > system,
>>> > not only loopback scenario.
>>> >
>>> > In my personal opinion, I'd like to wait for someone implements
>>> > distributed
>>> > lock/transaction manager on top of the background worker framework being
>>> > recently committed, to intermediate lock request.
>>> > However, it will take massive amount of efforts to existing
>>> > lock/transaction
>>> > management layer, not only enhancement of FDW APIs. It is obviously out
>>> > of scope in this patch.
>>> >
>>> > So, I'd like to suggest authors of FDW that support writable features to
>>> > put
>>> > mention about possible deadlock scenario in their documentation.
>>> > At least, above writable CTE example is a situation that two different
>>> > sessions
>>> > concurrently update the "test" relation, thus, one shall be blocked.
>>>
>>> Fair enough.
>>>
>>> >> I tried to overhaul the documentation, see the attached patch.
>>> >>
>>> >> There was one thing that I was not certain of:
>>> >> You say that for writable foreign tables, BeginForeignModify
>>> >> and EndForeignModify *must* be implemented.
>>> >> I thought that these were optional, and if you can do your work
>>> > with just, say, ExecForeignDelete, you could do that.
>>> >
>>> > Yes, that's right. What I wrote was incorrect.
>>> > If FDW driver does not require any state during modification of
>>> > foreign tables, indeed, these are not mandatory handler.
>>>
>>> I have updated the documentation, that was all I changed in the
>>> attached patches.
>>>
>>> > OK. I split the patch into two portion, part-1 is the APIs relevant
>>> > patch, part-2 is relevant to postgres_fdw patch.
>>>
>>> Great.
>>>
>>> I'll mark the patch as "ready for committer".
>>>
>>> Yours,
>>> Laurenz Albe
>>>
>>>
>>> --
>>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-hackers
>>>
>>
>
>
>
> --
> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v10.part-2.patch	application/octet-stream	129.0 KB
pgsql-v9.3-writable-fdw-poc.v10.part-1.patch	application/octet-stream	51.3 KB

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Ronan Dunklau <rdunklau(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-01-29 09:19:42
Message-ID:	CADyhKSXXcqED944s1x5_W+cWX8f+Pehk+5vtB4Awa2ARoWw6WQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I noticed the v10 patch cannot be applied on the latest master branch
cleanly. The attached ones were rebased.

I also noticed that deparse_expression() assumes relation never has
columns with attribute number equal or larger than number of columns
of the relation, thus it makes a problem when EXPLAIN shows a plan
tree including Var node towards pseudo-column.
So, I adjusted the code to construct column-name array being used
for EXPLAIN output.

I don't touch the part-2 portion that extends Hanada-san's postgres_fdw
v5 patch.

Thanks,

2012/12/23 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> The attached patches are revised version for before-row triggers or
> default setting.
>
> I adjusted APIs to allow FDW drivers to modify the supplied slot
> according to the row
> being actually inserted / updated on the remote side.
> Now, relevant handlers are declared as follows:
>
> TupleTableSlot *
> ExecForeignInsert (ResultRelInfo *resultRelInfo,
> TupleTableSlot *slot);
>
> TupleTableSlot *
> ExecForeignUpdate (ResultRelInfo *resultRelInfo,
> const char *rowid,
> TupleTableSlot *slot);
>
> bool
> ExecForeignDelete (ResultRelInfo *resultRelInfo,
> const char *rowid);
>
> If nothing were modified, insert or update handler usually returns the supplied
> slot as is. Elsewhere, it can return a modified image of tuple according to the
> result of remote query execution, including default setting or
> before-row trigger.
>
> Let's see the following examples with enhanced postgres_fdw.
>
> postgres=# CREATE FOREIGN TABLE ft1 (a int, b text, c date) SERVER
> loopback OPTIONS(relname 't1');
> CREATE FOREIGN TABLE
> postgres=# SELECT * FROM ft1;
> a | b | c
> ---+-----+------------
> 1 | aaa | 2012-12-10
> 2 | bbb | 2012-12-15
> 3 | ccc | 2012-12-20
> (3 rows)
>
> ==> ft1 is a foreign table behalf of t1 on loopback server.
>
> postgres=# ALTER TABLE t1 ALTER c SET DEFAULT current_date;
> ALTER TABLE
> postgres=# CREATE OR REPLACE FUNCTION t1_trig_func() RETURNS trigger AS $$
> postgres$# BEGIN
> postgres$# NEW.b = NEW.b || '_trig_update';
> postgres$# RETURN NEW;
> postgres$# END;
> postgres$# $$ LANGUAGE plpgsql;
> CREATE FUNCTION
> postgres=# CREATE TRIGGER t1_br_trig BEFORE INSERT OR UPDATE ON t1 FOR
> EACH ROW EXECUTE PROCEDURE t1_trig_func();
> CREATE TRIGGER
>
> ==> The "t1" table has default setting and before-row triggers.
>
> postgres=# INSERT INTO ft1 VALUES (4,'ddd','2012-12-05'),
> (5,'eee',null) RETURNING *;
> a | b | c
> ---+-----------------+------------
> 4 | ddd_trig_update | 2012-12-05
> 5 | eee_trig_update |
> (2 rows)
>
> INSERT 0 2
> postgres=# INSERT INTO ft1 VALUES (6, 'fff') RETURNING *;
> a | b | c
> ---+-----------------+------------
> 6 | fff_trig_update | 2012-12-23
> (1 row)
>
> INSERT 0 1
>
> ==> RETURNING clause can back modified image on the remote side.
> Please note that the results are affected by default-setting and
> before-row procedure on remote side.
>
> postgres=# UPDATE ft1 SET a = a + 100 RETURNING *;
> a | b | c
> -----+-----------------------------+------------
> 101 | aaa_trig_update | 2012-12-10
> 102 | bbb_trig_update | 2012-12-15
> 103 | ccc_trig_update | 2012-12-20
> 104 | ddd_trig_update_trig_update | 2012-12-05
> 105 | eee_trig_update_trig_update |
> 106 | fff_trig_update_trig_update | 2012-12-23
> (6 rows)
>
> UPDATE 6
>
> ==> update command also
>
> The modified API lost a feature to return the number of rows being affected
> that might be useful in case when underlying query multiple rows on the
> remote side. However, I rethought it does not really make sense.
> If FDW driver support a feature to push down whole UPDATE or DELETE
> command towards simple queries, it is not so difficult to cache the result
> of the query that affect multiple rows, then the relevant routine can pop
> the cached result for each invocation without remote query execution on
> demand.
>
> Thanks,
>
> 2012/12/18 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> Hi,
>>
>> 2012/12/18 Ronan Dunklau <rdunklau(at)gmail(dot)com>:
>>> Hello.
>>>
>>> I've tried to implement this API for our Multicorn FDW, and I have a few
>>> questions about the API.
>>>
>>> First, I don't understand what are the requirements on the "rowid"
>>> pseudo-column values.
>>>
>>> In particular, this sentence from a previous mail makes it ambiguous to me:
>>>
>>>
>>>> At the beginning, I constructed the rowid pseudo-column using
>>>> INTERNALOID, but it made troubles when fetched tuples are
>>>> materialized prior to table modification.
>>>> So, the "rowid" argument of them are re-defined as "const char *".
>>>
>>> Does that mean that one should only store a cstring in the rowid
>>> pseudo-column ? Or can one store an arbitrary pointer ? Currently, I'm
>>> building a text value using cstring_to_text_with_len. Could there be a
>>> problem with that ?
>>>
>> All what we require on the rowid pseudo-column is it has capability to
>> identify a particular row on the remote side. In case of postgres_fdw,
>> ctid of the relevant table is sufficient for the purpose.
>> I don't recommend to set the rowid field an arbitrary pointer, because
>> scanned tuple may be materialized between scanning and modifying
>> foreign table, thus, the arbitrary pointer shall be dealt under the
>> assumption of cstring data type.
>>
>>> Secondly, how does one use a returning clause ?
>>> I've tried to look at the postgres_fdw code, but it does not seem to handle
>>> that.
>>>
>>> For what its worth, knowing that the postgres_fdw is still in WIP status,
>>> here is a couple result of my experimentation with it:
>>>
>>> - Insert into a foreign table mapped to a table with a "before" trigger,
>>> using a returning clause, will return the values as they were before being
>>> modified by the trigger.
>>> - Same thing, but if the trigger prevents insertion (returning NULL), the
>>> "would-have-been" inserted row is still returned, although the insert
>>> statement reports zero rows.
>>>
>> Hmm. Do you want to see the "real" final contents of the rows being inserted,
>> don't you. Yep, the proposed interface does not have capability to modify
>> the supplied tuple on ExecForeignInsert or other methods.
>>
>> Probably, it needs to adjust interface to allow FDW drivers to return
>> a modified HeapTuple or TupleTableSlot for RETURNING calculation.
>> (As trigger doing, it can return the given one as-is if no change)
>>
>> Let me investigate the code around this topic.
>>
>>> - Inserting into a table with default values does not work as intended,
>>> since missing values are replaced by a null value in the remote statement.
>>>
>> It might be a bug of my proof-of-concept portion at postgres_fdw.
>> The prepared INSERT statement should list up columns being actually
>> used only, not all ones unconditionally.
>>
>> Thanks,
>>
>>> What can be done to make the behaviour more consistent ?
>>>
>>> I'm very excited about this feature, thank you for making this possible.
>>>
>>> Regards,
>>> --
>>> Ronan Dunklau
>>>
>>> 2012/12/14 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
>>>>
>>>> Kohei KaiGai wrote:
>>>> >> I came up with one more query that causes a problem:
>>>> [...]
>>>> >> This causes a deadlock, but one that is not detected;
>>>> >> the query just keeps hanging.
>>>> >>
>>>> >> The UPDATE in the CTE has the rows locked, so the
>>>> >> SELECT ... FOR UPDATE issued via the FDW connection will hang
>>>> >> indefinitely.
>>>> >>
>>>> >> I wonder if that's just a pathological corner case that we shouldn't
>>>> >> worry about. Loopback connections for FDWs themselves might not
>>>> >> be so rare, for example as a substitute for autonomous subtransactions.
>>>> >>
>>>> >> I guess it is not easily possible to detect such a situation or
>>>> >> to do something reasonable about it.
>>>> >
>>>> > It is not avoidable problem due to the nature of distributed database
>>>> > system,
>>>> > not only loopback scenario.
>>>> >
>>>> > In my personal opinion, I'd like to wait for someone implements
>>>> > distributed
>>>> > lock/transaction manager on top of the background worker framework being
>>>> > recently committed, to intermediate lock request.
>>>> > However, it will take massive amount of efforts to existing
>>>> > lock/transaction
>>>> > management layer, not only enhancement of FDW APIs. It is obviously out
>>>> > of scope in this patch.
>>>> >
>>>> > So, I'd like to suggest authors of FDW that support writable features to
>>>> > put
>>>> > mention about possible deadlock scenario in their documentation.
>>>> > At least, above writable CTE example is a situation that two different
>>>> > sessions
>>>> > concurrently update the "test" relation, thus, one shall be blocked.
>>>>
>>>> Fair enough.
>>>>
>>>> >> I tried to overhaul the documentation, see the attached patch.
>>>> >>
>>>> >> There was one thing that I was not certain of:
>>>> >> You say that for writable foreign tables, BeginForeignModify
>>>> >> and EndForeignModify *must* be implemented.
>>>> >> I thought that these were optional, and if you can do your work
>>>> > with just, say, ExecForeignDelete, you could do that.
>>>> >
>>>> > Yes, that's right. What I wrote was incorrect.
>>>> > If FDW driver does not require any state during modification of
>>>> > foreign tables, indeed, these are not mandatory handler.
>>>>
>>>> I have updated the documentation, that was all I changed in the
>>>> attached patches.
>>>>
>>>> > OK. I split the patch into two portion, part-1 is the APIs relevant
>>>> > patch, part-2 is relevant to postgres_fdw patch.
>>>>
>>>> Great.
>>>>
>>>> I'll mark the patch as "ready for committer".
>>>>
>>>> Yours,
>>>> Laurenz Albe
>>>>
>>>>
>>>> --
>>>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-hackers
>>>>
>>>
>>
>>
>>
>> --
>> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
>
>
>
> --
> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v11.part-2.patch	application/octet-stream	129.0 KB
pgsql-v9.3-writable-fdw-poc.v11.part-1.patch	application/octet-stream	52.5 KB

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-02-01 10:22:06
Message-ID:	CAAZKuFZNUW-tEwiLgsJYejv_aoyx+6+YcbtExigeqjPdhR8Sog@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Jan 29, 2013 at 1:19 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
> I noticed the v10 patch cannot be applied on the latest master branch
> cleanly. The attached ones were rebased.

Hello,

I'm just getting started looking at this, but notice that the second
patch relies on contrib/postgres_fdw to apply, but it's not clear to
me where to get the correct version of that. One way to solve this
would be to tell me where to get a version of that, and another that
may work well would be to relay a Git repository with postgres_fdw and
the writable fdw changes accumulated.

I poked around on git.postgresql.org and github and wasn't able to
find a recent pushed copy of this.

Anyway, I'm looking at the first patch, which applies neatly. Thanks.

--
fdr

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Daniel Farina <daniel(at)heroku(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-02-01 10:34:38
Message-ID:	CADyhKSUG=hgjHn3KKDdfE2AAKYTfxe6kT4uQeOsFknmCfoRMEg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2013/2/1 Daniel Farina <daniel(at)heroku(dot)com>:
> On Tue, Jan 29, 2013 at 1:19 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>> I noticed the v10 patch cannot be applied on the latest master branch
>> cleanly. The attached ones were rebased.
>
> Hello,
>
> I'm just getting started looking at this, but notice that the second
> patch relies on contrib/postgres_fdw to apply, but it's not clear to
> me where to get the correct version of that. One way to solve this
> would be to tell me where to get a version of that, and another that
> may work well would be to relay a Git repository with postgres_fdw and
> the writable fdw changes accumulated.
>
> I poked around on git.postgresql.org and github and wasn't able to
> find a recent pushed copy of this.
>
> Anyway, I'm looking at the first patch, which applies neatly. Thanks.
>
Thanks for your reviewing.

The second part assumes the latest (v5) postgres_fdw patch being
applied. It has been in status of ready-for-committer since last CF,
and nothing was changed.
https://commitfest.postgresql.org/action/patch_view?id=940

If I oversight nothing, it is the latest version I reviewed before.
Is there somebody who can volunteer to review his patch and commit it?

Hanada-san, if you have some updates, please share it with us.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-02-05 09:17:48
Message-ID:	CAAZKuFbSKeiTnoTA89WNbugPfbvV6wawZG+7Wm=NTEX9wA+ZoQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Feb 1, 2013 at 2:22 AM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> On Tue, Jan 29, 2013 at 1:19 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>> I noticed the v10 patch cannot be applied on the latest master branch
>> cleanly. The attached ones were rebased.
>
> Anyway, I'm looking at the first patch, which applies neatly. Thanks.

Hello,

My review is incomplete, but to the benefit of pipelining I wanted to
ask a couple of things and submit some changes for your consideration
while continuing to look at this.

So far, I've been trying to understand in very close detail how your
changes affect non-FDW related paths first, before delving into the
actual writable FDW functionality. There's not that much in this
category, but there's one thing that gave me pause: the fact that the
target list (by convention, tlist) has to be pushed down from
planmain.c/query_planner all the way to a
fdwroutine->GetForeignRelWidth callsite in plancat.c/get_relation_info
(so even in the end, it is just pushed down -- never inspected or
modified). In relative terms, it's a significant widening of
currently fairly short parameter lists in these procedures:

add_base_rels_to_query(PlannerInfo *root, List *tlist, Node *jtnode)
build_simple_rel(PlannerInfo *root, int relid, List *tlist, RelOptKind
reloptkind)
get_relation_info(PlannerInfo *root, Oid relationObjectId, bool
inhparent, List *tlist, RelOptInfo *rel)

In the case of all the other parameters except tlist, each is more
intimately related with the inner workings and mechanisms of the
procedure. I wonder if there's a nice way that can train the reader's
eye that the tlist is simply pushed down rather than actively managed
at each level. However, I can't immediately produce a way to both
achieve that that doesn't seem overwrought. Perhaps a comment will
suffice.

Related to this, I found that make_modifytable grew a dependency on
PlannerInfo *root, and it seemed like a bunch of the arguments are
functionally related to that, so the attached patch that should be
able to be applied to yours tries to utilize that as often as
possible. The weirdest thing in there is how make_modifytable has
been taught to call SS_assign_special_param, which has a side effect
on the PlannerInfo, but there exist exactly zero cases where one
*doesn't* want to do that prior to the (exactly two) call sites to
make_modifytable, so I pushed it in. The second weirdest thing is
pushing in the handling of rowMarks (e.g. SELECT FOR UPDATE et al)

Let me know as to your thoughts, otherwise I'll keep moving on...

--
fdr

Attachment	Content-Type	Size
wfdw-rely-on-plannerinfo.patch	application/octet-stream	5.1 KB

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-02-07 17:03:56
Message-ID:	CADyhKSX=-jabz5GbqxSj++uP5pYBD+vVje9WpoU0iZa3G_s-Ug@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2013/2/5 Daniel Farina <daniel(at)heroku(dot)com>:
> On Fri, Feb 1, 2013 at 2:22 AM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
>> On Tue, Jan 29, 2013 at 1:19 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>>> I noticed the v10 patch cannot be applied on the latest master branch
>>> cleanly. The attached ones were rebased.
>>
>> Anyway, I'm looking at the first patch, which applies neatly. Thanks.
>
> Hello,
>
> My review is incomplete, but to the benefit of pipelining I wanted to
> ask a couple of things and submit some changes for your consideration
> while continuing to look at this.
>
> So far, I've been trying to understand in very close detail how your
> changes affect non-FDW related paths first, before delving into the
> actual writable FDW functionality. There's not that much in this
> category, but there's one thing that gave me pause: the fact that the
> target list (by convention, tlist) has to be pushed down from
> planmain.c/query_planner all the way to a
> fdwroutine->GetForeignRelWidth callsite in plancat.c/get_relation_info
> (so even in the end, it is just pushed down -- never inspected or
> modified). In relative terms, it's a significant widening of
> currently fairly short parameter lists in these procedures:
>
> add_base_rels_to_query(PlannerInfo *root, List *tlist, Node *jtnode)
> build_simple_rel(PlannerInfo *root, int relid, List *tlist, RelOptKind
> reloptkind)
> get_relation_info(PlannerInfo *root, Oid relationObjectId, bool
> inhparent, List *tlist, RelOptInfo *rel)
>
> In the case of all the other parameters except tlist, each is more
> intimately related with the inner workings and mechanisms of the
> procedure. I wonder if there's a nice way that can train the reader's
> eye that the tlist is simply pushed down rather than actively managed
> at each level. However, I can't immediately produce a way to both
> achieve that that doesn't seem overwrought. Perhaps a comment will
> suffice.
>
Thanks for your checks.

add_base_rels_to_query() originally has two arguments; PlannerInfo
and Node, but these are wide-spreading in use. I don't think it is a good
idea to add a special field to reduce number of arguments only.
Instead, I tried to add a short comment on top of add_base_rels_to_query()
as follows.

* XXX - Also note that tlist needs to be pushed down into deeper level,
* for construction of RelOptInfo relevant to foreign-tables with pseudo-
* columns.
*/

I'm not 100% sure whether it is a good comment here, because internal
will be revised patch-by-patch. So, if future version also modifies argument
list here, this comment may talks a lie.

> Related to this, I found that make_modifytable grew a dependency on
> PlannerInfo *root, and it seemed like a bunch of the arguments are
> functionally related to that, so the attached patch that should be
> able to be applied to yours tries to utilize that as often as
> possible. The weirdest thing in there is how make_modifytable has
> been taught to call SS_assign_special_param, which has a side effect
> on the PlannerInfo, but there exist exactly zero cases where one
> *doesn't* want to do that prior to the (exactly two) call sites to
> make_modifytable, so I pushed it in. The second weirdest thing is
> pushing in the handling of rowMarks (e.g. SELECT FOR UPDATE et al)
>
Good suggestion. I added the PlannerInfo *root with spinal-reflexes.
Indeed, I'd like to agree with your optimization.

The attached patch adds Daniel's reworks on make_modifytable
invocation, and add a short comment on add_base_rels_to_query().
Rest of portion has not been changed from the previous version.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v12.part-2.patch	application/octet-stream	129.0 KB
pgsql-v9.3-writable-fdw-poc.v12.part-1.patch	application/octet-stream	54.7 KB

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-03 14:15:57
Message-ID:	51335B1D.5010000@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 02/08/2013 01:03 AM, Kohei KaiGai wrote:
> The attached patch adds Daniel's reworks on make_modifytable
> invocation, and add a short comment on add_base_rels_to_query(). Rest
> of portion has not been changed from the previous version.
How's this looking for 9.3? On-list discussion seems to have been
positive but inconclusive and time's running out. Do you think this can
be turned into a production-worthy feature in the next week or two?

A quick look at the patch shows that it includes reasonable-looking
documentation changes. I didn't see any regression test suite changes,
though; either I missed them or that's something that probably needs
addressing.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-03 15:17:26
Message-ID:	15550.1362323846@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
> On 02/08/2013 01:03 AM, Kohei KaiGai wrote:
>> The attached patch adds Daniel's reworks on make_modifytable
>> invocation, and add a short comment on add_base_rels_to_query(). Rest
>> of portion has not been changed from the previous version.

> How's this looking for 9.3? On-list discussion seems to have been
> positive but inconclusive and time's running out. Do you think this can
> be turned into a production-worthy feature in the next week or two?

I think it needs major changes. The portion against
contrib/postgres_fdw fails to apply at all, of course, but that's my
fault for having hacked so much on postgres_fdw before committing it.
More generally, I don't much like the approach to ctid-substitute
columns --- I think hacking on the rel's tupledesc like that is
guaranteed to break things all over the place. The assorted ugly
kluges that are already in the patch because of it are just scratching
the surface, and there may well be consequences that are flat out
unfixable. Probably the resjunk-columns mechanism would offer a better
solution.

I had hoped to spend several days on this and perhaps get it into
committable shape, because I think this is a pretty significant feature
that will take FDWs over the line from curiosity to useful tool.
However, I've been hoping that for nigh two weeks now and not actually
had any cycles to spend on it ...

regards, tom lane

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-04 02:51:06
Message-ID:	51340C1A.1060903@2ndquadrant.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 03/03/2013 11:17 PM, Tom Lane wrote:
> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
>> On 02/08/2013 01:03 AM, Kohei KaiGai wrote:
>>> The attached patch adds Daniel's reworks on make_modifytable
>>> invocation, and add a short comment on add_base_rels_to_query(). Rest
>>> of portion has not been changed from the previous version.
>> How's this looking for 9.3? On-list discussion seems to have been
>> positive but inconclusive and time's running out. Do you think this can
>> be turned into a production-worthy feature in the next week or two?
> I think it needs major changes. The portion against
> contrib/postgres_fdw fails to apply at all, of course, but that's my
> fault for having hacked so much on postgres_fdw before committing it.
> More generally, I don't much like the approach to ctid-substitute
> columns --- I think hacking on the rel's tupledesc like that is
> guaranteed to break things all over the place. The assorted ugly
> kluges that are already in the patch because of it are just scratching
> the surface, and there may well be consequences that are flat out
> unfixable. Probably the resjunk-columns mechanism would offer a better
> solution.
>
> I had hoped to spend several days on this and perhaps get it into
> committable shape, because I think this is a pretty significant feature
> that will take FDWs over the line from curiosity to useful tool.
> However, I've been hoping that for nigh two weeks now and not actually
> had any cycles to spend on it ...
Do you have any further brief suggestions for things that KaiGai Kohei
or others could do to make your side of this process easier and reduce
the amount of your time it'll demand?

For now it seems this stays in hopefully-can-be-made-ready limbo. I'll
keep looking through the list.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Craig Ringer <craig(at)2ndquadrant(dot)com>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-04 09:05:17
Message-ID:	CADyhKSXpTki-E629Ui51zMutUW94=mrZ+DZpTqZitZUEtrAYBw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

2013/3/3 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Craig Ringer <craig(at)2ndquadrant(dot)com> writes:
>> On 02/08/2013 01:03 AM, Kohei KaiGai wrote:
>>> The attached patch adds Daniel's reworks on make_modifytable
>>> invocation, and add a short comment on add_base_rels_to_query(). Rest
>>> of portion has not been changed from the previous version.
>
>> How's this looking for 9.3? On-list discussion seems to have been
>> positive but inconclusive and time's running out. Do you think this can
>> be turned into a production-worthy feature in the next week or two?
>
> I think it needs major changes. The portion against
> contrib/postgres_fdw fails to apply at all, of course, but that's my
> fault for having hacked so much on postgres_fdw before committing it.
> More generally, I don't much like the approach to ctid-substitute
> columns --- I think hacking on the rel's tupledesc like that is
> guaranteed to break things all over the place. The assorted ugly
> kluges that are already in the patch because of it are just scratching
> the surface, and there may well be consequences that are flat out
> unfixable. Probably the resjunk-columns mechanism would offer a better
> solution.
>
Probably, the latest patch takes an approach that utilizes resjunk-columns
that moves remote row-identifier on scan stage to modify stage, but no
hacks on tupledesc.
The new GetForeignRelWidth API allows FDW drivers to append slots
to return a few pseudo-columns to upper level scan. It can contain
a remote row-identifier of the row to be modified.
Also, rewriteTargetListUD() injects a junk target-entry to reference this
pseudo-column on update or delete from foreign tables as regular
table is doing on ctid.

Regarding to the portion towards postgres_fdw, I'm under reworking
on the part-2 of this patch to apply cleanly.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-10 18:32:53
Message-ID:	3157.1362940373@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
> [ pgsql-v9.3-writable-fdw-poc.v12.part-1/2.patch ]

Applied after rather extensive editorialization. DELETE RETURNING in
particular was a mess, and I also tried to make SELECT FOR UPDATE behave
in what seemed like a sane fashion.

There's a lot left to do here of course. One thing I was wondering
about was why we don't allow DEFAULTs to be attached to foreign-table
columns. There was no use in it before, but it seems sensible enough
now.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc:	Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-10 19:15:56
Message-ID:	9032.1362942956@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wrote:
> There's a lot left to do here of course. One thing I was wondering
> about was why we don't allow DEFAULTs to be attached to foreign-table
> columns. There was no use in it before, but it seems sensible enough
> now.

Hmm ... the buildfarm just rubbed my nose in a more immediate issue,
which is that postgres_fdw is vulnerable to problems if the remote
server is using different GUC settings than it is for things like
timezone and datestyle. The failure seen here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2013-03-10%2018%3A30%3A00
is basically just cosmetic, but it's not hard to imagine non-cosmetic
problems coming up. For instance, suppose our instance is running in
DMY datestyle and transmits an ambiguous date to a remote running in
MDY datestyle.

We could consider sending our settings to the remote at connection
establishment, but that doesn't seem terribly bulletproof --- what if
someone does a local SET later? What seems safer is to set the remote
to ISO style always, but then we have to figure out how to get the local
timestamptz_out to emit that style without touching our local GUC.
Ugh.

(One more reason why GUCs that affect application-visible semantics are
dangerous.)

regards, tom lane

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-10 20:32:22
Message-ID:	513CEDD6.2030203@dunslane.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 03/10/2013 02:32 PM, Tom Lane wrote:
> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> [ pgsql-v9.3-writable-fdw-poc.v12.part-1/2.patch ]
> Applied after rather extensive editorialization. DELETE RETURNING in
> particular was a mess, and I also tried to make SELECT FOR UPDATE behave
> in what seemed like a sane fashion.
>
> There's a lot left to do here of course. One thing I was wondering
> about was why we don't allow DEFAULTs to be attached to foreign-table
> columns. There was no use in it before, but it seems sensible enough
> now.

Excellent news. But I noticed as I went to update my non-writeable FDW
that this has happened in the regression tests. Is this correct?

***
/home/pgl/npgl/fdw/file_text_array_fdw/expected/file_textarray_fdw.out
2013-03-10 16:28:00.120340629 -0400
---
/home/pgl/npgl/fdw/file_text_array_fdw/results/file_textarray_fdw.out
2013-03-10 16:28:00.595340910 -0400
***************
*** 188,196 ****
LINE 1: DELETE FROM agg_csv_array WHERE a = 100;
^
SELECT * FROM agg_csv_array FOR UPDATE OF agg_csv_array;
! ERROR: SELECT FOR UPDATE/SHARE cannot be used with foreign table
"agg_csv_array"
! LINE 1: SELECT * FROM agg_csv_array FOR UPDATE OF agg_csv_array;
! ^
-- but this should be ignored
SELECT * FROM agg_csv_array FOR UPDATE;
t
--- 188,200 ----
LINE 1: DELETE FROM agg_csv_array WHERE a = 100;
^
SELECT * FROM agg_csv_array FOR UPDATE OF agg_csv_array;
! t
! --------------
! {100,99.097}
! {0,0.09561}
! {42,324.78}
! (3 rows)
!
-- but this should be ignored
SELECT * FROM agg_csv_array FOR UPDATE;
t

cheers

andrew

From:	Thom Brown <thom(at)linux(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-10 20:38:21
Message-ID:	CAA-aLv7c3QM_3uYQumfYMeSKaGN5j3VnpPyXS4U2J0OTZ0-PDw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 10 March 2013 18:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> [ pgsql-v9.3-writable-fdw-poc.v12.part-1/2.patch ]
>
> Applied after rather extensive editorialization. DELETE RETURNING in
> particular was a mess, and I also tried to make SELECT FOR UPDATE behave
> in what seemed like a sane fashion.
>
> There's a lot left to do here of course. One thing I was wondering
> about was why we don't allow DEFAULTs to be attached to foreign-table
> columns. There was no use in it before, but it seems sensible enough
> now.

Yes...

postgres=# INSERT INTO animals (id, animal, age) VALUES (DEFAULT,
'okapi', NULL);
ERROR: null value in column "id" violates not-null constraint
DETAIL: Failing row contains (null, okapi, null).
CONTEXT: Remote SQL command: INSERT INTO public.animals(id, animal,
age) VALUES ($1, $2, $3)

Out of curiosity, is there any way to explicitly force a foreign
DEFAULT with column-omission?

--
Thom

From:	Thom Brown <thom(at)linux(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-10 20:55:28
Message-ID:	CAA-aLv71QL3YZ=mbhJsM1SzOzkNR7eOkJ6tMiq+=oHimf9TMBQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 10 March 2013 20:38, Thom Brown <thom(at)linux(dot)com> wrote:
> On 10 March 2013 18:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>>> [ pgsql-v9.3-writable-fdw-poc.v12.part-1/2.patch ]
>>
>> Applied after rather extensive editorialization. DELETE RETURNING in
>> particular was a mess, and I also tried to make SELECT FOR UPDATE behave
>> in what seemed like a sane fashion.
>>
>> There's a lot left to do here of course. One thing I was wondering
>> about was why we don't allow DEFAULTs to be attached to foreign-table
>> columns. There was no use in it before, but it seems sensible enough
>> now.
>
> Yes...
>
> postgres=# INSERT INTO animals (id, animal, age) VALUES (DEFAULT,
> 'okapi', NULL);
> ERROR: null value in column "id" violates not-null constraint
> DETAIL: Failing row contains (null, okapi, null).
> CONTEXT: Remote SQL command: INSERT INTO public.animals(id, animal,
> age) VALUES ($1, $2, $3)
>
> Out of curiosity, is there any way to explicitly force a foreign
> DEFAULT with column-omission?

Looks like we'll also need tab-completion for UPDATE, INSERT and
DELETE statements on foreign tables.

--
Thom

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-10 21:53:00
Message-ID:	12448.1362952380@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Excellent news. But I noticed as I went to update my non-writeable FDW
> that this has happened in the regression tests. Is this correct?

Yeah, see the adjustment I made in the file_fdw test (which that looks
to be borrowed from).

The new theory is that SELECT FOR UPDATE is allowed on foreign tables,
and if the FDW doesn't do anything to implement it, it's just a no-op.

I looked into having the core code throw an error, but it was a pain
in the rear and of dubious merit anyway (since you can't really tell
for sure from outside the FDW whether the FDW did anything or whether
there's even any need to do anything for the particular data source).
Besides, the old behavior was less than consistent, since it only
complained when the FOR UPDATE directly mentioned the foreign table,
though that's not what the semantics are supposed to be.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thom Brown <thom(at)linux(dot)com>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-10 22:01:59
Message-ID:	12701.1362952919@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thom Brown <thom(at)linux(dot)com> writes:
> On 10 March 2013 18:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> There's a lot left to do here of course. One thing I was wondering
>> about was why we don't allow DEFAULTs to be attached to foreign-table
>> columns. There was no use in it before, but it seems sensible enough
>> now.

> postgres=# INSERT INTO animals (id, animal, age) VALUES (DEFAULT,
> 'okapi', NULL);
> ERROR: null value in column "id" violates not-null constraint
> DETAIL: Failing row contains (null, okapi, null).

> Out of curiosity, is there any way to explicitly force a foreign
> DEFAULT with column-omission?

That's one of the things that would have to be worked out before
we could implement anything here. The easy answer would be that DEFAULT
specifies the local default, and only if you omit the column entirely
from the local command (including not having a local default) does the
remote default take effect. But whether that would be convenient to
use is hard to tell.

Another thing that would be easy to implement is to say that the new row
value is fully determined locally (including defaults if any) and remote
defaults have nothing to do with it. But I think that's almost
certainly a usability fail --- imagine that the remote has a
sequence-generated primary key, for instance. I think it's probably
necessary to permit remote insertion of defaults for that sort of table
definition to work conveniently.

Not real sure what the ideal behavior would be or how hard it would be
to implement.

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-11 04:53:39
Message-ID:	CAAZKuFYi+9=jNLE3bTXNGTRSNHVLb0gwmYYBP0D+=wAax-Ns5Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, Mar 10, 2013 at 12:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I wrote:
>> There's a lot left to do here of course. One thing I was wondering
>> about was why we don't allow DEFAULTs to be attached to foreign-table
>> columns. There was no use in it before, but it seems sensible enough
>> now.
>
> Hmm ... the buildfarm just rubbed my nose in a more immediate issue,
> which is that postgres_fdw is vulnerable to problems if the remote
> server is using different GUC settings than it is for things like
> timezone and datestyle. The failure seen here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2013-03-10%2018%3A30%3A00
> is basically just cosmetic, but it's not hard to imagine non-cosmetic
> problems coming up. For instance, suppose our instance is running in
> DMY datestyle and transmits an ambiguous date to a remote running in
> MDY datestyle.
>
> We could consider sending our settings to the remote at connection
> establishment, but that doesn't seem terribly bulletproof --- what if
> someone does a local SET later? What seems safer is to set the remote
> to ISO style always, but then we have to figure out how to get the local
> timestamptz_out to emit that style without touching our local GUC.
> Ugh.

Forgive my naivety: why would a timestamptz value not be passed
through the _in/_out function locally at least once (hence, respecting
local GUCs) when using the FDW? Is the problem overhead in not
wanting to parse and re-format the value on the local server?

Although it seems that if GUCs affected *parsing* then the problem
comes back again, since one may choose to canonicalize output from a
remote server (e.g. via setting ISO time in all cases) but should the
user have set up GUCs to interpret input in a particular way for a
data type that is not enough.

As-is this situation is already technically true for timestamptz in
the case of time stamps without any explicit time offset or time zone,
but since timestamptz_out doesn't ever render without the offset
(right?) it's not a problem. Close shave, though, and one that an
extension author could easily find themselves on the wrong side of.

I suppose that means any non-immutable in/out function pair may have
to be carefully considered, and the list is despairingly long...but
finite:

SELECT proname
FROM pg_proc WHERE EXISTS
(SELECT 1
FROM pg_type
WHERE pg_proc.oid = pg_type.typinput OR
pg_proc.oid = pg_type.typoutput OR
pg_proc.oid = pg_type.typsend OR
pg_proc.oid = pg_type.typreceive)
AND provolatile != 'i';

> (One more reason why GUCs that affect application-visible semantics are
dangerous.)

--
fdr

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-11 13:56:57
Message-ID:	28874.1363010217@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> On Sun, Mar 10, 2013 at 12:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Hmm ... the buildfarm just rubbed my nose in a more immediate issue,
>> which is that postgres_fdw is vulnerable to problems if the remote
>> server is using different GUC settings than it is for things like
>> timezone and datestyle.

> Forgive my naivety: why would a timestamptz value not be passed
> through the _in/_out function locally at least once (hence, respecting
> local GUCs) when using the FDW? Is the problem overhead in not
> wanting to parse and re-format the value on the local server?

No, the problem is that ambiguous dates may be transferred incorrectly
to or from the remote server. Once a timestamp value is inside our
server, we are responsible for getting it to the remote end accurately,
IMO.

Here's an example using the "loopback" server that's set up by the
postgres_fdw regression test:

contrib_regression=# show datestyle; -- this is the style the "remote" session will be using
DateStyle
-----------
ISO, MDY
(1 row)

contrib_regression=# create table remote (f1 timestamptz);
CREATE TABLE
contrib_regression=# create foreign table ft (f1 timestamptz) server loopback options (table_name 'remote');
CREATE FOREIGN TABLE
contrib_regression=# set datestyle = german, dmy;
SET
contrib_regression=# select now();
now
--------------------------------
11.03.2013 09:40:17.401173 EDT
(1 row)

contrib_regression=# insert into ft values(now());
INSERT 0 1
contrib_regression=# select *, now() from ft;
f1 | now
--------------------------------+-------------------------------
03.11.2013 08:40:58.682679 EST | 11.03.2013 09:41:30.96724 EDT
(1 row)

The remote end has entirely misinterpreted the day vs month fields.
Now, to hit this you need to be using a datestyle which will print
in an ambiguous format, so the "ISO" and "Postgres" styles are
not vulnerable; but "German" and "SQL" are.

> I suppose that means any non-immutable in/out function pair may have
> to be carefully considered, and the list is despairingly long...but
> finite:

A look at pg_dump says that it only worries about setting datestyle,
intervalstyle, and extra_float_digits.

regards, tom lane

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-11 16:40:43
Message-ID:	513E090B.1080703@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> The remote end has entirely misinterpreted the day vs month fields.
> Now, to hit this you need to be using a datestyle which will print
> in an ambiguous format, so the "ISO" and "Postgres" styles are
> not vulnerable; but "German" and "SQL" are.

Is the "timezone" GUC a problem, also, for this? Seems like that could
be much worse ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-11 17:01:12
Message-ID:	20072.1363021272@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> The remote end has entirely misinterpreted the day vs month fields.
>> Now, to hit this you need to be using a datestyle which will print
>> in an ambiguous format, so the "ISO" and "Postgres" styles are
>> not vulnerable; but "German" and "SQL" are.

> Is the "timezone" GUC a problem, also, for this? Seems like that could
> be much worse ...

I'm not sure why you think being off by an hour is "much worse" than
being off by nine months ;-). But anyway, timezone seems to be mostly a
cosmetic issue, since timestamptz values will always get printed with an
explicit zone value. I think you could possibly burn yourself if a
foreign table were declared as being type timestamp when the underlying
column is really timestamptz ... but that kind of misconfiguration would
probably create a lot of misbehaviors above and beyond this one.

Having said that, I'd still be inclined to try to set the remote's
timezone GUC just so that error messages coming back from the remote
don't reflect a randomly different timezone, which was the basic issue
in the buildfarm failures we saw yesterday. OTOH, there is no guarantee
at all that the remote has the same timezone database we do, so it may
not know the zone or may think it has different DST rules than we think;
so it's not clear how far we can get with that. Maybe we should just
set the remote session's timezone to GMT always.

regards, tom lane

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-11 17:48:01
Message-ID:	513E18D1.1060908@agliodbs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

> Having said that, I'd still be inclined to try to set the remote's
> timezone GUC just so that error messages coming back from the remote
> don't reflect a randomly different timezone, which was the basic issue
> in the buildfarm failures we saw yesterday. OTOH, there is no guarantee
> at all that the remote has the same timezone database we do, so it may
> not know the zone or may think it has different DST rules than we think;
> so it's not clear how far we can get with that. Maybe we should just
> set the remote session's timezone to GMT always.

Yeah, that seems the safest choice. What are the potential drawbacks,
if any?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgreSQL(dot)org
Subject:	Column defaults for foreign tables (was Re: [v9.3] writable foreign tables)
Date:	2013-03-11 18:50:48
Message-ID:	24107.1363027848@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

I wrote:
> Thom Brown <thom(at)linux(dot)com> writes:
>> Out of curiosity, is there any way to explicitly force a foreign
>> DEFAULT with column-omission?

> That's one of the things that would have to be worked out before
> we could implement anything here. The easy answer would be that DEFAULT
> specifies the local default, and only if you omit the column entirely
> from the local command (including not having a local default) does the
> remote default take effect. But whether that would be convenient to
> use is hard to tell.

> Another thing that would be easy to implement is to say that the new row
> value is fully determined locally (including defaults if any) and remote
> defaults have nothing to do with it. But I think that's almost
> certainly a usability fail --- imagine that the remote has a
> sequence-generated primary key, for instance. I think it's probably
> necessary to permit remote insertion of defaults for that sort of table
> definition to work conveniently.

I looked into this a bit, and realize that the code-as-committed is
already not self-consistent, because these give different results:

insert into foreigntable default values;

insert into foreigntable values(default, default, ...);

The former case inserts whatever the remote-side default values are.
The latter case inserts NULLs, regardless of the remote defaults,
because that's what the query is expanded to by the rewriter. So it
seems like this is something we must fix for 9.3, because otherwise
we're going to have backwards-compatibility issues if we try to change
the behavior later.

I've concluded that the "ideal behavior" probably is that if you have
declared a DEFAULT expression for a foreign table's column, then that's
what the default is for the purpose of inserts or updates through the
foreign table; but if you haven't, then (at least for postgres_fdw)
the effective default is whatever the remote table has.

I thought at first that we could fix this, and the related case

update foreigntable set somecolumn = default

with some relatively localized hacking in the rewriter. However, that
idea fell down when I looked at multi-row inserts:

insert into foreigntable
values (x, y, z), (a, default, b), (c, d, default), ...

The current implementation of this requires substituting the appropriate
column default expressions into the VALUES lists at rewrite time.
That's okay for a default expression that is known locally and should be
evaluated locally; but I see absolutely no practical way to make it work
if we'd like to have the defaults inserted remotely. We'd need to have
some out-of-band indication that a row being returned by the ValuesScan
node had had "default" placeholders in particular columns --- and I just
can't see us doing the amount of violence that would need to be done to
the executor to make that happen. Especially not in the 9.3 timeframe.

So one possible answer is to adopt the ignore-remote-defaults semantics
I suggested above, but I don't much like that from a usability standpoint.

Another idea is to throw a "not implemented" error on the specific case
of a multi-row VALUES with DEFAULT placeholders when the target is a
foreign table. That's pretty grotty, not least because it would have to
reject the case for all foreign tables not just postgres_fdw ones. But
it would give us some wiggle room to implement more desirable semantics
in the cases that we can handle reasonably.

Thoughts?

regards, tom lane

From:	Greg Stark <stark(at)mit(dot)edu>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Thom Brown <thom(at)linux(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-11 19:00:44
Message-ID:	CAM-w4HNPHookzP4ZwdQEXcUWT_4B0+tjcmZP6QFsu93_f4ptkg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Sun, Mar 10, 2013 at 10:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Another thing that would be easy to implement is to say that the new row
> value is fully determined locally (including defaults if any) and remote
> defaults have nothing to do with it. But I think that's almost
> certainly a usability fail --- imagine that the remote has a
> sequence-generated primary key, for instance. I think it's probably
> necessary to permit remote insertion of defaults for that sort of table
> definition to work conveniently.

It feels a bit like unpredictable magic to have "DEFAULT" mean one
thing and omitted columns mean something else. Perhaps we should have
an explicit LOCAL DEFAULT and REMOTE DEFAULT and then have DEFAULT and
omitted columns both mean the same thing.

This starts getting a bit weird if you start to ask what happens when
the remote table is itself an FDW though....

--
greg

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Greg Stark <stark(at)mit(dot)edu>
Cc:	Thom Brown <thom(at)linux(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-11 19:06:54
Message-ID:	24499.1363028814@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> It feels a bit like unpredictable magic to have "DEFAULT" mean one
> thing and omitted columns mean something else.

Agreed. The current code behaves that way, but I think that's
indisputably a bug not behavior we want to keep.

> Perhaps we should have
> an explicit LOCAL DEFAULT and REMOTE DEFAULT and then have DEFAULT and
> omitted columns both mean the same thing.

I don't think we really want to introduce new syntax for this, do you?
Especially not when many FDWs won't have a notion of a remote default
at all.

My thought was that the ideal behavior is that there's only one default
for a column, with any local definition of it taking precedence over any
remote definition. But see later message about how that may be hard to
implement correctly.

regards, tom lane

From:	Thom Brown <thom(at)linux(dot)com>
To:	Greg Stark <stark(at)mit(dot)edu>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-11 19:07:14
Message-ID:	CAA-aLv5vn9YVLFRdcTC6_=PhFwBGWahewKHaOSf-iA7FPGeXbw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 11 March 2013 19:00, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Sun, Mar 10, 2013 at 10:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Another thing that would be easy to implement is to say that the new row
>> value is fully determined locally (including defaults if any) and remote
>> defaults have nothing to do with it. But I think that's almost
>> certainly a usability fail --- imagine that the remote has a
>> sequence-generated primary key, for instance. I think it's probably
>> necessary to permit remote insertion of defaults for that sort of table
>> definition to work conveniently.
>
> It feels a bit like unpredictable magic to have "DEFAULT" mean one
> thing and omitted columns mean something else. Perhaps we should have
> an explicit LOCAL DEFAULT and REMOTE DEFAULT and then have DEFAULT and
> omitted columns both mean the same thing.
>
> This starts getting a bit weird if you start to ask what happens when
> the remote table is itself an FDW though....

We could have something like:

CREATE FOREIGN TABLE ...
... OPTION (default <locality>);

where <locality> is 'local' or 'remote'. But then this means it still
can't be specified in individual queries, or have a different locality
between columns on the same foreign table.

--
Thom

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Josh Berkus <josh(at)agliodbs(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-11 19:30:01
Message-ID:	25015.1363030201@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> Having said that, I'd still be inclined to try to set the remote's
>> timezone GUC just so that error messages coming back from the remote
>> don't reflect a randomly different timezone, which was the basic issue
>> in the buildfarm failures we saw yesterday. OTOH, there is no guarantee
>> at all that the remote has the same timezone database we do, so it may
>> not know the zone or may think it has different DST rules than we think;
>> so it's not clear how far we can get with that. Maybe we should just
>> set the remote session's timezone to GMT always.

> Yeah, that seems the safest choice. What are the potential drawbacks,
> if any?

Hard to tell if there are any without testing it.

I remembered that there's a relatively inexpensive way to set GUC values
transiently within an operation, which is GUC_ACTION_SAVE; both
extension.c and ri_triggers.c are relying on that. So here's my
proposal for a fix:

* To make the remote end transmit values unambiguously, send SET
commands for the GUCs listed below during remote session setup.
(postgres_fdw is already assuming that such SETs will persist for the
whole session.)

* To make our end transmit values unambiguously, use GUC_ACTION_SAVE to
transiently change the GUCs listed below whenever we are converting
values to text form to send to the remote end. (This would include
deparsing of Const nodes as well as transmission of query parameters.)

* Judging from the precedent of pg_dump, these are the things we ought
to set this way:

DATESTYLE = ISO

INTERVALSTYLE = POSTGRES (skip on remote side, if version < 8.4)

EXTRA_FLOAT_DIGITS = 3 (or 2 on remote side, if version < 9.0)

* In addition I propose we set TIMEZONE = UTC on the remote side only.
This is, I believe, just a cosmetic hack so that timestamptz values
coming back in error messages will be printed consistently; it would let
us revert the kluge solution I put in place for this type of regression
failure:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2013-03-10%2018%3A30%3A00

BTW, it strikes me that dblink is probably subject to at least some of
these same failure modes. I'm not personally volunteering to fix any
of this in dblink, but maybe someone ought to look into that.

Thoughts?

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-11 20:52:23
Message-ID:	CAAZKuFZc-VXxX7Fk4TR5hXmQccApmK_WhN5Z9vaUsU8H2eXxWA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Mar 11, 2013 at 12:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> BTW, it strikes me that dblink is probably subject to at least some of
> these same failure modes. I'm not personally volunteering to fix any
> of this in dblink, but maybe someone ought to look into that.

I will try to make time for this, although it seems like the general
approach should match pgsql_fdw if possible. Is the current thinking
to forward the settings and then use the GUC hooks to track updates?

--
fdr

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-12 02:04:08
Message-ID:	4309.1363053848@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> I will try to make time for this, although it seems like the general
> approach should match pgsql_fdw if possible. Is the current thinking
> to forward the settings and then use the GUC hooks to track updates?

That's not what I had in mind for postgres_fdw --- rather the idea is to
avoid needing on-the-fly changes in remote-side settings, because those
are so expensive to make. However, postgres_fdw is fortunate in that
the SQL it expects to execute on the remote side is very constrained.
dblink might need a different solution that would leave room for
user-driven changes of remote-side settings.

regards, tom lane

From:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Tom Lane EXTERN" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgreSQL(dot)org>
Subject:	Re: Column defaults for foreign tables (was Re: [v9.3] writable foreign tables)
Date:	2013-03-12 09:27:02
Message-ID:	A737B7A37273E048B164557ADEF4A58B057BDB54@ntex2010a.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:
>> Thom Brown <thom(at)linux(dot)com> writes:
>>> Out of curiosity, is there any way to explicitly force a foreign
>>> DEFAULT with column-omission?

> I've concluded that the "ideal behavior" probably is that if you have
> declared a DEFAULT expression for a foreign table's column, then that's
> what the default is for the purpose of inserts or updates through the
> foreign table; but if you haven't, then (at least for postgres_fdw)
> the effective default is whatever the remote table has.

I agree.

> I thought at first that we could fix this, and the related case
>
> update foreigntable set somecolumn = default
>
> with some relatively localized hacking in the rewriter. However, that
> idea fell down when I looked at multi-row inserts:
>
> insert into foreigntable
> values (x, y, z), (a, default, b), (c, d, default), ...
>
> The current implementation of this requires substituting the appropriate
> column default expressions into the VALUES lists at rewrite time.
> That's okay for a default expression that is known locally and should be
> evaluated locally; but I see absolutely no practical way to make it work
> if we'd like to have the defaults inserted remotely. We'd need to have
> some out-of-band indication that a row being returned by the ValuesScan
> node had had "default" placeholders in particular columns --- and I just
> can't see us doing the amount of violence that would need to be done to
> the executor to make that happen. Especially not in the 9.3 timeframe.
>
> So one possible answer is to adopt the ignore-remote-defaults semantics
> I suggested above, but I don't much like that from a usability standpoint.
>
> Another idea is to throw a "not implemented" error on the specific case
> of a multi-row VALUES with DEFAULT placeholders when the target is a
> foreign table. That's pretty grotty, not least because it would have to
> reject the case for all foreign tables not just postgres_fdw ones. But
> it would give us some wiggle room to implement more desirable semantics
> in the cases that we can handle reasonably.
>
> Thoughts?

Do you think that it is possible to insert remote defaults
by omitting columns like this:

INSERT INTO foreigntable (col1, col3) VALUES (a, c);

If that can be made to work, then my opinion is that throwing an
error on

INSERT INTO foreigntable (col1, col2, col3) VALUES (a, DEFAULT, c);

would be acceptable, because there is at least a workaround.

If the first variant also cannot be made to work with remote
defaults, then I'd say that the best is to use local
defaults throughout and accept the loss of usability.

Yours,
Laurenz Albe

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-12 17:40:51
Message-ID:	CAAZKuFbneQRwRjOPoTZv2UWKKFC-uCB3+UkC47H+aA_2GNZ+QA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Mar 11, 2013 at 7:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Daniel Farina <daniel(at)heroku(dot)com> writes:
>> I will try to make time for this, although it seems like the general
>> approach should match pgsql_fdw if possible. Is the current thinking
>> to forward the settings and then use the GUC hooks to track updates?
>
> That's not what I had in mind for postgres_fdw --- rather the idea is to
> avoid needing on-the-fly changes in remote-side settings, because those
> are so expensive to make. However, postgres_fdw is fortunate in that
> the SQL it expects to execute on the remote side is very constrained.
> dblink might need a different solution that would leave room for
> user-driven changes of remote-side settings.

Okay, I see. So inverting the thinking I wrote earlier: how about
hearkening carefully to any ParameterStatus messages on the local side
before entering the inner loop of dblink.c:materializeResult as to set
the local GUC (and carefully dropping it back off after
materializeResult) so that the the _in functions can evaluate the
input in the same relevant GUC context as the remote side?

That should handle SET actions executed remotely.

Otherwise it seems like a solution would have to be ambitious enough
to encompass reifying the GUCs from the afflicted parsers, which I
surmise is not something that we want to treat right now.

--
fdr

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-12 18:51:17
Message-ID:	24082.1363114277@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> Okay, I see. So inverting the thinking I wrote earlier: how about
> hearkening carefully to any ParameterStatus messages on the local side
> before entering the inner loop of dblink.c:materializeResult as to set
> the local GUC (and carefully dropping it back off after
> materializeResult) so that the the _in functions can evaluate the
> input in the same relevant GUC context as the remote side?

Yeah, watching the remote side's datestyle and intervalstyle and
matching them (for both input and output) would probably work.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	"pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Column defaults for foreign tables (was Re: [v9.3] writable foreign tables)
Date:	2013-03-12 19:11:27
Message-ID:	24508.1363115487@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
> Do you think that it is possible to insert remote defaults
> by omitting columns like this:
> INSERT INTO foreigntable (col1, col3) VALUES (a, c);

Well, that's how it works right now, but it's not good that it's
inconsistent with the explicit-DEFAULT case.

> If that can be made to work, then my opinion is that throwing an
> error on
> INSERT INTO foreigntable (col1, col2, col3) VALUES (a, DEFAULT, c);
> would be acceptable, because there is at least a workaround.

Aside from the oddness of not supporting that when it's equivalent
to the other case, what about this:

UPDATE foreigntable SET col2 = DEFAULT WHERE ...

There is no simple workaround if we don't provide support for that.

> If the first variant also cannot be made to work with remote
> defaults, then I'd say that the best is to use local
> defaults throughout and accept the loss of usability.

Yeah, I'm drifting towards the position that we should just define the
defaults as being whatever they are locally, rather than trying to be
cute about supporting remotely-executed defaults. It looks to me like
if we try to do the latter, we're going to have pitfalls and weird
corner cases that will never be quite transparent. There's also the
argument that this'd be a lot of work that benefits only some FDWs,
since the whole concept of remote column defaults doesn't apply when
the FDW's data source isn't a traditional RDBMS.

regards, tom lane

From:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To:	"Tom Lane EXTERN" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Column defaults for foreign tables (was Re: [v9.3] writable foreign tables)
Date:	2013-03-13 08:07:25
Message-ID:	A737B7A37273E048B164557ADEF4A58B057BDE89@ntex2010a.host.magwien.gv.at
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Tom Lane wrote:
> Yeah, I'm drifting towards the position that we should just define the
> defaults as being whatever they are locally, rather than trying to be
> cute about supporting remotely-executed defaults. It looks to me like
> if we try to do the latter, we're going to have pitfalls and weird
> corner cases that will never be quite transparent. There's also the
> argument that this'd be a lot of work that benefits only some FDWs,
> since the whole concept of remote column defaults doesn't apply when
> the FDW's data source isn't a traditional RDBMS.

That was my first thought on the topic, to have a solution that
is simple (if not perfect).
Your argument that it would be unpleasant to lose the ability
to use sequence-generated remote default values made me reconsider.

But there is a workaround, namely to use a trigger before insert
to generate an automatic primary key (e.g. if the inserted value is
NULL).
Maybe it would be good to add a few hints at workarounds like that
to the documentation if it's going to be local defaults.

Yours,
Laurenz Albe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	"pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Column defaults for foreign tables (was Re: [v9.3] writable foreign tables)
Date:	2013-03-13 19:04:35
Message-ID:	2595.1363201475@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
>> Yeah, I'm drifting towards the position that we should just define the
>> defaults as being whatever they are locally, rather than trying to be
>> cute about supporting remotely-executed defaults.

> That was my first thought on the topic, to have a solution that
> is simple (if not perfect).
> Your argument that it would be unpleasant to lose the ability
> to use sequence-generated remote default values made me reconsider.

> But there is a workaround, namely to use a trigger before insert
> to generate an automatic primary key (e.g. if the inserted value is
> NULL).

Another attack is to set up a different foreign table pointing to the
same remote table, but lacking the sequence column. When you insert via
that table, you'll get the remote's default for the hidden column.
This doesn't require any weird triggers on the remote side, but it could
be hard to persuade existing apps to use the second foreign table.

regards, tom lane

From:	Thom Brown <thom(at)linux(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Column defaults for foreign tables (was Re: [v9.3] writable foreign tables)
Date:	2013-03-13 20:13:53
Message-ID:	CAA-aLv5m-rvTJTWtF66n5qKnjn8t_9D_nVBGFL4H7CS1mHHQ-w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On 13 March 2013 19:04, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
>>> Yeah, I'm drifting towards the position that we should just define the
>>> defaults as being whatever they are locally, rather than trying to be
>>> cute about supporting remotely-executed defaults.
>
>> That was my first thought on the topic, to have a solution that
>> is simple (if not perfect).
>> Your argument that it would be unpleasant to lose the ability
>> to use sequence-generated remote default values made me reconsider.
>
>> But there is a workaround, namely to use a trigger before insert
>> to generate an automatic primary key (e.g. if the inserted value is
>> NULL).
>
> Another attack is to set up a different foreign table pointing to the
> same remote table, but lacking the sequence column. When you insert via
> that table, you'll get the remote's default for the hidden column.
> This doesn't require any weird triggers on the remote side, but it could
> be hard to persuade existing apps to use the second foreign table.

How about:

CREATE FOREIGN TABLE tablename (id int DEFAULT PASSTHROUGH) SERVER pg_server;

That way it will pass DEFAULT through to the remote table as it's
defined on the table. Users can then explicitly insert values, or
select the default, which will configured to ensure the default on the
remote server is used... although I suspect I'm overlooking something
here.

--
Thom

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Thom Brown <thom(at)linux(dot)com>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Column defaults for foreign tables (was Re: [v9.3] writable foreign tables)
Date:	2013-03-13 23:59:09
Message-ID:	9848.1363219149@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Thom Brown <thom(at)linux(dot)com> writes:
> How about:

> CREATE FOREIGN TABLE tablename (id int DEFAULT PASSTHROUGH) SERVER pg_server;

> That way it will pass DEFAULT through to the remote table as it's
> defined on the table. Users can then explicitly insert values, or
> select the default, which will configured to ensure the default on the
> remote server is used... although I suspect I'm overlooking something
> here.

Yeah ... how to implement it.

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-14 21:27:04
Message-ID:	CAAZKuFYhoxaQz5BDGtzQ_NWeMfaP=FjKMR+kb2aRXNPwT3zThQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 12, 2013 at 11:51 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Daniel Farina <daniel(at)heroku(dot)com> writes:
>> Okay, I see. So inverting the thinking I wrote earlier: how about
>> hearkening carefully to any ParameterStatus messages on the local side
>> before entering the inner loop of dblink.c:materializeResult as to set
>> the local GUC (and carefully dropping it back off after
>> materializeResult) so that the the _in functions can evaluate the
>> input in the same relevant GUC context as the remote side?
>
> Yeah, watching the remote side's datestyle and intervalstyle and
> matching them (for both input and output) would probably work.

Alright, so I've been whacking at this and there's one interesting
thing to ask about: saving and restoring the local GUCs. There are a
bunch of things about GUCs besides their value that are maintained,
such as their 'source', so writing a little ad-hoc save/restore is not
going to do the right thing. Luckily, something much closer to the
right thing is done for SET LOCAL with transactions and
subtransactions, to push and pop GUC contexts:

guc.h:

extern int NewGUCNestLevel(void);
extern void AtEOXact_GUC(bool isCommit, int nestLevel);

By and large looking at the mechanics of these two functions, the
latter in particular has very little to do with the transaction
machinery in general. It's already being called from a bunch of
places that don't, to me, seem to really indicate a intrinsic
connection to transactions, e.g. do_analyze_rel, ri_triggers, and
postgres_fdw. I think in those cases the justification for settings
of 'isCommit' is informed by the mechanism more than the symbol name
or its comments. I count about ten call sites that are like this.

So, I can add one more such use of AtEOXact_GUC for the dblink fix,
but would it also be appropriate to find some new names for the
concepts (instead of AtEOXact_GUC and isCommit) here to more
accurately express what's going on?

I'll perhaps do that after finishing up the dblink fix if I receive
some positive response on the matter. However, if for some reason I
*shouldn't* use that machinery in dblink, I'd appreciate the guidance.

--
fdr

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-15 03:07:33
Message-ID:	19774.1363316853@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> On Tue, Mar 12, 2013 at 11:51 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Yeah, watching the remote side's datestyle and intervalstyle and
>> matching them (for both input and output) would probably work.

> Alright, so I've been whacking at this and there's one interesting
> thing to ask about: saving and restoring the local GUCs. There are a
> bunch of things about GUCs besides their value that are maintained,
> such as their 'source', so writing a little ad-hoc save/restore is not
> going to do the right thing.

Right, you should use NewGUCNestLevel/AtEOXact_GUC. Look at the fixes
I committed in postgres_fdw a day or two ago for an example.

> So, I can add one more such use of AtEOXact_GUC for the dblink fix,
> but would it also be appropriate to find some new names for the
> concepts (instead of AtEOXact_GUC and isCommit) here to more
> accurately express what's going on?

Meh. I guess we could invent an "EndGUCNestLevel" that's a wrapper
around AtEOXact_GUC, but I'm not that excited about it ...

regards, tom lane

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Greg Stark <stark(at)mit(dot)edu>, Thom Brown <thom(at)linux(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-15 03:20:03
Message-ID:	CA+TgmoZX0t7zhFe4V-3Q+W1U1yPj3e2RjaMhB9hbzq2R_OEKYw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Mon, Mar 11, 2013 at 3:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Greg Stark <stark(at)mit(dot)edu> writes:
>> It feels a bit like unpredictable magic to have "DEFAULT" mean one
>> thing and omitted columns mean something else.
>
> Agreed. The current code behaves that way, but I think that's
> indisputably a bug not behavior we want to keep.

I'm not entirely convinced that's a bug. Both behaviors seem useful,
and there has to be some way to specify each one.

But I just work here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Greg Stark <stark(at)mit(dot)edu>, Thom Brown <thom(at)linux(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Daniel Farina <daniel(at)heroku(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [v9.3] writable foreign tables
Date:	2013-03-15 15:00:38
Message-ID:	5336.1363359638@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Mar 11, 2013 at 3:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Greg Stark <stark(at)mit(dot)edu> writes:
>>> It feels a bit like unpredictable magic to have "DEFAULT" mean one
>>> thing and omitted columns mean something else.

>> Agreed. The current code behaves that way, but I think that's
>> indisputably a bug not behavior we want to keep.

> I'm not entirely convinced that's a bug. Both behaviors seem useful,
> and there has to be some way to specify each one.

I would love it if we had a way to provide remote-default
functionality. But per SQL spec these should produce the same results:
INSERT INTO t(f1, f2) VALUES(1, DEFAULT);
INSERT INTO t(f1) VALUES(1);
If PG fails to work like that, it's not a feature, it's a bug.
Where the default is coming from is not a justification for failing
the POLA like that.

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-19 13:16:31
Message-ID:	CAAZKuFa-VsgTNZBV=ayKgJkXutbrSqYQ51H92uftb_vo88yf0w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Thu, Mar 14, 2013 at 8:07 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Daniel Farina <daniel(at)heroku(dot)com> writes:
>> On Tue, Mar 12, 2013 at 11:51 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Yeah, watching the remote side's datestyle and intervalstyle and
>>> matching them (for both input and output) would probably work.
>
>> Alright, so I've been whacking at this and there's one interesting
>> thing to ask about: saving and restoring the local GUCs. There are a
>> bunch of things about GUCs besides their value that are maintained,
>> such as their 'source', so writing a little ad-hoc save/restore is not
>> going to do the right thing.
>
> Right, you should use NewGUCNestLevel/AtEOXact_GUC. Look at the fixes
> I committed in postgres_fdw a day or two ago for an example.

Thanks. Here's a patch. Here is the comments on top of the patch file inlined:

Similar in purpose to cc3f281ffb0a5d9b187e7a7b7de4a045809ff683, but
taking into account that a dblink caller may choose to cause arbitrary
changes to DateStyle and IntervalStyle. To handle this, it is
necessary to use PQparameterStatus before parsing any input, every
time. This is avoided in cases that do not include the two
problematic types treated -- interval and timestamptz -- by scanning
the TupleDesc's types first.

Although it has been suggested that extra_float_digits would need
similar treatment as IntervalStyle and DateStyle (as it's seen in
pg_dump), extra_float_digits is not an GUC_REPORT-ed GUC and is not
able to be checked in PQparameterStatus, and furthermore, the float4
and float8 parsers are not sensitive to the GUC anyway: both accept as
much precision as is provided in an unambiguous way.

>> So, I can add one more such use of AtEOXact_GUC for the dblink fix,
>> but would it also be appropriate to find some new names for the
>> concepts (instead of AtEOXact_GUC and isCommit) here to more
>> accurately express what's going on?
>
> Meh. I guess we could invent an "EndGUCNestLevel" that's a wrapper
> around AtEOXact_GUC, but I'm not that excited about it ...

Per your sentiment, I won't pursue that then.

Overall, the patch I think has reasonably thorough regression testing
(I tried to hit the several code paths whereby one would have to scan
TupleDescs, as well as a few other edge cases) and has some of the
obvious optimizations in place: it doesn't call PQparameterStatus more
than once per vulnerable type per TupleDesc scan, and avoids using the
parameter status procedure at all if there are no affected types.

The mechanisms may be overwrought though, since it was originally
intended to generalize to three types before I realized that
extra_float_digits is another kind of problem entirely, leaving just
IntervalStyle and DateStyle remaining, which perhaps could have been
treated even more specially to make the patch more terse.

I'll add it to the commitfest.

--
fdr

Attachment	Content-Type	Size
dblink-guc-sensitive-types-v1.patch	application/octet-stream	18.9 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-19 20:16:14
Message-ID:	11141.1363724174@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> Similar in purpose to cc3f281ffb0a5d9b187e7a7b7de4a045809ff683, but
> taking into account that a dblink caller may choose to cause arbitrary
> changes to DateStyle and IntervalStyle. To handle this, it is
> necessary to use PQparameterStatus before parsing any input, every
> time. This is avoided in cases that do not include the two
> problematic types treated -- interval and timestamptz -- by scanning
> the TupleDesc's types first.

Hm. Is that really a win? Examining the tupdesc wouldn't be free
either, and what's more, I think you're making false assumptions about
which types have behavior dependent on the GUCs. You definitely missed
out on date and plain timestamp, and what of domains over these types?

I'd be inclined to eat the cost of calling PQparameterStatus every time
(which won't be that much) and instead try to optimize by avoiding the
GUC-setting overhead if the current value matches the local setting.
But even that might be premature optimization. Did you do any
performance testing to see if there was a problem worth avoiding?

> Although it has been suggested that extra_float_digits would need
> similar treatment as IntervalStyle and DateStyle (as it's seen in
> pg_dump), extra_float_digits is not an GUC_REPORT-ed GUC and is not
> able to be checked in PQparameterStatus, and furthermore, the float4
> and float8 parsers are not sensitive to the GUC anyway: both accept as
> much precision as is provided in an unambiguous way.

Agreed, we don't need to worry so much about that one; or at least,
if we do need to worry, it's on the other end of the connection from
what we're fixing here.

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-19 21:08:33
Message-ID:	CAAZKuFbY0Y+FkFpQ8aAvRCj00zvEqpxkNCGuXruRS_=8wXD3DQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Daniel Farina <daniel(at)heroku(dot)com> writes:
>> Similar in purpose to cc3f281ffb0a5d9b187e7a7b7de4a045809ff683, but
>> taking into account that a dblink caller may choose to cause arbitrary
>> changes to DateStyle and IntervalStyle. To handle this, it is
>> necessary to use PQparameterStatus before parsing any input, every
>> time. This is avoided in cases that do not include the two
>> problematic types treated -- interval and timestamptz -- by scanning
>> the TupleDesc's types first.
>
> Hm. Is that really a win? Examining the tupdesc wouldn't be free
> either, and what's more, I think you're making false assumptions about
> which types have behavior dependent on the GUCs. You definitely missed
> out on date and plain timestamp, and what of domains over these types?

Yes, I also forgot about arrays, and nested composites in arrays or
nested composites. As soon as recursive types are introduced the scan
is not sufficient for sure: it's necessary to copy every GUC unless
one wants to recurse through the catalogs which feels like a heavy
loss.

I presumed at the time that skimming over the tupdecs to compare a few
Oids would be significantly cheaper than the guts of
PQparameterStatus, which I believe to be a linked list traversal. I
mostly wrote that optimization because it was easy coincidentally from
how I chose to structure the code rather than belief in its absolute
necessity.

And, as you said, I forgot a few types even for the simple case, which
is a bit of a relief because some of the machinery I wrote for the n =
3 case will come in useful for that.

> I'd be inclined to eat the cost of calling PQparameterStatus every time
> (which won't be that much) and instead try to optimize by avoiding the
> GUC-setting overhead if the current value matches the local setting.
> But even that might be premature optimization. Did you do any
> performance testing to see if there was a problem worth avoiding?

Nope; should I invent a new way to do that, or would it be up to
commit standard based on inspection alone? I'm okay either way, let
me know.

I'll take into account these problems and post a new version.

--
fdr

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-19 21:41:13
Message-ID:	12848.1363729273@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I'd be inclined to eat the cost of calling PQparameterStatus every time
>> (which won't be that much) and instead try to optimize by avoiding the
>> GUC-setting overhead if the current value matches the local setting.
>> But even that might be premature optimization. Did you do any
>> performance testing to see if there was a problem worth avoiding?

> Nope; should I invent a new way to do that, or would it be up to
> commit standard based on inspection alone? I'm okay either way, let
> me know.

Doesn't seem that hard to test: run a dblink query that pulls back a
bunch of data under best-case conditions (ie, local not remote server),
and time it with and without the proposed fix. If the difference
is marginal then it's not worth working hard to optimize.

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-19 22:06:29
Message-ID:	CAAZKuFbOH4uc1oiq969D0+v04+=keNOX9tyjBjzF_VfMQ8VFag@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 19, 2013 at 2:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Daniel Farina <daniel(at)heroku(dot)com> writes:
>> On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I'd be inclined to eat the cost of calling PQparameterStatus every time
>>> (which won't be that much) and instead try to optimize by avoiding the
>>> GUC-setting overhead if the current value matches the local setting.
>>> But even that might be premature optimization. Did you do any
>>> performance testing to see if there was a problem worth avoiding?
>
>> Nope; should I invent a new way to do that, or would it be up to
>> commit standard based on inspection alone? I'm okay either way, let
>> me know.
>
> Doesn't seem that hard to test: run a dblink query that pulls back a
> bunch of data under best-case conditions (ie, local not remote server),
> and time it with and without the proposed fix. If the difference
> is marginal then it's not worth working hard to optimize.

Okay, will do, and here's the shorter and less mechanically intensive
naive version that I think is the baseline: it doesn't try to optimize
out any GUC settings and sets up the GUCs before the two
materialization paths in dblink.

Something I forgot to ask about is about if an strangely-minded type
input function could whack around the GUC as records are being
remitted, which would necessitate per-tuple polling of
pqParameterStatus (e.g. in the inner loop of a materialization) .
That seemed pretty "out there," but I'm broaching it for completeness.

I'll see how much of a penalty it is vs. not applying any patch at all next.

--
fdr

Attachment	Content-Type	Size
dblink-guc-sensitive-types-v2.patch	application/octet-stream	14.0 KB

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-20 01:10:06
Message-ID:	CAAZKuFa0H8XxRF9PgKcHj=ffBZNHibcBQJLJBX67nc0ozu-wMA@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 19, 2013 at 3:06 PM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> On Tue, Mar 19, 2013 at 2:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Daniel Farina <daniel(at)heroku(dot)com> writes:
>>> On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> I'd be inclined to eat the cost of calling PQparameterStatus every time
>>>> (which won't be that much) and instead try to optimize by avoiding the
>>>> GUC-setting overhead if the current value matches the local setting.
>>>> But even that might be premature optimization. Did you do any
>>>> performance testing to see if there was a problem worth avoiding?
>>
>>> Nope; should I invent a new way to do that, or would it be up to
>>> commit standard based on inspection alone? I'm okay either way, let
>>> me know.
>>
>> Doesn't seem that hard to test: run a dblink query that pulls back a
>> bunch of data under best-case conditions (ie, local not remote server),
>> and time it with and without the proposed fix. If the difference
>> is marginal then it's not worth working hard to optimize.
>
> Okay, will do, and here's the shorter and less mechanically intensive
> naive version that I think is the baseline: it doesn't try to optimize
> out any GUC settings and sets up the GUCs before the two
> materialization paths in dblink.

The results. Summary: seems like grabbing the GUC and strcmp-ing is
worthwhile, but the amount of ping-ponging between processes adds some
noise to the timing results: utilization is far short of 100% on
either processor involved. Attached is a cumulative diff of the new
version, and also reproduced below are the changes to v2 that make up
v3.

## Benchmark

SELECT dblink_connect('benchconn','dbname=contrib_regression');

CREATE FUNCTION bench() RETURNS integer AS $$
DECLARE
iterations integer;
BEGIN
iterations := 0;

WHILE iterations < 300000 LOOP
PERFORM * FROM dblink('benchconn', 'SELECT 1') AS t(a int);
iterations := iterations + 1;
END LOOP;

RETURN iterations;
END;
$$ LANGUAGE plpgsql;

SELECT clock_timestamp() INTO TEMP beginning;
SELECT bench();
SELECT clock_timestamp() INTO TEMP ending;

SELECT 'dblink-benchmark-lines';
SELECT ending.clock_timestamp - beginning.clock_timestamp
FROM beginning, ending;

## Data Setup

CREATE TEMP TABLE bench_results (version text, span interval);

COPY bench_results FROM STDIN WITH CSV;
no-patch,@ 41.308122 secs
no-patch,@ 36.63597 secs
no-patch,@ 34.264119 secs
no-patch,@ 34.760179 secs
no-patch,@ 32.991257 secs
no-patch,@ 34.538258 secs
no-patch,@ 42.576354 secs
no-patch,@ 39.335557 secs
no-patch,@ 37.493206 secs
no-patch,@ 37.812205 secs
v2-applied,@ 36.550553 secs
v2-applied,@ 38.608723 secs
v2-applied,@ 39.415744 secs
v2-applied,@ 46.091052 secs
v2-applied,@ 45.425438 secs
v2-applied,@ 48.219506 secs
v2-applied,@ 43.514878 secs
v2-applied,@ 45.892302 secs
v2-applied,@ 48.479335 secs
v2-applied,@ 47.632041 secs
v3-strcmp,@ 32.524385 secs
v3-strcmp,@ 34.982098 secs
v3-strcmp,@ 34.487222 secs
v3-strcmp,@ 44.394681 secs
v3-strcmp,@ 44.638309 secs
v3-strcmp,@ 44.113088 secs
v3-strcmp,@ 45.497769 secs
v3-strcmp,@ 33.530176 secs
v3-strcmp,@ 32.9704 secs
v3-strcmp,@ 40.84764 secs
\.

=> SELECT version, avg(extract(epoch from span)), stddev(extract(epoch
from span))
FROM bench_results
GROUP BY version;
version | avg | stddev
------------+------------+------------------
no-patch | 37.1715227 | 3.17076487912923
v2-applied | 43.9829572 | 4.30572672565711
v3-strcmp | 38.7985768 | 5.54760393720725

## Changes to v2:

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2981,9 +2981,11 @@ initRemoteGucs(remoteGucs *rgs, PGconn *conn)
static void
applyRemoteGucs(remoteGucs *rgs)
{
- int i;
const int numGucs = sizeof parseAffectingGucs / sizeof *parseAffectingGucs;

+ int i;
+ int addedGucNesting = false;
+
/*
* Affected types require local GUC manipulations. Create a new
* GUC NestLevel to overlay the remote settings.
@@ -2992,14 +2994,27 @@ applyRemoteGucs(remoteGucs *rgs)
* structure, so expect it to come with an invalid NestLevel.
*/
Assert(rgs->localGUCNestLevel == -1);
- rgs->localGUCNestLevel = NewGUCNestLevel();

for (i = 0; i < numGucs; i += 1)
{
+ int gucApplyStatus;
+
const char *gucName = parseAffectingGucs[i];
const char *remoteVal = PQparameterStatus(rgs->conn, gucName);
+ const char *localVal = GetConfigOption(gucName, true, true);

- int gucApplyStatus;
+ /*
+ * Attempt to avoid GUC setting if the remote and local GUCs
+ * already have the same value.
+ */
+ if (strcmp(remoteVal, localVal) == 0)
+ continue;
+
+ if (!addedGucNesting)
+ {
+ rgs->localGUCNestLevel = NewGUCNestLevel();
+ addedGucNesting = true;
+ }

gucApplyStatus = set_config_option(gucName, remoteVal,
PGC_USERSET, PGC_S_SESSION,

--
fdr

Attachment	Content-Type	Size
dblink-guc-sensitive-types-v3.patch	application/octet-stream	14.3 KB

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-20 05:37:59
Message-ID:	CAAZKuFaATwkh9Z9_+RjJuroH=j-s+Jted8uSk+03JTfOFbbkOg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 19, 2013 at 6:10 PM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> On Tue, Mar 19, 2013 at 3:06 PM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
>> On Tue, Mar 19, 2013 at 2:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Daniel Farina <daniel(at)heroku(dot)com> writes:
>>>> On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>> I'd be inclined to eat the cost of calling PQparameterStatus every time
>>>>> (which won't be that much) and instead try to optimize by avoiding the
>>>>> GUC-setting overhead if the current value matches the local setting.
>>>>> But even that might be premature optimization. Did you do any
>>>>> performance testing to see if there was a problem worth avoiding?
>>>
>>>> Nope; should I invent a new way to do that, or would it be up to
>>>> commit standard based on inspection alone? I'm okay either way, let
>>>> me know.
>>>
>>> Doesn't seem that hard to test: run a dblink query that pulls back a
>>> bunch of data under best-case conditions (ie, local not remote server),
>>> and time it with and without the proposed fix. If the difference
>>> is marginal then it's not worth working hard to optimize.
>>
>> Okay, will do, and here's the shorter and less mechanically intensive
>> naive version that I think is the baseline: it doesn't try to optimize
>> out any GUC settings and sets up the GUCs before the two
>> materialization paths in dblink.
>
> The results. Summary: seems like grabbing the GUC and strcmp-ing is
> worthwhile, but the amount of ping-ponging between processes adds some
> noise to the timing results: utilization is far short of 100% on
> either processor involved. Attached is a cumulative diff of the new
> version, and also reproduced below are the changes to v2 that make up
> v3.

I added programming around various NULL returns reading GUCs in this
revision, v4.

The non-cumulative changes:

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -3005,8 +3005,22 @@ applyRemoteGucs(remoteGucs *rgs)
/*
* Attempt to avoid GUC setting if the remote and local GUCs
* already have the same value.
+ *
+ * NB: Must error if the GUC is not found.
*/
- localVal = GetConfigOption(gucName, true, true);
+ localVal = GetConfigOption(gucName, false, true);
+
+ if (remoteVal == NULL)
+ ereport(ERROR,
+ (errmsg("could not load parameter status of %s",
+ gucName)));
+
+ /*
+ * An error must have been raised by now if GUC values could
+ * not be loaded for any reason.
+ */
+ Assert(localVal != NULL);
+ Assert(remoteVal != NULL);

if (strcmp(remoteVal, localVal) == 0)
continue;

--
fdr

Attachment	Content-Type	Size
dblink-guc-sensitive-types-v4.patch	application/octet-stream	14.4 KB

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-20 06:28:26
Message-ID:	CAAZKuFYtCZCQ+ROmigfdAyMxi682ZMrHyqA9cBsQrP-DRqKSaw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Tue, Mar 19, 2013 at 10:37 PM, Daniel Farina <daniel(at)heroku(dot)com> wrote:
> I added programming around various NULL returns reading GUCs in this
> revision, v4.

Okay, one more of those fridge-logic bugs. Sorry for the noise. v5.

A missing PG_RETHROW to get the properly finally-esque semantics:

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -642,7 +642,10 @@ dblink_fetch(PG_FUNCTION_ARGS)
}
PG_CATCH();
{
+ /* Pop any set GUCs, if necessary */
restoreLocalGucs(&rgs);
+
+ PG_RE_THROW();
}
PG_END_TRY();

This was easy to add a regression test to exercise, and so I did (not
displayed here).

--
fdr

Attachment	Content-Type	Size
dblink-guc-sensitive-types-v5.patch	application/octet-stream	15.2 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-20 14:43:57
Message-ID:	2529.1363790637@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> Okay, one more of those fridge-logic bugs. Sorry for the noise. v5.

> A missing PG_RETHROW to get the properly finally-esque semantics:

> --- a/contrib/dblink/dblink.c
> +++ b/contrib/dblink/dblink.c
> @@ -642,7 +642,10 @@ dblink_fetch(PG_FUNCTION_ARGS)
> }
> PG_CATCH();
> {
> + /* Pop any set GUCs, if necessary */
> restoreLocalGucs(&rgs);
> +
> + PG_RE_THROW();
> }
> PG_END_TRY();

Um ... you shouldn't need a PG_TRY for that at all. guc.c will take
care of popping the values on transaction abort --- that's really rather
the whole point of having that mechanism.

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-20 16:52:54
Message-ID:	CAAZKuFacgYEUD9mF7TSspLwnYgWLf40fOu3QORWSYN1eu=+-6Q@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Wed, Mar 20, 2013 at 7:43 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Daniel Farina <daniel(at)heroku(dot)com> writes:
>> Okay, one more of those fridge-logic bugs. Sorry for the noise. v5.
>
>> A missing PG_RETHROW to get the properly finally-esque semantics:
>
>> --- a/contrib/dblink/dblink.c
>> +++ b/contrib/dblink/dblink.c
>> @@ -642,7 +642,10 @@ dblink_fetch(PG_FUNCTION_ARGS)
>> }
>> PG_CATCH();
>> {
>> + /* Pop any set GUCs, if necessary */
>> restoreLocalGucs(&rgs);
>> +
>> + PG_RE_THROW();
>> }
>> PG_END_TRY();
>
> Um ... you shouldn't need a PG_TRY for that at all. guc.c will take
> care of popping the values on transaction abort --- that's really rather
> the whole point of having that mechanism.

Hmm, well, merely raising the error doesn't reset the GUCs, so I was
rather thinking that this was a good idea to compose more neatly in
the case of nested exception processing, e.g.:

PG_TRY();
{
PG_TRY();
{
elog(NOTICE, "pre: %s",
GetConfigOption("DateStyle", false, true));
materializeResult(fcinfo, res);
}
PG_CATCH();
{
elog(NOTICE, "inner catch: %s",
GetConfigOption("DateStyle", false, true));
PG_RE_THROW();
}
PG_END_TRY();
}
PG_CATCH();
{
elog(NOTICE, "outer catch: %s",
GetConfigOption("DateStyle", false, true));
restoreLocalGucs(&rgs);
elog(NOTICE, "restored: %s",
GetConfigOption("DateStyle", false, true));
PG_RE_THROW();
}
PG_END_TRY();

I don't think this paranoia is made in other call sites for AtEOXact,
so included is a version that takes the same stance. This one shows up
best with the whitespace-insensitive option for git:

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -634,21 +634,8 @@ dblink_fetch(PG_FUNCTION_ARGS)
* affect parsing and then un-set them afterwards.
*/
initRemoteGucs(&rgs, conn);
-
- PG_TRY();
- {
applyRemoteGucs(&rgs);
materializeResult(fcinfo, res);
- }
- PG_CATCH();
- {
- /* Pop any set GUCs, if necessary */
- restoreLocalGucs(&rgs);
-
- PG_RE_THROW();
- }
- PG_END_TRY();
-
restoreLocalGucs(&rgs);

return (Datum) 0;
@@ -823,9 +810,6 @@ dblink_record_internal(FunctionCallInfo fcinfo,
bool is_async)
if (freeconn)
PQfinish(conn);

- /* Pop any set GUCs, if necessary */
- restoreLocalGucs(&rgs);
-
PG_RE_THROW();
}
PG_END_TRY();

--
fdr

Attachment	Content-Type	Size
dblink-guc-sensitive-types-v6.patch	application/octet-stream	14.9 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-20 17:13:30
Message-ID:	6278.1363799610@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> On Wed, Mar 20, 2013 at 7:43 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Um ... you shouldn't need a PG_TRY for that at all. guc.c will take
>> care of popping the values on transaction abort --- that's really rather
>> the whole point of having that mechanism.

> Hmm, well, merely raising the error doesn't reset the GUCs, so I was
> rather thinking that this was a good idea to compose more neatly in
> the case of nested exception processing, e.g.:

In general, we don't allow processing to resume after an error until
transaction or subtransaction abort cleanup has been done. It's true
that if you look at the GUC state in a PG_CATCH block, you'll see it
hasn't been reset yet, but that's not very relevant.

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-22 02:04:16
Message-ID:	CAAZKuFYKh-9+WkrWoBi6ZQ2ZVfZ2mcjrqEte2GUsrtEWN_=itg@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

This contains some edits to comments that referred to the obsolete and
bogus TupleDesc scanning. No mechanical alterations.

--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2961,9 +2961,8 @@ initRemoteGucs(remoteGucs *rgs, PGconn *conn)
}

/*
- * Scan a TupleDesc and, should it contain types that are sensitive to
- * GUCs, acquire remote GUCs and set them in a new GUC nesting level.
- * This is undone with restoreLocalGucs.
+ * Acquire remote GUCs that may affect type parsing and set them in a
+ * new GUC nesting level.
*/
static void
applyRemoteGucs(remoteGucs *rgs)
@@ -2974,11 +2973,8 @@ applyRemoteGucs(remoteGucs *rgs)
int addedGucNesting = false;

/*
- * Affected types require local GUC manipulations. Create a new
- * GUC NestLevel to overlay the remote settings.
- *
- * Also, this nesting is done exactly once per remoteGucInfo
- * structure, so expect it to come with an invalid NestLevel.
+ * This nesting is done exactly once per remoteGucInfo structure,
+ * so expect it to come with an invalid NestLevel.
*/
Assert(rgs->localGUCNestLevel == -1);

diff --git a/contrib/dblink/expected/dblink.out
b/contrib/dblink/expected/dblink.out
index 3946485..579664e 100644
--- a/contrib/dblink/expected/dblink.out
+++ b/contrib/dblink/expected/dblink.out
@@ -930,9 +930,8 @@ SELECT dblink_exec('myconn', 'SET datestyle =
GERMAN, DMY;');
SET
(1 row)

--- The following attempt test various paths at which TupleDescs are
--- formed and inspected for containment of types requiring local GUC
--- setting.
+-- The following attempt test various paths at which tuples are formed
+-- and inspected for containment of types requiring local GUC setting.
-- single row synchronous case
SELECT *
FROM dblink('myconn',
diff --git a/contrib/dblink/sql/dblink.sql b/contrib/dblink/sql/dblink.sql
index de925eb..7ff43fd 100644
--- a/contrib/dblink/sql/dblink.sql
+++ b/contrib/dblink/sql/dblink.sql
@@ -435,9 +435,8 @@ SET timezone = UTC;
SELECT dblink_connect('myconn','dbname=contrib_regression');
SELECT dblink_exec('myconn', 'SET datestyle = GERMAN, DMY;');

-- single row synchronous case
SELECT *

--
fdr

Attachment	Content-Type	Size
dblink-guc-sensitive-types-v7.patch	application/octet-stream	14.7 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Daniel Farina <daniel(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-22 19:29:59
Message-ID:	28625.1363980599@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Daniel Farina <daniel(at)heroku(dot)com> writes:
> This contains some edits to comments that referred to the obsolete and
> bogus TupleDesc scanning. No mechanical alterations.

Applied with some substantial revisions. I didn't like where you'd put
the apply/restore calls, for one thing --- we need to wait to do the
applies until we have the PGresult in hand, else we might be applying
stale values of the remote's GUCs. Also, adding a call that could throw
errors right before materializeResult() won't do, because that would
result in leaking the PGresult on error. The struct for state seemed a
bit of a mess too, given that you couldn't always initialize it in one
place. (In hindsight I could have left that alone given where I ended
up putting the calls, but it didn't seem to be providing any useful
isolation.)

regards, tom lane

From:	Daniel Farina <daniel(at)heroku(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables)
Date:	2013-03-22 20:37:34
Message-ID:	CAAZKuFaeZTpO3irvkzsv3Jgwfi3emgvK8qe0_gjHOxduv4k+-A@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

On Fri, Mar 22, 2013 at 12:29 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Daniel Farina <daniel(at)heroku(dot)com> writes:
>> This contains some edits to comments that referred to the obsolete and
>> bogus TupleDesc scanning. No mechanical alterations.
>
> Applied with some substantial revisions. I didn't like where you'd put
> the apply/restore calls, for one thing --- we need to wait to do the
> applies until we have the PGresult in hand, else we might be applying
> stale values of the remote's GUCs. Also, adding a call that could throw
> errors right before materializeResult() won't do, because that would
> result in leaking the PGresult on error.

Good catches.

> The struct for state seemed a
> bit of a mess too, given that you couldn't always initialize it in one
> place.

Yeah, I had to give that up when pushing things around, unless I
wanted to push more state down. It used to be neater.

> (In hindsight I could have left that alone given where I ended
> up putting the calls, but it didn't seem to be providing any useful
> isolation.)

I studied your commit.

Yeah, the idea I had was to try to avoid pushing down a loaded a value
as a PGconn into the lower level helper functions, but perhaps that
economy was false one after the modifications. Earlier versions used
to push down the RemoteGucs struct instead of a full-blown conn to
hint to the restricted purpose of that reference. By conceding to this
pushdown I think the struct could have remained, as you said, but the
difference to clarity is likely marginal. I thought I found a way to
not have to widen the parameter list at all, so I preferred that one,
but clearly it is wrong, w.r.t. leaks and the not up-to-date protocol
state.

Sorry you had to root around so much in there to get something you
liked, but thanks for going through it.

--
fdr