Re: IMPORT FOREIGN SCHEMA statement

Lists: pgsql-hackers
From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-23 20:08:06
Message-ID: 7379810.3t5YpdInZt@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

Since my last proposal didn't get any strong rebuttal, please find attached a
more complete version of the IMPORT FOREIGN SCHEMA statement.

I tried to follow the SQL-MED specification as closely as possible.

This adds discoverability to foreign servers. The structure of the statement
as I understand it is simple enough:

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
EXCEPT) table_list ] INTO local_schema.

The import_foreign_schema patch adds the infrastructure, and a new FDW
routine:

typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
ImportForeignSchemaStmt * parsetree);

This routine must return a list of CreateForeignTableStmt mirroring whatever
tables were found on the remote side, which will then be executed.

The import_foreign_schema_postgres_fdw patch proposes an implementation of
this API for postgres_fdw. It will import a foreign schema using the right
types as well as nullable information.

Regarding documentation, I don't really know where it should have been put. If
I missed something, let me know and I'll try to correct it.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
import_foreign_schema_v1.patch text/x-patch 20.4 KB
import_foreign_schema_postgres_fdw_v1.patch text/x-patch 10.9 KB

From: David Fetter <david(at)fetter(dot)org>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-25 19:41:18
Message-ID: 20140525194118.GB32355@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote:
> Hello,
>
> Since my last proposal didn't get any strong rebuttal, please find attached a
> more complete version of the IMPORT FOREIGN SCHEMA statement.

Thanks!

Please to send future patches to this thread so people can track them
in their mail.

> I tried to follow the SQL-MED specification as closely as possible.
>
> This adds discoverability to foreign servers. The structure of the statement
> as I understand it is simple enough:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> EXCEPT) table_list ] INTO local_schema.
>
> The import_foreign_schema patch adds the infrastructure, and a new FDW
> routine:
>
> typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> ImportForeignSchemaStmt * parsetree);
>
> This routine must return a list of CreateForeignTableStmt mirroring whatever
> tables were found on the remote side, which will then be executed.
>
> The import_foreign_schema_postgres_fdw patch proposes an implementation of
> this API for postgres_fdw. It will import a foreign schema using the right
> types as well as nullable information.

In the case of PostgreSQL, "the right types" are obvious until there's
a user-defined one. What do you plan to do in that case?

> Regarding documentation, I don't really know where it should have been put. If
> I missed something, let me know and I'll try to correct it.

It's not exactly something you missed, but I need to bring it up
anyway before we go too far. The standard missed two crucial concepts
when this part of it was written:

1. No single per-database-type universal type mapping can be correct.

People will have differing goals for type mapping, and writing a whole
new FDW for each of those goals is, to put it mildly, wasteful. I
will illustrate with a concrete and common example.

MySQL's datetime type encourages usages which PostgreSQL's
corresponding type, timestamptz, simply disallows, namely '0000-00-00
00:00:00' as its idea of UNKNOWN or NULL.

One way PostgreSQL's mapping could work is to map it to TEXT, which
would preserve the strings exactly and be in some sense an identity
map. It would also make the type somewhat useless in its original
intended form.

Another one would map the type is to a composite data type
mysql_datetime(tstz timestamptz, is_wacky boolean) which would
capture, for example, ('2014-04-01 00:00:00+00', false) for the UTC
start of April Fools' Day this year, and (NULL, true) for '0000-00-00
00:00:00'.

There are doubtless others, and there is no principled way to assign
any one of them as universally correct.

This brings me to the next crucial concept the standard missed:

2. The correct mapping may not be the identity, and furthermore, the
inbound and outbound mappings might in general not be mere inversions
of each other.

MySQL (no aspersions intended) again provides a concrete example with
its unsigned integer types. Yes, it's possible to create a domain
over INT8 which simulates UINT4, a domain over NUMERIC which simulates
UINT8, etc., but again this process's correctness depends on
circumstances.

To address these problems, I propose the following:

- We make type mappings settable at the level of:
- FDW
- Instance (a.k.a. cluster)
- Database
- Schema
- Table
- Column

using the existing ALTER command and some way of spelling out how
a remote type maps to a local type.

This would consist of:
- The remote type
- The local type to which it maps
- The inbound transformation (default identity)
- The outbound transformation (default identity)

At any given level, the remote type would need to be unique. To
communicate this to the system, we either invent new syntax, with
all the hazards attendant thereto, or we could use JSON or similar
serialization.

ALTER FOREIGN TABLE foo
ADD TYPE MAPPING
FROM "datetime"
TO TEXT
WITH (
INBOUND TRANSFORMATION IDENTITY,
OUTBOUND TRANSFORMATION IDENTITY
) /* Ugh!!! */

vs.

ALTER FOREIGN TABLE foo ADD (mapping '{
"datetime": "text",
"inbound": "IDENTITY",
outbound: "IDENTITY"
}')

Each FDW would have some set of default mappings and some way to
override them as above.

What say?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: David Fetter <david(at)fetter(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-25 21:23:41
Message-ID: 3057545.cu6kBAGtju@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le dimanche 25 mai 2014 12:41:18 David Fetter a écrit :
> On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote:
> > Hello,
> >
> > Since my last proposal didn't get any strong rebuttal, please find
> > attached a more complete version of the IMPORT FOREIGN SCHEMA statement.
>
> Thanks!
>
> Please to send future patches to this thread so people can track them
> in their mail.

I'll do.

I didn't for the previous one because it was a few months ago, and no patch
had been added to the commit fest.

>
> > I tried to follow the SQL-MED specification as closely as possible.
> >
> > This adds discoverability to foreign servers. The structure of the
> > statement as I understand it is simple enough:
> >
> > IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> > EXCEPT) table_list ] INTO local_schema.
> >
> > The import_foreign_schema patch adds the infrastructure, and a new FDW
> > routine:
> >
> > typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> > ImportForeignSchemaStmt * parsetree);
> >
> > This routine must return a list of CreateForeignTableStmt mirroring
> > whatever tables were found on the remote side, which will then be
> > executed.
> >
> > The import_foreign_schema_postgres_fdw patch proposes an implementation of
> > this API for postgres_fdw. It will import a foreign schema using the right
> > types as well as nullable information.
>
> In the case of PostgreSQL, "the right types" are obvious until there's
> a user-defined one. What do you plan to do in that case ?
>

The current implementation fetches the types as regtype, and when receiving a
custom type, two things can happen:

- the type is defined locally: everything will work as expected
- the type is not defined locally: the conversion function will fail, and
raise an error of the form: ERROR: type "schema.typname" does not exist

Should I add that to the regression test suite ?

> > Regarding documentation, I don't really know where it should have been
> > put. If I missed something, let me know and I'll try to correct it.
>
> It's not exactly something you missed, but I need to bring it up
> anyway before we go too far. The standard missed two crucial concepts
> when this part of it was written:
>
> 1. No single per-database-type universal type mapping can be correct.
>
> People will have differing goals for type mapping, and writing a whole
> new FDW for each of those goals is, to put it mildly, wasteful. I
> will illustrate with a concrete and common example.
>
> MySQL's datetime type encourages usages which PostgreSQL's
> corresponding type, timestamptz, simply disallows, namely '0000-00-00
> 00:00:00' as its idea of UNKNOWN or NULL.
>
> One way PostgreSQL's mapping could work is to map it to TEXT, which
> would preserve the strings exactly and be in some sense an identity
> map. It would also make the type somewhat useless in its original
> intended form.
>
> Another one would map the type is to a composite data type
> mysql_datetime(tstz timestamptz, is_wacky boolean) which would
> capture, for example, ('2014-04-01 00:00:00+00', false) for the UTC
> start of April Fools' Day this year, and (NULL, true) for '0000-00-00
> 00:00:00'.
>
> There are doubtless others, and there is no principled way to assign
> any one of them as universally correct.
>
> This brings me to the next crucial concept the standard missed:
>
> 2. The correct mapping may not be the identity, and furthermore, the
> inbound and outbound mappings might in general not be mere inversions
> of each other.
>
> MySQL (no aspersions intended) again provides a concrete example with
> its unsigned integer types. Yes, it's possible to create a domain
> over INT8 which simulates UINT4, a domain over NUMERIC which simulates
> UINT8, etc., but again this process's correctness depends on
> circumstances.
>
> To address these problems, I propose the following:
>
> - We make type mappings settable at the level of:
> - FDW
> - Instance (a.k.a. cluster)
> - Database
> - Schema
> - Table
> - Column
>
> using the existing ALTER command and some way of spelling out how
> a remote type maps to a local type.
>
> This would consist of:
> - The remote type
> - The local type to which it maps
> - The inbound transformation (default identity)
> - The outbound transformation (default identity)
>
> At any given level, the remote type would need to be unique. To
> communicate this to the system, we either invent new syntax, with
> all the hazards attendant thereto, or we could use JSON or similar
> serialization.
>
> ALTER FOREIGN TABLE foo
> ADD TYPE MAPPING
> FROM "datetime"
> TO TEXT
> WITH (
> INBOUND TRANSFORMATION IDENTITY,
> OUTBOUND TRANSFORMATION IDENTITY
> ) /* Ugh!!! */
>
> vs.
>
> ALTER FOREIGN TABLE foo ADD (mapping '{
> "datetime": "text",
> "inbound": "IDENTITY",
> outbound: "IDENTITY"
> }')
>
> Each FDW would have some set of default mappings and some way to
> override them as above.

I understand your points, but I'm not really comfortable with the concept,
unless there is something that I missed.

We can already support this use case through specific-fdw options. Should I
add that to postgres_fdw ?

Additionally, I don't really see how that would be useful in a general case.
With an "in-core" defined meaning of type transformation, any FDW that doesn't
fit exactly into the model would have a hard time. For example, what happens
if an FDW is only ever capable of returning text ? Or if a mapping can only be
set at the server or FDW model because it depends on some connection parameter
? The bulk of the code for managing type mappings would be FDW-specific
anyway. What you're proposing looks like a "universal option", with a specific
syntax, that should therefore be supported by all fdws, with well-defined
semantics.

Moreover, extending the spec seems a bit dangerous to me, since if the spec
decides to address this specific point in the future, there is a risk that our
behavior will not be compatible.

Thank you for your feedback,

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: David Fetter <david(at)fetter(dot)org>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-26 04:38:44
Message-ID: 20140526043844.GC18543@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, May 25, 2014 at 11:23:41PM +0200, Ronan Dunklau wrote:
> Le dimanche 25 mai 2014 12:41:18 David Fetter a écrit :
> > On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote:
> > > Hello,
> > >
> > > Since my last proposal didn't get any strong rebuttal, please find
> > > attached a more complete version of the IMPORT FOREIGN SCHEMA statement.
> >
> > Thanks!
> >
> > Please to send future patches to this thread so people can track them
> > in their mail.
>
> I'll do.
>
> I didn't for the previous one because it was a few months ago, and no patch
> had been added to the commit fest.

Thanks for adding this one :)

> > > The import_foreign_schema_postgres_fdw patch proposes an implementation of
> > > this API for postgres_fdw. It will import a foreign schema using the right
> > > types as well as nullable information.
> >
> > In the case of PostgreSQL, "the right types" are obvious until there's
> > a user-defined one. What do you plan to do in that case ?
>
> The current implementation fetches the types as regtype, and when receiving a
> custom type, two things can happen:
>
> - the type is defined locally: everything will work as expected
> - the type is not defined locally: the conversion function will fail, and
> raise an error of the form: ERROR: type "schema.typname" does not exist
>
> Should I add that to the regression test suite ?

Yes.

In the "easy" case of PostgreSQL, you might also be able to establish
whether the UDT in the "already defined locally" case above is
identical to the one defined remotely, but I don't think it's possible
even in principle for non-PostgreSQL remote systems, possibly not even
for ones with non-identical architecture, PostgreSQL major version,
etc.

> > > Regarding documentation, I don't really know where it should have been
> > > put. If I missed something, let me know and I'll try to correct it.
> >
> > It's not exactly something you missed, but I need to bring it up
> > anyway before we go too far. The standard missed two crucial concepts
> > when this part of it was written:
> >
> > 1. No single per-database-type universal type mapping can be correct.
> >
> > [snip]
> >
> > To address these problems, I propose the following:
> >
> > - We make type mappings settable at the level of:
> > - FDW
> > - Instance (a.k.a. cluster)
> > - Database
> > - Schema
> > - Table
> > - Column
> >
> > using the existing ALTER command and some way of spelling out how

Oops. Forgot to include CREATE in the above.

> > a remote type maps to a local type.
> >
> > This would consist of:
> > - The remote type
> > - The local type to which it maps
> > - The inbound transformation (default identity)
> > - The outbound transformation (default identity)
> >
> > At any given level, the remote type would need to be unique. To
> > communicate this to the system, we either invent new syntax, with
> > all the hazards attendant thereto, or we could use JSON or similar
> > serialization.
> >
> > ALTER FOREIGN TABLE foo
> > ADD TYPE MAPPING
> > FROM "datetime"
> > TO TEXT
> > WITH (
> > INBOUND TRANSFORMATION IDENTITY,
> > OUTBOUND TRANSFORMATION IDENTITY
> > ) /* Ugh!!! */

"Ugh!!!" means I don't think we should do it this way.

> > vs.
> >
> > ALTER FOREIGN TABLE foo ADD (mapping '{
> > "datetime": "text",
> > "inbound": "IDENTITY",
> > outbound: "IDENTITY"
> > }')
> >
> > Each FDW would have some set of default mappings and some way to
> > override them as above.
>
> I understand your points, but I'm not really comfortable with the
> concept, unless there is something that I missed.

My poor communication ability might have a lot to do with it. I
assure you my explanation would have been even worse if I had tried it
in French, though. :P

> We can already support this use case through specific-fdw options. Should I
> add that to postgres_fdw ?

I believe the capability belongs in our FDW API with the decision of
whether to implement it up to FDW authors. They know (or should know)
how to throw ERRCODE_FEATURE_NOT_SUPPORTED.

> Additionally, I don't really see how that would be useful in a general case.
> With an "in-core" defined meaning of type transformation, any FDW that doesn't
> fit exactly into the model would have a hard time. For example, what happens
> if an FDW is only ever capable of returning text ?

That's actually the case where it's most important to have the feature
all the way down to the column level.

> Or if a mapping can only be set at the server or FDW model because
> it depends on some connection parameter ?

ERRCODE_FEATURE_NOT_SUPPORTED.

> The bulk of the code for managing type mappings would be
> FDW-specific anyway.

The part that actually does the transformations would necessarily be
part of each FDW. I omitted opining on whether such transformations
should be assumed to be local or remote because I can imagine cases
where only local (or only remote) could be correct.

> What you're proposing looks like a "universal option", with a
> specific syntax, that should therefore be supported by all fdws,
> with well-defined semantics.

At the DDL level, yes.

> Moreover, extending the spec seems a bit dangerous to me, since if the spec
> decides to address this specific point in the future, there is a risk that our
> behavior will not be compatible.

That's why I suggested doing it via CREATE/ALTER with JSONB or similar
containing the details rather than inventing new SQL grammar, an
option I included only to dismiss it.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Ronan Dunklau *EXTERN*" <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-26 10:33:12
Message-ID: A737B7A37273E048B164557ADEF4A58B17CFE029@ntex2010i.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ronan Dunklau wrote:
> Since my last proposal didn't get any strong rebuttal, please find attached a
> more complete version of the IMPORT FOREIGN SCHEMA statement.
>
> I tried to follow the SQL-MED specification as closely as possible.
>
> This adds discoverability to foreign servers. The structure of the statement
> as I understand it is simple enough:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> EXCEPT) table_list ] INTO local_schema.
>
> The import_foreign_schema patch adds the infrastructure, and a new FDW
> routine:
>
> typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> ImportForeignSchemaStmt * parsetree);
>
> This routine must return a list of CreateForeignTableStmt mirroring whatever
> tables were found on the remote side, which will then be executed.

In addition to data type mapping questions (which David already raised)
I have one problem when I think of the Oracle FDW:

Oracle follows the SQL standard in folding table names to upper case.
So this would normally result in a lot of PostgreSQL foreign tables
with upper case names, which would be odd and unpleasant.

I cannot see a way out of that, but I thought I should mention it.

Yours,
Laurenz Albe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: "Ronan Dunklau *EXTERN*" <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-26 16:17:18
Message-ID: 27834.1401121038@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
> In addition to data type mapping questions (which David already raised)
> I have one problem when I think of the Oracle FDW:

> Oracle follows the SQL standard in folding table names to upper case.
> So this would normally result in a lot of PostgreSQL foreign tables
> with upper case names, which would be odd and unpleasant.

> I cannot see a way out of that, but I thought I should mention it.

It seems like it would often be desirable for the Oracle FDW to smash
all-upper-case names to all-lower-case while importing, so that no quoting
is needed on either side. I doubt though that this is so desirable that
it should happen unconditionally.

Between this and the type-mapping questions, it seems likely that
we're going to need a way for IMPORT FOREIGN SCHEMA to accept
user-supplied control options, which would in general be specific
to the FDW being used. (Another thing the SQL committee failed to
think about.)

regards, tom lane


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, david(at)fetter(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 11:49:03
Message-ID: 2329137.kgO6EUde5Z@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> Additionally, I don't really see how that would be useful in a general
case.
>> With an "in-core" defined meaning of type transformation, any FDW that
doesn't
>> fit exactly into the model would have a hard time. For example, what
happens
>> if an FDW is only ever capable of returning text ?

> That's actually the case where it's most important to have the feature
> all the way down to the column level.

I'm not sure configuration about specific columns from specific tables would
be that useful: if you already know the structure of the table, you can either
create it yourself, or run an alter column statement just after creating it.

Alternativeley, with the arbitrary options clause proposed by Tom and detailed
below, one could use the LIMIT TO / EXCEPT clauses to fine-tune what options
should apply.

> That's why I suggested doing it via CREATE/ALTER with JSONB or similar
> containing the details rather than inventing new SQL grammar, an
> option I included only to dismiss it.

Hum, adding a simple TYPE MAPPING is already inventing new SQL grammar, but
its less invasive.

> Between this and the type-mapping questions, it seems likely that
> we're going to need a way for IMPORT FOREIGN SCHEMA to accept
> user-supplied control options, which would in general be specific
> to the FDW being used. (Another thing the SQL committee failed to
> think about.)

So, without extending the syntax too much, we could add options:

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER server_name INTO local_schema
[ { LIMIT TO | EXCEPT } (table_name [, ...])]
[ OPTIONS (option_list) ]

This option list could then contain fdw-specific options, including type
mapping.

Or we could add a specific clause:

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER server_name INTO local_schema
[ { LIMIT TO | EXCEPT } (table_name [, ...])]
[ OPTIONS (option_list) ]
[ TYPE MAPPING some mapping_definition ]

With mapping_definition being either a tuple, or well-defined jsonb format
common accross FDWs.

A third solution, which I don't like but doesn't modify the SQL grammar, would
be to encode the options in the remote_schema name.

Would one of those solutions be acceptable ?

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: David Fetter <david(at)fetter(dot)org>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 13:41:06
Message-ID: 20140527134106.GG2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* David Fetter (david(at)fetter(dot)org) wrote:
> - We make type mappings settable at the level of:
> - FDW
> - Instance (a.k.a. cluster)
> - Database
> - Schema
> - Table
> - Column

While I like the general idea, you seem to be making this appear much
more complex than it actually is. For example, I see no value in a
table-level "uint4 -> int8" definition. If you want something which is
not the default, just set it on the individual columns of the foreign
table, exactly how we handle column name differences.

There might be some value in being able to redefine what the default is
at the FOREIGN SERVER level, as perhaps MySQL DB X and MySQL DB Y could
have different default mappings for some reason, but adding in the rest
of those levels doesn't add value imv.

> This would consist of:
> - The remote type
> - The local type to which it maps
> - The inbound transformation (default identity)
> - The outbound transformation (default identity)

The remote type and the local type are known already to the FDW, no?
The FDW will need to know how to do whatever conversions we're talking
about also, right? (If you want to do it in core PG instead of the FDW
then I'm thinking you should probably use a view over top of the
foreign table..). What's nice is that this all falls under what an FDW
can do *already*, if it wants (and, actually, it could do it on the
table-level technically too, if the FDW supports that, but schema?
database? these things don't make sense...).

For the IMPORT bit, it should just be doing whatever the defaults are
unless you've set some different defaults for a given foreign server
(also something which could be set using our existing system...).

> ALTER FOREIGN TABLE foo ADD (mapping '{
> "datetime": "text",
> "inbound": "IDENTITY",
> outbound: "IDENTITY"
> }')

I'm very much against having two different command languages where one
is used "when convenient". If we wanted to do this, we should build a
full-spec mapping from whatever JSON language you come up with for our
utility commands to what we actually implement in the grammar.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: David Fetter <david(at)fetter(dot)org>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 13:52:27
Message-ID: 20140527135227.GH2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* David Fetter (david(at)fetter(dot)org) wrote:
> In the "easy" case of PostgreSQL, you might also be able to establish
> whether the UDT in the "already defined locally" case above is
> identical to the one defined remotely, but I don't think it's possible
> even in principle for non-PostgreSQL remote systems, possibly not even
> for ones with non-identical architecture, PostgreSQL major version,
> etc.

For my 2c, it'd be very handy if IMPORT had an option or similar to also
copy over all (in the schema) or at least any depended-upon UDTs. It'd
really be nice if we had an ability to check the remote UDT vs. the
local one at some point also, since otherwise you're going to get bitten
at run-time when querying the foreign table.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Ronan Dunklau *EXTERN* <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 13:59:03
Message-ID: 20140527135903.GI2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
> > Oracle follows the SQL standard in folding table names to upper case.
> > So this would normally result in a lot of PostgreSQL foreign tables
> > with upper case names, which would be odd and unpleasant.
>
> > I cannot see a way out of that, but I thought I should mention it.
>
> It seems like it would often be desirable for the Oracle FDW to smash
> all-upper-case names to all-lower-case while importing, so that no quoting
> is needed on either side. I doubt though that this is so desirable that
> it should happen unconditionally.

The oracle FDW would simply need a foreign-server level option which
says "smash everything to lowercase on import".

> Between this and the type-mapping questions, it seems likely that
> we're going to need a way for IMPORT FOREIGN SCHEMA to accept
> user-supplied control options, which would in general be specific
> to the FDW being used. (Another thing the SQL committee failed to
> think about.)

I can see value in having specific mappings defined at the foreign
server level for what IMPORT does by default, but as IMPORT is intended
to be a bulk process, I don't see the value in trying to teach it how to
do a specific mapping for a specific table or column inside the schema.
You run the IMPORT for the entire schema and then go fix up any special
cases or things you wanted differently.

That said, I'm not against having a way to pass to IMPORT a similar
key/value pair set which we already support for the FDW options on
tables, columns, etc, by way of safe-guarding any potential use case for
such.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, david(at)fetter(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 14:00:31
Message-ID: 20140527140031.GJ2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Ronan Dunklau (ronan(dot)dunklau(at)dalibo(dot)com) wrote:
> So, without extending the syntax too much, we could add options:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER server_name INTO local_schema
> [ { LIMIT TO | EXCEPT } (table_name [, ...])]
> [ OPTIONS (option_list) ]
>
> This option list could then contain fdw-specific options, including type
> mapping.

Yup, that looks reasonable to me.

> Or we could add a specific clause:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER server_name INTO local_schema
> [ { LIMIT TO | EXCEPT } (table_name [, ...])]
> [ OPTIONS (option_list) ]
> [ TYPE MAPPING some mapping_definition ]

-1 on this one.

> With mapping_definition being either a tuple, or well-defined jsonb format
> common accross FDWs.
>
> A third solution, which I don't like but doesn't modify the SQL grammar, would
> be to encode the options in the remote_schema name.

Yeah, don't like that one either.

Thanks,

Stephen


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, david(at)fetter(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 14:04:39
Message-ID: 53849B77.1060303@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27/05/14 13:49, Ronan Dunklau wrote:
>> Between this and the type-mapping questions, it seems likely that
>> we're going to need a way for IMPORT FOREIGN SCHEMA to accept
>> user-supplied control options, which would in general be specific
>> to the FDW being used. (Another thing the SQL committee failed to
>> think about.)
>
> So, without extending the syntax too much, we could add options:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER server_name INTO local_schema
> [ { LIMIT TO | EXCEPT } (table_name [, ...])]
> [ OPTIONS (option_list) ]
>
> This option list could then contain fdw-specific options, including type
> mapping.

This one looks like good option to me.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Ronan Dunklau *EXTERN* <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 14:23:57
Message-ID: 20140527142357.GE7857@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> > Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
> > > Oracle follows the SQL standard in folding table names to upper case.
> > > So this would normally result in a lot of PostgreSQL foreign tables
> > > with upper case names, which would be odd and unpleasant.
> >
> > > I cannot see a way out of that, but I thought I should mention it.
> >
> > It seems like it would often be desirable for the Oracle FDW to smash
> > all-upper-case names to all-lower-case while importing, so that no quoting
> > is needed on either side. I doubt though that this is so desirable that
> > it should happen unconditionally.
>
> The oracle FDW would simply need a foreign-server level option which
> says "smash everything to lowercase on import".

That's not the same thing though -- consider what happens to the quoting
needs for names with mixed case. If you change mixed case to
all-lowercase, references to such objects using quotes in the
application code would fail because the name is now all-lowercase in
Postgres. A tri-valued enum could do the trick:
lowercase_names={wholly_uppercase_only, all, none}
with the first one being the most sensible and default.

> > Between this and the type-mapping questions, it seems likely that
> > we're going to need a way for IMPORT FOREIGN SCHEMA to accept
> > user-supplied control options, which would in general be specific
> > to the FDW being used. (Another thing the SQL committee failed to
> > think about.)
>
> I can see value in having specific mappings defined at the foreign
> server level for what IMPORT does by default, but as IMPORT is intended
> to be a bulk process, I don't see the value in trying to teach it how to
> do a specific mapping for a specific table or column inside the schema.
> You run the IMPORT for the entire schema and then go fix up any special
> cases or things you wanted differently.

Yes, it seems better to offer the ability to create transactional
scripts around IMPORT (so it must be able to run in a transaction block
-- IMPORT first, then do a bunch of ALTERs), than complicating IMPORT
itself infinitely to support hundreds of possible options.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Ronan Dunklau *EXTERN* <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 14:29:44
Message-ID: 20140527142944.GK2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Alvaro Herrera (alvherre(at)2ndquadrant(dot)com) wrote:
> Stephen Frost wrote:
> > > It seems like it would often be desirable for the Oracle FDW to smash
> > > all-upper-case names to all-lower-case while importing, so that no quoting
> > > is needed on either side. I doubt though that this is so desirable that
> > > it should happen unconditionally.
> >
> > The oracle FDW would simply need a foreign-server level option which
> > says "smash everything to lowercase on import".
>
> That's not the same thing though -- consider what happens to the quoting
> needs for names with mixed case. If you change mixed case to
> all-lowercase, references to such objects using quotes in the
> application code would fail because the name is now all-lowercase in
> Postgres. A tri-valued enum could do the trick:
> lowercase_names={wholly_uppercase_only, all, none}
> with the first one being the most sensible and default.

Sure, I was being a bit over-simplistic. As was mentioned up-thread,
the option would rather be "flip all-uppercase to lowercase and all-
lowercase to uppercase, quote any which are mixed", or something along
those lines. What I was trying to get at is that it's up to the FDW
what options it wants to support in this regard and we already have a
way for the admin to pass in useful information to the FDW by way of the
FOREIGN SERVER FDW options.

This, plus the generic ability to pass an OPTIONS clause to the IMPORT
(allowing you to have different defaults for different IMPORT
statements) and having it be transactional, as you mention, appears to
be covering all the relevant bases.

Thanks,

stephen


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Ronan Dunklau *EXTERN* <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 14:36:57
Message-ID: 9562.1401201417@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Stephen Frost <sfrost(at)snowman(dot)net> writes:
> Sure, I was being a bit over-simplistic. As was mentioned up-thread,
> the option would rather be "flip all-uppercase to lowercase and all-
> lowercase to uppercase, quote any which are mixed", or something along
> those lines. What I was trying to get at is that it's up to the FDW
> what options it wants to support in this regard and we already have a
> way for the admin to pass in useful information to the FDW by way of the
> FOREIGN SERVER FDW options.

> This, plus the generic ability to pass an OPTIONS clause to the IMPORT
> (allowing you to have different defaults for different IMPORT
> statements) and having it be transactional, as you mention, appears to
> be covering all the relevant bases.

Yeah, as far as the example of coping with differing case goes, I agree
that we'd want IMPORT to just follow whatever the FDW's default or
configured behavior is, since obviously the FDW will have to know how to
reverse the conversion when sending queries later on. So there's no
apparent need for an IMPORT-level option *for that example*. I'm not
sure if we need OPTIONS for IMPORT for any other uses.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Ronan Dunklau *EXTERN* <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 14:53:36
Message-ID: 20140527145336.GM2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Stephen Frost <sfrost(at)snowman(dot)net> writes:
> > This, plus the generic ability to pass an OPTIONS clause to the IMPORT
> > (allowing you to have different defaults for different IMPORT
> > statements) and having it be transactional, as you mention, appears to
> > be covering all the relevant bases.
>
> Yeah, as far as the example of coping with differing case goes, I agree
> that we'd want IMPORT to just follow whatever the FDW's default or
> configured behavior is, since obviously the FDW will have to know how to
> reverse the conversion when sending queries later on. So there's no
> apparent need for an IMPORT-level option *for that example*. I'm not
> sure if we need OPTIONS for IMPORT for any other uses.

I'm waffling a bit on this particular point. IMPORT is for bulk
operations, of course, but perhaps on one remote server you want to
import everything from the dimension tables using the default mapping
but everything from the fact or perhaps aggregate tables in some schema
as text. You could do that with something like- import #1, change the
server-level options, import #2, or you could do just one big import and
then go back and fix everything, but...

I'd rather introduce this options clause for IMPORT *now*, which means
we don't need to care about if this specific example is really relevant
or not, than not have it in 9.5 and have FDW authors unhappy because
it's missing (whether they need it for this use case or some other).

Thanks,

Stephen


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, David Fetter <david(at)fetter(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-27 17:14:59
Message-ID: 1992735.voKrHdJZSg@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le mardi 27 mai 2014 09:52:27 Stephen Frost a écrit :
> * David Fetter (david(at)fetter(dot)org) wrote:
> > In the "easy" case of PostgreSQL, you might also be able to establish
> > whether the UDT in the "already defined locally" case above is
> > identical to the one defined remotely, but I don't think it's possible
> > even in principle for non-PostgreSQL remote systems, possibly not even
> > for ones with non-identical architecture, PostgreSQL major version,
> > etc.
>
> For my 2c, it'd be very handy if IMPORT had an option or similar to also
> copy over all (in the schema) or at least any depended-upon UDTs. It'd
> really be nice if we had an ability to check the remote UDT vs. the
> local one at some point also, since otherwise you're going to get bitten
> at run-time when querying the foreign table.

The specification only talks about importing tables, not types, views or (why
not ?) foreign tables.

If we want to extend the spec further, we could accept more than
CreateForeignTableStmt as return values from the API:
- CreateDomainStmt
- CompositeTypeStmt
- AlterTableCmd

The core code would be responsible for executing them, and making sure the
destination schema is correctly positioned on all of those.

The postgres_fdw behaviour could then be controlled with options such as
include_types (tri-valued enum accepting "none", "all", "as_needed"),
relation_kinds (default to 'r', can support multiple kinds with a string 'rfv'
for tables, foreign tables and views).

I think we're drifting further away from the standard with that kind of stuff,
but I'd be happy to implement it if that's the path we choose.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: David Fetter <david(at)fetter(dot)org>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-28 14:28:14
Message-ID: 20140528142814.GB3585@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, May 27, 2014 at 09:41:06AM -0400, Stephen Frost wrote:
> * David Fetter (david(at)fetter(dot)org) wrote:
> > - We make type mappings settable at the level of:
> > - FDW
> > - Instance (a.k.a. cluster)
> > - Database
> > - Schema
> > - Table
> > - Column
>
> While I like the general idea, you seem to be making this appear much
> more complex than it actually is. For example, I see no value in a
> table-level "uint4 -> int8" definition.

You chose a particularly silly example, so I have to assume it's a
straw man. My point was that since we don't know what things are
relevant to preserve and transform here in the design stage, we need
to leave them settable by people who have needs we don't know about,
i.e. the users of the feature.

> If you want something which is
> not the default, just set it on the individual columns of the foreign
> table, exactly how we handle column name differences.
>
> There might be some value in being able to redefine what the default is
> at the FOREIGN SERVER level, as perhaps MySQL DB X and MySQL DB Y could
> have different default mappings for some reason, but adding in the rest
> of those levels doesn't add value imv.
>
> > This would consist of:
> > - The remote type
> > - The local type to which it maps
> > - The inbound transformation (default identity)
> > - The outbound transformation (default identity)
>
> The remote type and the local type are known already to the FDW, no?

The above aren't separable items. They all have to be there, even if
for user convenience we have two default transformations which are the
identity.

> The FDW will need to know how to do whatever conversions we're talking
> about also, right?

No. The writer of the FDW is not the same person as the user of the
FDW, and the former is not omniscient. My idea here is to allow FDW
users to supply transformation code at these spots.

> (If you want to do it in core PG instead of the FDW
> then I'm thinking you should probably use a view over top of the
> foreign table..).

We can't know in advance what to preserve and what to jettison. We
can't even know which server is responsible for doing the
transformation.

> What's nice is that this all falls under what an FDW
> can do *already*, if it wants (and, actually, it could do it on the
> table-level technically too, if the FDW supports that, but schema?
> database? these things don't make sense...).

Not to you yet, but I've seen deployed applications with exactly these
requirements.

> For the IMPORT bit, it should just be doing whatever the defaults are
> unless you've set some different defaults for a given foreign server
> (also something which could be set using our existing system...).
>
> > ALTER FOREIGN TABLE foo ADD (mapping '{
> > "datetime": "text",
> > "inbound": "IDENTITY",
> > outbound: "IDENTITY"
> > }')
>
> I'm very much against having two different command languages where one
> is used "when convenient".

I did not suggest that we use two different ways to specify this. I
did sketch out one path--adding a bunch of SQL grammar--which I made
it explicitly clear we shouldn't tread. Apparently, I wasn't quite
explicit enough.

> If we wanted to do this, we should build a full-spec mapping from
> whatever JSON language you come up with for our utility commands to
> what we actually implement in the grammar.

Excellent plan.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: David Fetter <david(at)fetter(dot)org>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-28 18:13:25
Message-ID: 20140528181325.GB2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* David Fetter (david(at)fetter(dot)org) wrote:
> On Tue, May 27, 2014 at 09:41:06AM -0400, Stephen Frost wrote:
> > * David Fetter (david(at)fetter(dot)org) wrote:
> > > - We make type mappings settable at the level of:
> > > - FDW
> > > - Instance (a.k.a. cluster)
> > > - Database
> > > - Schema
> > > - Table
> > > - Column
> >
> > While I like the general idea, you seem to be making this appear much
> > more complex than it actually is. For example, I see no value in a
> > table-level "uint4 -> int8" definition.
>
> You chose a particularly silly example, so I have to assume it's a
> straw man.

... I used your example from 20140525194118(dot)GB32355(at)fetter(dot)org and your
suggestion that we support that on a per-table basis. If that
combination is silly then let's not support it.

> My point was that since we don't know what things are
> relevant to preserve and transform here in the design stage, we need
> to leave them settable by people who have needs we don't know about,
> i.e. the users of the feature.

This is not relevant as far as I can tell. Users can already make
changes to the definitions, or create them however they want the first
time by using CREATE FOREIGN TABLE. The point of IMPORT is that it
should be for bulk operations- if you have a lot of very specific
transformations, then use CREATE instead. I don't see value in
rebuilding the flexibility of what a script of CREATEs gives you into
a single IMPORT command.

> > The FDW will need to know how to do whatever conversions we're talking
> > about also, right?
>
> No. The writer of the FDW is not the same person as the user of the
> FDW, and the former is not omniscient. My idea here is to allow FDW
> users to supply transformation code at these spots.

Then we're not talking about something which should be IMPORT's job, or
at least it goes far beyond it and that's what we're discussing here.
This sounds a bit like you're looking to rebuild what ETLs provide in a
streaming fashion- and I'm all for that as an interesting project that
isn't about adding IMPORT. I still think you could use views for this,
or ETL, if you really want to implement it using core instead of the
FDW.

> > (If you want to do it in core PG instead of the FDW
> > then I'm thinking you should probably use a view over top of the
> > foreign table..).
>
> We can't know in advance what to preserve and what to jettison. We
> can't even know which server is responsible for doing the
> transformation.

Either side could handle the transformation using views, or there could
be transformations on both sides.

> > What's nice is that this all falls under what an FDW
> > can do *already*, if it wants (and, actually, it could do it on the
> > table-level technically too, if the FDW supports that, but schema?
> > database? these things don't make sense...).
>
> Not to you yet, but I've seen deployed applications with exactly these
> requirements.

I don't view IMPORT as being part of a deployed application. It's a way
of simplifying the process of setting up FDWs, not an ETL mechanism.

> > For the IMPORT bit, it should just be doing whatever the defaults are
> > unless you've set some different defaults for a given foreign server
> > (also something which could be set using our existing system...).
> >
> > > ALTER FOREIGN TABLE foo ADD (mapping '{
> > > "datetime": "text",
> > > "inbound": "IDENTITY",
> > > outbound: "IDENTITY"
> > > }')
> >
> > I'm very much against having two different command languages where one
> > is used "when convenient".
>
> I did not suggest that we use two different ways to specify this. I
> did sketch out one path--adding a bunch of SQL grammar--which I made
> it explicitly clear we shouldn't tread. Apparently, I wasn't quite
> explicit enough.

What you missed here is that I'd find it objectionable to have some
portion of the system has a JSON-based grammar while the rest is an SQL
grammar. I'm not entirely thrilled with the jsquery approach for that
reason, but at least that's a very contained case which is about a logic
expression (and we have logic expressions already in SQL). That's not
what you're describing here.

> > If we wanted to do this, we should build a full-spec mapping from
> > whatever JSON language you come up with for our utility commands to
> > what we actually implement in the grammar.
>
> Excellent plan.

Go for it. From here, I don't see anything stopping you from making a
wrapper around PG which converts your JSON syntax into SQL (indeed,
people have already done this..). I wouldn't get your hopes up about
having it ever part of core PG though, and it certainly won't be until
it's as complete and capable as the existing SQL grammar.

Thanks,

Stephen


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-16 05:58:00
Message-ID: CAB7nPqT3d1+ANMqB39QvZ0FDfUJADiZXGMEtuXx7vJfe86NVYQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, May 26, 2014 at 6:23 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com> wrote:
> Le dimanche 25 mai 2014 12:41:18 David Fetter a écrit :
>> On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote:
>> > Hello,
>> >
>> > Since my last proposal didn't get any strong rebuttal, please find
>> > attached a more complete version of the IMPORT FOREIGN SCHEMA statement.
>>
>> Thanks!
>>
>> Please to send future patches to this thread so people can track them
>> in their mail.
>
> I'll do.
>
> I didn't for the previous one because it was a few months ago, and no patch
> had been added to the commit fest.
>
>>
>> > I tried to follow the SQL-MED specification as closely as possible.
>> >
>> > This adds discoverability to foreign servers. The structure of the
>> > statement as I understand it is simple enough:
>> >
>> > IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
>> > EXCEPT) table_list ] INTO local_schema.
>> >
>> > The import_foreign_schema patch adds the infrastructure, and a new FDW
>> > routine:
>> >
>> > typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
>> > ImportForeignSchemaStmt * parsetree);
>> >
>> > This routine must return a list of CreateForeignTableStmt mirroring
>> > whatever tables were found on the remote side, which will then be
>> > executed.
>> >
>> > The import_foreign_schema_postgres_fdw patch proposes an implementation of
>> > this API for postgres_fdw. It will import a foreign schema using the right
>> > types as well as nullable information.
>>
>> In the case of PostgreSQL, "the right types" are obvious until there's
>> a user-defined one. What do you plan to do in that case ?
>>
>
> The current implementation fetches the types as regtype, and when receiving a
> custom type, two things can happen:
>
> - the type is defined locally: everything will work as expected
> - the type is not defined locally: the conversion function will fail, and
> raise an error of the form: ERROR: type "schema.typname" does not exist

Just wondering: what about the case where the same data type is
defined on both local and remote, but with *different* definitions? Is
it the responsibility of the fdw to check for type incompatibilities?
--
Michael


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-16 06:02:38
Message-ID: CAOeZVic02JNKRW+suBadC70nb+4E929BXtuwL-nhA9f0uoQfFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 16, 2014 at 11:28 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com
> wrote:

> On Mon, May 26, 2014 at 6:23 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
> wrote:
> > Le dimanche 25 mai 2014 12:41:18 David Fetter a écrit :
> >> On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote:
> >> > Hello,
> >> >
> >> > Since my last proposal didn't get any strong rebuttal, please find
> >> > attached a more complete version of the IMPORT FOREIGN SCHEMA
> statement.
> >>
> >> Thanks!
> >>
> >> Please to send future patches to this thread so people can track them
> >> in their mail.
> >
> > I'll do.
> >
> > I didn't for the previous one because it was a few months ago, and no
> patch
> > had been added to the commit fest.
> >
> >>
> >> > I tried to follow the SQL-MED specification as closely as possible.
> >> >
> >> > This adds discoverability to foreign servers. The structure of the
> >> > statement as I understand it is simple enough:
> >> >
> >> > IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT
> TO |
> >> > EXCEPT) table_list ] INTO local_schema.
> >> >
> >> > The import_foreign_schema patch adds the infrastructure, and a new FDW
> >> > routine:
> >> >
> >> > typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> >> > ImportForeignSchemaStmt * parsetree);
> >> >
> >> > This routine must return a list of CreateForeignTableStmt mirroring
> >> > whatever tables were found on the remote side, which will then be
> >> > executed.
> >> >
> >> > The import_foreign_schema_postgres_fdw patch proposes an
> implementation of
> >> > this API for postgres_fdw. It will import a foreign schema using the
> right
> >> > types as well as nullable information.
> >>
> >> In the case of PostgreSQL, "the right types" are obvious until there's
> >> a user-defined one. What do you plan to do in that case ?
> >>
> >
> > The current implementation fetches the types as regtype, and when
> receiving a
> > custom type, two things can happen:
> >
> > - the type is defined locally: everything will work as expected
> > - the type is not defined locally: the conversion function will fail,
> and
> > raise an error of the form: ERROR: type "schema.typname" does not exist
>
> Just wondering: what about the case where the same data type is
> defined on both local and remote, but with *different* definitions? Is
> it the responsibility of the fdw to check for type incompatibilities?
>

Logically, should be.

Just wondering, cant we extend the above proposed function typedef List
*(*ImportForeignSchema_function) (ForeignServer *server,
ImportForeignSchemaStmt * parsetree); be changed a bit to give exact type
definitions from the remote side as well?

Regards,

Atri

Regards,

Atri
*l'apprenant*


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-16 06:41:21
Message-ID: 1888790.XN3b8S2hov@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le lundi 16 juin 2014 11:32:38 Atri Sharma a écrit :
> On Mon, Jun 16, 2014 at 11:28 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com
> > wrote:
> > Just wondering: what about the case where the same data type is
> > defined on both local and remote, but with *different* definitions? Is
> > it the responsibility of the fdw to check for type incompatibilities?
>
> Logically, should be.
>

This is akin to what Stephen proposed, to allow IMPORT FOREIGN SCHEMA to also
import types.

The problem with checking if the type is the same is deciding where to stop.
For composite types, sure it should be easy. But what about built-in types ?
Or types provided by an extension / a native library ? These could theorically
change from one release to another.

> Just wondering, cant we extend the above proposed function typedef List
> *(*ImportForeignSchema_function) (ForeignServer *server,
> ImportForeignSchemaStmt * parsetree); be changed a bit to give exact type
> definitions from the remote side as well?

I toyed with this idea, but the more I think about it the less I'm sure what
the API should look like, should we ever decide to go beyond the standard and
import more than tables. Should the proposed function return value be changed
to void, letting the FDW execute any DDL statement ? The advantage of
returning a list of statement was to make it clear that tables should be
imported, and letting the core enforce "INTO local_schema" part of the clause.

I would prefer if the API is limited by design to import tables. This
limitation can always be bypassed by executing arbitrary statements before
returning the list of ImportForeignSchemaStmt*.

For the postgres_fdw specific case, we could add two IMPORT options (since it
looked like we had a consensus on this):

- import_types
- check_types

Import types would import simple, composite types, issuing the corresponding
statements before returning to core.

Check types would compare the local and remote definition for composite types.
For types installed by an extension, it would check that the local type has
also been created by an extension of the same name, installed in the same
schema, raising a warning if the local and remote version differ.
For built-in types, a warning would be raised if the local and remote versions
of PostgreSQL differ.

However, I don't know what we should be doing about types located in a
different schema. For example, if the remote table s1.t1 has a column of
composite type s2.typ1, should we import typ1 in s1 ? In S2, optionnaly
creating the non-existing schema ? Raise an error ?

Regards,

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-16 07:07:51
Message-ID: CAB7nPqTeD08QYeb8XmqRSaA8rk-9aBq47mti70a2RfY-V+NogA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, May 24, 2014 at 5:08 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com> wrote:
> Hello,
>
> Since my last proposal didn't get any strong rebuttal, please find attached a
> more complete version of the IMPORT FOREIGN SCHEMA statement.
>
> I tried to follow the SQL-MED specification as closely as possible.
>
> This adds discoverability to foreign servers. The structure of the statement
> as I understand it is simple enough:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> EXCEPT) table_list ] INTO local_schema.
>
> The import_foreign_schema patch adds the infrastructure, and a new FDW
> routine:
>
> typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> ImportForeignSchemaStmt * parsetree);
>
> This routine must return a list of CreateForeignTableStmt mirroring whatever
> tables were found on the remote side, which will then be executed.
>
> The import_foreign_schema_postgres_fdw patch proposes an implementation of
> this API for postgres_fdw. It will import a foreign schema using the right
> types as well as nullable information.
>
> Regarding documentation, I don't really know where it should have been put. If
> I missed something, let me know and I'll try to correct it.

I have been playing a bit with this patch... And got a couple of comments.

1) Perhaps somebody could guide me to a better spec but by looking at
the spec here => http://farrago.sourceforge.net/design/sqlmed.html, it
seems that the FROM SERVER clause is misplaced, it should be placed
after the table list.
2) The query I am seeing on this spec offers the possiblitily to query
TABLE_NAME LIKE 'pattern'. Is this part of the spec as well? If yes,
are you planning to add it as well. I imagine that this could be quite
handy for users importing from a foreign schema tables that have the
same prefix name for example. ImportForeignSchemaRestrictionType could
be extended with an IMPORT_LIKE type. Extending a bit the spec...
There could be a list of LIKE patterns, this matches more your code by
adding all of them in table_names. I am not voting to add TABLE_NAME
in the list of reserved keywords though, so something like "TABLE LIKE
'pattern'" would be nice.
3) This has been already raised in this thread, but something missing
with this patch is the possiblity to pass options when executing an
import. This could allow a FDW to do more fine-grained operations on
the underlying objects of the table. There would be two level of
options: table level and schema level. For example, with the current
feature it is not possible to rename imported tables on-the-fly. I
imagine that this would be useful.
4) Something not mandatory with the core patch but always nice to
have: tab completion for this command in psql.
5) The error message returned to user when import fails because of a
missing type is rather unfriendly. Let's imagine with two nodes and
postgres_fdw... Node 1:
create type toto as (a int, b int);
create table toto_tab (a toto);
Node 2:
=# import foreign schema public from server postgres_server into public;
ERROR: 42704: type "public.toto" does not exist
LOCATION: parseTypeString, parse_type.c:797
I would rather imagine something like "IMPORT failed because of
missing type defined on remote but not locally".
6) Import does not fail if a table specified in LIMIT TO is not
defined on remote-side:
=# import foreign schema public from server postgres_server limit to
(tab_not_there) into public;
IMPORT FOREIGN SCHEMA
=# \d
No relations found.
7) A small thing: code comments do not respect the project format:
http://www.postgresql.org/docs/9.1/interactive/source-format.html
8) In fetch_remote_tables(at)postgres_fdw(dot)c, you are missing materialized views:
WHERE relkind IN ('r', 'v', 'f')
9) The changes in pg_type.h adding new OID defines should be a
separate patch IMO
10) Code has many whitespaces.
11) Sometimes the import goes strangely:
11-1) After some testing I noticed that tables with incorrect names
were imported when using LIMIT TO. For example on remote a table
called "ab" is present, IMPORT SCHEMA LIMIT TO 'other_ab' created a
local entry called "other_ab" while it should create nothing.
11-2) Tables with empty names can be created locally. On foreign server:
=# \d
List of relations
Schema | Name | Type | Owner
--------+----------+-------+--------
public | aa | table | ioltas
public | toto_tab | table | ioltas
(2 rows)
On local node:
=# import foreign schema public from server postgres_server limit to
(toto_tab, "NULL", aa) into public;
-- (Forget about NULL here, I could reproduce it without as well)
IMPORT FOREIGN SCHEMA
Time: 13.589 ms
=# \d
List of relations
Schema | Name | Type | Owner
--------+----------+---------------+--------
public | | foreign table | ioltas
public | toto_tab | foreign table | ioltas
(2 rows)
Some more testing showed me that as well:
=# \d

List
of relations
Schema |
Name

| Type |
Owner
--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------+--------
public |

| foreign table |
ioltas
public | toto_tab

| foreign table |
ioltas
public | \x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F
| foreign table | ioltas

That's all I have for now.
Regards,
--
Michael


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-16 19:13:43
Message-ID: 20140616191343.GI29243@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Ronan Dunklau (ronan(dot)dunklau(at)dalibo(dot)com) wrote:
> The problem with checking if the type is the same is deciding where to stop.
> For composite types, sure it should be easy. But what about built-in types ?

Haven't we already decided to trust these based on server version
information? Look at what quals we find acceptable to push down to the
remote side... Certainly, composites built off of built-in types should
work.

> Or types provided by an extension / a native library ? These could theorically
> change from one release to another.

This is definitely an issue which we really need to figure out how to
address. I don't have any great ideas off-hand, but it feels like we'll
need a catalog and appropriate SQL-fu to allow users and extensions to
add other types and functions which can be pushed down.. For example,
it'd be great if PostGIS queries could be pushed down to the remote
server- and have that set up by the postgis extension on installation
(perhaps via a call-back hook which gives the extension a handle to the
remote server where it could interrogate the server and then a way to
store the information about what is allowed to be pushed to the remote
side, and how?).

That's certainly a rather complicated bit to address and I don't think
we should hold this up for that- but let's definitely be thinking about
how to add these things later and try to avoid putting anything in place
which would cause problems for us later...

Will try to come back to the rest of your questions later.. :)

Thanks,

Stephen


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-16 19:36:05
Message-ID: 1671910.CnuF1cYn0o@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le lundi 16 juin 2014 16:07:51 Michael Paquier a écrit :
> On Sat, May 24, 2014 at 5:08 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
wrote:
> > Hello,
> >
> > Since my last proposal didn't get any strong rebuttal, please find
> > attached a more complete version of the IMPORT FOREIGN SCHEMA statement.
> >
> > I tried to follow the SQL-MED specification as closely as possible.
> >
> > This adds discoverability to foreign servers. The structure of the
> > statement as I understand it is simple enough:
> >
> > IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> > EXCEPT) table_list ] INTO local_schema.
> >
> > The import_foreign_schema patch adds the infrastructure, and a new FDW
> > routine:
> >
> > typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> > ImportForeignSchemaStmt * parsetree);
> >
> > This routine must return a list of CreateForeignTableStmt mirroring
> > whatever tables were found on the remote side, which will then be
> > executed.
> >
> > The import_foreign_schema_postgres_fdw patch proposes an implementation of
> > this API for postgres_fdw. It will import a foreign schema using the right
> > types as well as nullable information.
> >
> > Regarding documentation, I don't really know where it should have been
> > put. If I missed something, let me know and I'll try to correct it.
>
> I have been playing a bit with this patch... And got a couple of comments.
>
> 1) Perhaps somebody could guide me to a better spec but by looking at
> the spec here => http://farrago.sourceforge.net/design/sqlmed.html, it
> seems that the FROM SERVER clause is misplaced, it should be placed
> after the table list.

You're right, I misplaced it. Corrected in the attached patch.
The spec I'm using is the SQL:201X linked from this page of the wiki:
http://wiki.postgresql.org/wiki/Developer_FAQ#Where_can_I_get_a_copy_of_the_SQL_standards.3F

> 2) The query I am seeing on this spec offers the possiblitily to query
> TABLE_NAME LIKE 'pattern'. Is this part of the spec as well? If yes,
> are you planning to add it as well. I imagine that this could be quite
> handy for users importing from a foreign schema tables that have the
> same prefix name for example. ImportForeignSchemaRestrictionType could
> be extended with an IMPORT_LIKE type. Extending a bit the spec...
> There could be a list of LIKE patterns, this matches more your code by
> adding all of them in table_names. I am not voting to add TABLE_NAME
> in the list of reserved keywords though, so something like "TABLE LIKE
> 'pattern'" would be nice.

From the spec I have been reading, the import qualification is defined as :

<table name list> ::= <table name> [ { <comma> <table name> }... ]

where table name is defined as a simple table name, without any qualifier.

It seems to me the example you linked extended the standard with this
particular implementation. I don't think it would be a problem to add such a
syntax later. In the meantime, FDW implementors can abuse the specification by
implementing a DSL in the table name, for example: LIMIT TO ("LIKE
'pattern*'");

> 3) This has been already raised in this thread, but something missing
> with this patch is the possiblity to pass options when executing an
> import. This could allow a FDW to do more fine-grained operations on
> the underlying objects of the table. There would be two level of
> options: table level and schema level. For example, with the current
> feature it is not possible to rename imported tables on-the-fly. I
> imagine that this would be useful.

The attached patch implements options. Now, we should think about what options
may be desirable for postgres_fdw :)

I don't understand your point about table-level and schema-level options,
though. There is no concept of "foreign schema" allowing to set options on it,
AFAIK.

Renaming imported tables on the fly should, IMO, be done after the import
statement. Since it is transactional, nothing prevents you from running the
ALTER TABLE statements after that.

> 4) Something not mandatory with the core patch but always nice to
> have: tab completion for this command in psql.

I will look into it.

> 5) The error message returned to user when import fails because of a
> missing type is rather unfriendly. Let's imagine with two nodes and
> postgres_fdw... Node 1:
> create type toto as (a int, b int);
> create table toto_tab (a toto);
> Node 2:
> =# import foreign schema public from server postgres_server into public;
> ERROR: 42704: type "public.toto" does not exist
> LOCATION: parseTypeString, parse_type.c:797
> I would rather imagine something like "IMPORT failed because of
> missing type defined on remote but not locally".

I simply prepended the suggested error message to the previous one:

# IMPORT FOREIGN SCHEMA import_source FROM SERVER remote1 INTO
import_destination;
ERROR: IMPORT of table t1 failed because of missing type defined on remote but
not locally: type "public.typ1" does not exist
Time: 11,305 ms

> 6) Import does not fail if a table specified in LIMIT TO is not
> defined on remote-side:
> =# import foreign schema public from server postgres_server limit to
> (tab_not_there) into public;
> IMPORT FOREIGN SCHEMA
> =# \d
> No relations found.

Thanks, this is corrected in the attached patch. The error detection code is
O(n²), but I assumed LIMIT TO clauses should be small enough for it not to be
a problem.

> 7) A small thing: code comments do not respect the project format:
> http://www.postgresql.org/docs/9.1/interactive/source-format.html

This should be fixed too, sorry for the inconvenience.

> 8) In fetch_remote_tables(at)postgres_fdw(dot)c, you are missing materialized
> views: WHERE relkind IN ('r', 'v', 'f')

Fixed in the attached patch.

> 9) The changes in pg_type.h adding new OID defines should be a
> separate patch IMO

It has been broken into its separate patch.

> 10) Code has many whitespaces.

I hope this is corrected in the attached patch, if not, could you please point
me at specific examples ?

> 11) Sometimes the import goes strangely:
> 11-1) After some testing I noticed that tables with incorrect names
> were imported when using LIMIT TO. For example on remote a table
> called "ab" is present, IMPORT SCHEMA LIMIT TO 'other_ab' created a
> local entry called "other_ab" while it should create nothing
> 11-2) Tables with empty names can be created locally. On foreign server:
> =# \d
> List of relations
> Schema | Name | Type | Owner
> --------+----------+-------+--------
> public | aa | table | ioltas
> public | toto_tab | table | ioltas
> (2 rows)
> On local node:
> =# import foreign schema public from server postgres_server limit to
> (toto_tab, "NULL", aa) into public;
> -- (Forget about NULL here, I could reproduce it without as well)
> IMPORT FOREIGN SCHEMA
> Time: 13.589 ms
> =# \d
> List of relations
> Schema | Name | Type | Owner
> --------+----------+---------------+--------
> public | | foreign table | ioltas
> public | toto_tab | foreign table | ioltas
> (2 rows)
> Some more testing showed me that as well:
> =# \d
>
> List
> of relations
> Schema |
> Name
>
> | Type |
>
> Owner
> --------+-------------------------------------------------------------------
> ----------------------------------------------------------------------------
> ----------------------------------------------------------------------------
> -----------------------------------+---------------+-------- public |
>
> | foreign table |
>
> ioltas
> public | toto_tab
>
> | foreign table |
>
> ioltas
> public |F\x7F\x7
> F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7
> F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7
> F\x7F\x7F\x7F\x7F\x7F\x7F
> | foreign table | ioltas
> \x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7

This seems to be related to re-using the table-name between invocations. The
attached patch should fix point 2. As for point 1, I don't know the cause for
it. Do you have a reproducible test case ?

> That's all I have for now.

Thank you very much for this review.

> Regards,

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
import_foreign_schema_definearray_types_v2.patch text/x-patch 1.4 KB
import_foreign_schema_postgres_fdw_v2.patch text/x-patch 14.1 KB
import_foreign_schema_v2.patch text/x-patch 19.4 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-17 08:06:55
Message-ID: CAB7nPqSKMOP75HbPEddnypLd5EWz85PfQSBPHY6AaP2O0Xbrew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 17, 2014 at 4:36 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com> wrote:
> Le lundi 16 juin 2014 16:07:51 Michael Paquier a écrit :
>> On Sat, May 24, 2014 at 5:08 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
>> 2) The query I am seeing on this spec offers the possiblitily to query
>> TABLE_NAME LIKE 'pattern'. Is this part of the spec as well? If yes,
>> are you planning to add it as well. I imagine that this could be quite
>> handy for users importing from a foreign schema tables that have the
>> same prefix name for example. ImportForeignSchemaRestrictionType could
>> be extended with an IMPORT_LIKE type. Extending a bit the spec...
>> There could be a list of LIKE patterns, this matches more your code by
>> adding all of them in table_names. I am not voting to add TABLE_NAME
>> in the list of reserved keywords though, so something like "TABLE LIKE
>> 'pattern'" would be nice.
>
> From the spec I have been reading, the import qualification is defined as :
>
> <table name list> ::= <table name> [ { <comma> <table name> }... ]
>
> where table name is defined as a simple table name, without any qualifier.
Indeed. Let's stick to that for simplicity, and use only LIMIT TO/EXCEPT.

>> 3) This has been already raised in this thread, but something missing
>> with this patch is the possiblity to pass options when executing an
>> import. This could allow a FDW to do more fine-grained operations on
>> the underlying objects of the table. There would be two level of
>> options: table level and schema level. For example, with the current
>> feature it is not possible to rename imported tables on-the-fly. I
>> imagine that this would be useful.
>
> The attached patch implements options.
Thanks.

> Now, we should think about what options may be desirable for postgres_fdw.
An option to include triggers in what is fetched? I am sure you know
they can be created on foreign tables now :)

> I don't understand your point about table-level and schema-level options,
> though. There is no concept of "foreign schema" allowing to set options on it,
> AFAIK.
My point is to extend items of the table list to accept options:
[ { LIMIT TO | EXCEPT } ( table_item [ , table_item ] ) ]
where table_item:
table_name [ OPTIONS ( option_item [ , option_item ] ) ]

This would make possible post-operations on the tables fetched before
the FDW create individual CreateForeignTableStmt entries in
ImportForeignSchema. In the case of postgres_fdw, a potential option
would be to allow a renaming of the table after fetching them from
remote.

> Renaming imported tables on the fly should, IMO, be done after the import
> statement. Since it is transactional, nothing prevents you from running the
> ALTER TABLE statements after that.
True as well.

>> 5) The error message returned to user when import fails because of a
>> missing type is rather unfriendly. Let's imagine with two nodes and
>> postgres_fdw... Node 1:
>> create type toto as (a int, b int);
>> create table toto_tab (a toto);
>> Node 2:
>> =# import foreign schema public from server postgres_server into public;
>> ERROR: 42704: type "public.toto" does not exist
>> LOCATION: parseTypeString, parse_type.c:797
>> I would rather imagine something like "IMPORT failed because of
>> missing type defined on remote but not locally".
>
> I simply prepended the suggested error message to the previous one:
>
> # IMPORT FOREIGN SCHEMA import_source FROM SERVER remote1 INTO
> import_destination;
> ERROR: IMPORT of table t1 failed because of missing type defined on remote but
> not locally: type "public.typ1" does not exist
> Time: 11,305 ms
Thanks for the addition.

>> 6) Import does not fail if a table specified in LIMIT TO is not
>> defined on remote-side:
>> =# import foreign schema public from server postgres_server limit to
>> (tab_not_there) into public;
>> IMPORT FOREIGN SCHEMA
>> =# \d
>> No relations found.
>
> Thanks, this is corrected in the attached patch. The error detection code is
> O(n²), but I assumed LIMIT TO clauses should be small enough for it not to be
> a problem.

>> 10) Code has many whitespaces.
>
> I hope this is corrected in the attached patch, if not, could you please point
> me at specific examples ?
>
>> 11) Sometimes the import goes strangely:
>> 11-1) After some testing I noticed that tables with incorrect names
>> were imported when using LIMIT TO. For example on remote a table
>> called "ab" is present, IMPORT SCHEMA LIMIT TO 'other_ab' created a
>> local entry called "other_ab" while it should create nothing
>> 11-2) Tables with empty names can be created locally. On foreign server:
>> =# \d
>> List of relations
>> Schema | Name | Type | Owner
>> --------+----------+-------+--------
>> public | aa | table | ioltas
>> public | toto_tab | table | ioltas
>> (2 rows)
>> On local node:
>> =# import foreign schema public from server postgres_server limit to
>> (toto_tab, "NULL", aa) into public;
>> -- (Forget about NULL here, I could reproduce it without as well)
>> IMPORT FOREIGN SCHEMA
>> Time: 13.589 ms
>> =# \d
>> List of relations
>> Schema | Name | Type | Owner
>> --------+----------+---------------+--------
>> public | | foreign table | ioltas
>> public | toto_tab | foreign table | ioltas
>> (2 rows)
>> Some more testing showed me that as well:
>> =# \d
>>
>> List
>> of relations
>> Schema |
>> Name
>>
>> | Type |
>>
>> Owner
>> --------+-------------------------------------------------------------------
>> ----------------------------------------------------------------------------
>> ----------------------------------------------------------------------------
>> -----------------------------------+---------------+-------- public |
>>
>> | foreign table |
>>
>> ioltas
>> public | toto_tab
>>
>> | foreign table |
>>
>> ioltas
>> public |F\x7F\x7
>> F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7
>> F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7
>> F\x7F\x7F\x7F\x7F\x7F\x7F
>> | foreign table | ioltas
>> \x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7
>
> This seems to be related to re-using the table-name between invocations. The
> attached patch should fix point 2. As for point 1, I don't know the cause for
> it. Do you have a reproducible test case ?
Sure. I'll try harder when looking in more details at the patch for
postgres_fdw :)

Here are extra comments for the core patch:
- Patch has no whitespaces, compiles without warnings. Regression tests pass.
- indentation is not correct in the documentation, 1 space is missing
for the block <para></para> ;) This block is also strangely formatted.
Looking at the other FDW APIs documented, you could use to <para>
blocks: one for the input data (server name, etc), a second for the
output result (which is the list of CreateForeignTableStmt created).
Please check as well spaces between blocks.
- Instead of remote "schema", let's say simply foreign schema.
Renaming the documentation section to "FDW Routine For Importing
Foreign Schemas" would be more consistent.
- Documentation refers to postgresImportForeignSchema, which is an
incorrect name. It should be ImportForeignSchema. The arguments of the
API listed in the docs are incorrect as well.
- Once exiting from the call of fdw_routine->ImportForeignSchema, it
would be check the types of the items returned. An assertion on
IsA(stmt, CreateForeignTableStmt) would be fine I think.
- Renaming ImportForeignSchemaRestrictionType to
ImportForeignSchemaType would be nice
- Both ImportForeignSchemaRestriction and ImportForeignSchemaStmt can
contain the table name list. Only one is necessary.
- The API of ImportForeignSchema is very rough now, I don't think that
it should expose the parse tree of the IMPORT FOREIGN SCHEMA command
as it is the case now. Exposing the server makes sense as the table(s)
is(are) not created yet though. Necessary information to give to the
FDW is actually:
-- server information
-- Import type (except, limit to, all)
-- list of table names
-- FDW options
-- remote schema
Hence the following API would make more sense IMO:
ImportForeignSchema(ForeignServer *server,
const char *remote_schema,
ImportForeignSchemaRestrictionType importType,
List *table_names,
List *options)

I'll look into the patch for postgres_fdw later.
Regards,
--
Michael


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-17 09:30:30
Message-ID: CAB7nPqQ907WM1j9wOWXKK1zaGopV50wsF0pLpTNxWGCSuyMYfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 17, 2014 at 5:06 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Tue, Jun 17, 2014 at 4:36 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com> wrote:
>> Now, we should think about what options may be desirable for postgres_fdw.
> An option to include triggers in what is fetched? I am sure you know
> they can be created on foreign tables now :)
Please forget that. This makes no sense as long as ImportForeignSchema
returns only a list of CreateForeighTableStmt, and even with that,
there are considerations like the order on which the objects created
should be applied through a potential call of standard_ProcessUtility
or similar.
--
Michael


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-19 14:00:35
Message-ID: CAB7nPqSn4__BfBZtJ4JdWvqMBx31ALYxrx5mxgiL4i+WttFcNA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> This seems to be related to re-using the table-name between invocations. The
>> attached patch should fix point 2. As for point 1, I don't know the cause for
>> it. Do you have a reproducible test case ?
> Sure. I'll try harder when looking in more details at the patch for
> postgres_fdw :)
With v2, I have tried randomly some of the scenarios of v1 plus some
extras, but was not able to reproduce it.

> I'll look into the patch for postgres_fdw later.
And here are some comments about it, when applied on top of the other 2 patches.
1) Code compiles without warning, regression tests pass.
2) In fetch_remote_tables, the palloc for the parameters should be
done after the number of parameters is determined. In the case of
IMPORT_ALL, some memory is wasted for nothing.
3) Current code is not able to import default expressions for a table.
A little bit of processing with pg_get_expr would be necessary:
select pg_catalog.pg_get_expr(adbin, adrelid) from pg_attrdef;
There are of course bonus cases like SERIAL columns coming immediately
to my mind but it would be possible to determine if a given column is
serial with pg_get_serial_sequence.
This would be a good addition for the FDW IMO.
4) The same applies of course to the constraint name: CREATE FOREIGN
TABLE foobar (a int CONSTRAINT toto NOT NULL) for example.
5) A bonus idea: enable default expression obtention and null/not null
switch by default but add options to disable their respective
obtention.
6) Defining once PGFDW_IMPORTRESULT_NUMCOLS at the top of
postgres_fdw.c without undefining would be perfectly fine.
7) In postgresImportForeignSchema, the palloc calls and the
definitions of the variables used to save the results should be done
within the for loop.
8) At quick glance, the logic of postgresImportForeignSchema looks
awkward... I'll have a second look with a fresher mind later on this
one.
--
Michael


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-30 07:43:38
Message-ID: CAB7nPqSy9kZbNKvtCE0TtCRr0eD91ohV6W449bHkqrtKD8+5Ng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 19, 2014 at 11:00 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com
> wrote:

> >> This seems to be related to re-using the table-name between
> invocations. The
> >> attached patch should fix point 2. As for point 1, I don't know the
> cause for
> >> it. Do you have a reproducible test case ?
> > Sure. I'll try harder when looking in more details at the patch for
> > postgres_fdw :)
> With v2, I have tried randomly some of the scenarios of v1 plus some
> extras, but was not able to reproduce it.
>
> > I'll look into the patch for postgres_fdw later.
> And here are some comments about it, when applied on top of the other 2
> patches.
> 1) Code compiles without warning, regression tests pass.
> 2) In fetch_remote_tables, the palloc for the parameters should be
> done after the number of parameters is determined. In the case of
> IMPORT_ALL, some memory is wasted for nothing.
> 3) Current code is not able to import default expressions for a table.
> A little bit of processing with pg_get_expr would be necessary:
> select pg_catalog.pg_get_expr(adbin, adrelid) from pg_attrdef;
> There are of course bonus cases like SERIAL columns coming immediately
> to my mind but it would be possible to determine if a given column is
> serial with pg_get_serial_sequence.
> This would be a good addition for the FDW IMO.
> 4) The same applies of course to the constraint name: CREATE FOREIGN
> TABLE foobar (a int CONSTRAINT toto NOT NULL) for example.
> 5) A bonus idea: enable default expression obtention and null/not null
> switch by default but add options to disable their respective
> obtention.
> 6) Defining once PGFDW_IMPORTRESULT_NUMCOLS at the top of
> postgres_fdw.c without undefining would be perfectly fine.
> 7) In postgresImportForeignSchema, the palloc calls and the
> definitions of the variables used to save the results should be done
> within the for loop.
> 8) At quick glance, the logic of postgresImportForeignSchema looks
> awkward... I'll have a second look with a fresher mind later on this
> one.
>
While having a second look at the core patch, I have found myself
re-hacking it, fixing a couple of bugs and adding things that have been
missing in the former implementation:
- Deletions of unnecessary structures to simplify code and make it cleaner
- Fixed a bug related to the management of local schema name. A FDW was
free to set the schema name where local tables are created, this should not
be the case.
- Improved documentation, examples and other things, fixed doc padding for
example
- Added some missing stuff in equalfuncs.c and copyfuncs.c
- Some other things.
With that, core patch looks pretty nice actually, and I think that we
should let a committer have a look at this part at least for this CF.

Also, the postgres_fdw portion has been updated based on the previous core
patch modified, using a version that Ronan sent me, which has addressed the
remarks I sent before. This patch is still lacking documentation, some
cleanup, and regression tests are broken, but it can be used to test the
core feature. I unfortunately don't have much time today but I am sending
this patch either way to let people play with IMPORT SCHEMA if I don't come
back to it before.
Regards,
--
Michael

Attachment Content-Type Size
0001-Implement-IMPORT-FOREIGN-SCHEMA-in-core.patch text/x-patch 26.8 KB
0002-Add-support-of-IMPORT-SCHEMA-for-postgres_fdw.patch text/x-patch 16.8 KB

From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-30 09:01:48
Message-ID: 11329488.XWoF3hTIhl@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le lundi 30 juin 2014 16:43:38 Michael Paquier a écrit :
> On Thu, Jun 19, 2014 at 11:00 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com
> > wrote:
> > >> This seems to be related to re-using the table-name between
> >
> > invocations. The
> >
> > >> attached patch should fix point 2. As for point 1, I don't know the
> >
> > cause for
> >
> > >> it. Do you have a reproducible test case ?
> > >
> > > Sure. I'll try harder when looking in more details at the patch for
> > > postgres_fdw :)
> >
> > With v2, I have tried randomly some of the scenarios of v1 plus some
> > extras, but was not able to reproduce it.
> >
> > > I'll look into the patch for postgres_fdw later.
> >
> > And here are some comments about it, when applied on top of the other 2
> > patches.
> > 1) Code compiles without warning, regression tests pass.
> > 2) In fetch_remote_tables, the palloc for the parameters should be
> > done after the number of parameters is determined. In the case of
> > IMPORT_ALL, some memory is wasted for nothing.
> > 3) Current code is not able to import default expressions for a table.
> > A little bit of processing with pg_get_expr would be necessary:
> > select pg_catalog.pg_get_expr(adbin, adrelid) from pg_attrdef;
> > There are of course bonus cases like SERIAL columns coming immediately
> > to my mind but it would be possible to determine if a given column is
> > serial with pg_get_serial_sequence.
> > This would be a good addition for the FDW IMO.
> > 4) The same applies of course to the constraint name: CREATE FOREIGN
> > TABLE foobar (a int CONSTRAINT toto NOT NULL) for example.
> > 5) A bonus idea: enable default expression obtention and null/not null
> > switch by default but add options to disable their respective
> > obtention.
> > 6) Defining once PGFDW_IMPORTRESULT_NUMCOLS at the top of
> > postgres_fdw.c without undefining would be perfectly fine.
> > 7) In postgresImportForeignSchema, the palloc calls and the
> > definitions of the variables used to save the results should be done
> > within the for loop.
> > 8) At quick glance, the logic of postgresImportForeignSchema looks
> > awkward... I'll have a second look with a fresher mind later on this
> > one.
>
> While having a second look at the core patch, I have found myself
> re-hacking it, fixing a couple of bugs and adding things that have been
> missing in the former implementation:
> - Deletions of unnecessary structures to simplify code and make it cleaner
> - Fixed a bug related to the management of local schema name. A FDW was
> free to set the schema name where local tables are created, this should not
> be the case.
> - Improved documentation, examples and other things, fixed doc padding for
> example
> - Added some missing stuff in equalfuncs.c and copyfuncs.c
> - Some other things.
> With that, core patch looks pretty nice actually, and I think that we
> should let a committer have a look at this part at least for this CF.
>
> Also, the postgres_fdw portion has been updated based on the previous core
> patch modified, using a version that Ronan sent me, which has addressed the
> remarks I sent before. This patch is still lacking documentation, some
> cleanup, and regression tests are broken, but it can be used to test the
> core feature. I unfortunately don't have much time today but I am sending
> this patch either way to let people play with IMPORT SCHEMA if I don't come
> back to it before.

The regression tests fail because of a typo in pg_type.h: BOOLARRAYOID should
be defined to 1000, not 1003 (which clashes against NAMEARRAYOID).

What do you think should be documented, and where ?

> Regards,

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
0002-Add-support-of-IMPORT-SCHEMA-for-postgres_fdw.patch text/x-patch 16.8 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-30 12:54:27
Message-ID: CAB7nPqQGWMOBDt6FuascGdcOUSvso4FCVNvhNVXTEEi9RQJzug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 30, 2014 at 6:01 PM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
wrote:

> BOOLARRAYOID should be defined to 1000, not 1003 (which clashes against
> NAMEARRAYOID).
>
Oops, a bad copy-paste. Thanks for catching that.

I had a more detailed look at the postgres_fdw patch:
1) Having an example of options usable with postgres_fdw would be a good
thing to test the FDW API more extensively. What about simply disable_nulls
to disable NULL/NOT NULL creation on the tables imported.
2) (Should have noticed that earlier, sorry) I don't see any advantages but
only complications in using PQexecParams to launch the queries checking
schema existence remotely and fetching back table definitions as we use
them only once per import. Why not removing the parameter part and build
only plain query strings using StringInfo? This would have the merit to
really simplify fetch_remote_tables.
3) If you add an option, let's get rid of PGFDW_IMPORTRESULT_NUMCOLS as the
number of result columns would change.
4) Instead of using a double-layer of arrays when processing results, I
think that it would make the whole process more readable to have one array
for each result column (attname, atttype, atttypmod, attnotnull)
initialized each time a new row is processed and then process through them
to get the array items. I imagine that a small function doing the array
parsing called on each array result would bring more readability as well to
the process flow.
5) This could be costly with large sets of tables in LIMIT TO... I imagine
that we can live with this O(n log(n)) process as LIMIT TO should not get
big..
foreach(lc1, table_names)
{
found = false;
looked_up = ((RangeVar *) lfirst(lc1))->relname;
foreach(lc2, tables)

What do you think should be documented, and where ?
>

Two things coming to my mind:
- If postgres_fdw can use options with IMPORT, they should be explicitly
listed in the section "FDW Options of postgres_fdw".
- Documenting that postgres_fdw support IMPORT FOREIGN SCHEMA. A simple
paragraph in the main section of postgres_fdw containing descriptions would
be enough.

Once we get that done, I'll do another round of review on this patch and I
think that we will be able to mark it as ready for committer.

Regards,
--
Michael


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-01 06:36:20
Message-ID: CAB7nPqRhNiCWuxvt5HoiwtSdiOmcfSH0g-gyPaSWa40KqKrx0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jun 30, 2014 at 9:54 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> Once we get that done, I'll do another round of review on this patch and I
> think that we will be able to mark it as ready for committer.
>

After sleeping on it, I have put my hands on the postgres_fdw portion and
came up with a largely simplified flow, resulting in the patch attached.
The following things are done:
- Removal of all the array stuff, resulting in a more simplified, and
readable code, without performance impact.
- Removal of the parameter stuff to simplify code
- Addition of an option in postgres_fdw to ignore NULL/NOT NULL
- Addition of docs
- Fixed a bug related to the use of ::regtype, let's use directly typmod
and typname in pg_type instead.
- The addition of new OID defines is now unnecessary
Ronan, what do you think of those patches? I have nothing more to add, and
I think that they should be looked by a committer. Particularly the FDW API
that is perhaps not the best fit, but let's see some extra opinions about
that.
--
Michael

Attachment Content-Type Size
0001-Implement-IMPORT-FOREIGN-SCHEMA-in-core.patch text/x-diff 26.8 KB
0002-Add-support-of-IMPORT-FOREIGN-SCHEMA-in-postgres_fdw.patch text/x-diff 18.4 KB

From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Michael Paquier *EXTERN*" <michael(dot)paquier(at)gmail(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-01 06:59:49
Message-ID: A737B7A37273E048B164557ADEF4A58B17D15644@ntex2010i.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Michael Paquier wrote:
> After sleeping on it, I have put my hands on the postgres_fdw portion and came up with a largely
> simplified flow, resulting in the patch attached.

[...]

> Ronan, what do you think of those patches? I have nothing more to add, and I think that they should be
> looked by a committer. Particularly the FDW API that is perhaps not the best fit, but let's see some
> extra opinions about that.

I looked the the API and ist documentation, and while I saw no problem with the API,
I think that the documentation still needs some attention:

It mentions a "local_schema", which doesn't exist (any more?).
It should be mentioned that the CreateForeignTableStmt's
base.relation->schemaname should be set to NULL.
Also, it would be nice to find a few words for "options",
maybe explaining a potential application.

Yours,
Laurenz Albe


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: Michael Paquier *EXTERN* <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-01 07:23:31
Message-ID: 10108103.NNT0LSQ2rN@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le mardi 1 juillet 2014 06:59:49 Albe Laurenz a écrit :
> Michael Paquier wrote:
>
> > After sleeping on it, I have put my hands on the postgres_fdw portion and
> > came up with a largely
simplified flow, resulting in the patch attached.
>
>
> [...]
>
>
> > Ronan, what do you think of those patches? I have nothing more to add, and
> > I think that they should be
looked by a committer. Particularly the FDW
> > API that is perhaps not the best fit, but let's see some extra opinions
> > about that.

The remote_schema parameter can be used for SQL injection. Either we should go
back to using parameters, or be extra careful. Since the remote schema is
parsed as a name, it is limited to 64 characters which is not that useful for
an SQL injection, but still.

The new query as you wrote it returns the typname (was cast to regtype before)
This is not schema qualified, and will fail when importing tables with columns
from a type not in search_path.

The regression tests don't pass: a user name is hard-coded in the result of
DROP USER MAPPING. Should we expect the tests to be run as postgres ?

>
>
> I looked the the API and ist documentation, and while I saw no problem with
> the API,
> I think that the documentation still needs some attention:
>
> It mentions a "local_schema", which doesn't exist (any more?).
> It should be mentioned that the CreateForeignTableStmt's
> base.relation->schemaname should be set to NULL.
> Also, it would be nice to find a few words for "options",
> maybe explaining a potential application.
>
> Yours,
> Laurenz Albe

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-01 12:09:55
Message-ID: CAB7nPqShL0AAhA0JOASLP9Q0Vif6ECcVMekN5E4mCMiMniAi-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jul 1, 2014 at 4:23 PM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
wrote:

> The remote_schema parameter can be used for SQL injection. Either we
> should go
> back to using parameters, or be extra careful. Since the remote schema is
> parsed as a name, it is limited to 64 characters which is not that useful
> for
> an SQL injection, but still.
>
Yes, right, I completely forgot that. IMPORT SCHEMA could be used by a
non-superuser so controlling only the size of relation name is not enough.

The new query as you wrote it returns the typname (was cast to regtype
> before)
> This is not schema qualified, and will fail when importing tables with
> columns
> from a type not in search_path.
>
Hm. The current solution with simply LookupTypeNameOid is more elegant than
relying on InputFunctionCall to fetch a Datum, that is then converted back
into TypeName... In all cases I am wondering about the use of regtype where
the same type name is used across multiple schemas. We should perhaps
document correctly that search_path influences the custom type chosen when
rebuilding foreign tables and that postgres_fdw does its best but that it
may not be compatible with type on foreign server.

> The regression tests don't pass: a user name is hard-coded in the result of
> DROP USER MAPPING. Should we expect the tests to be run as postgres?
>
I think that we need a cleaner solution for this test case, I've let it for
the time being but a make installcheck would let an extra database as it
cannot be removed in postgres_fdw.sql (an extra test case just to clean up
would work but I don't think that it is worth the complication). We could
abuse of search_path not tracking types located on certain schemas to
trigger this error :)

Want to give a shot on the following things? I wouldn't mind changing back
the query construction part :)
--
Michael


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-03 06:37:27
Message-ID: 3750934.RHJSuBr5IZ@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le mardi 1 juillet 2014 21:09:55 Michael Paquier a écrit :
> On Tue, Jul 1, 2014 at 4:23 PM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
>
> wrote:
> > The remote_schema parameter can be used for SQL injection. Either we
> > should go
> > back to using parameters, or be extra careful. Since the remote schema is
> > parsed as a name, it is limited to 64 characters which is not that useful
> > for
> > an SQL injection, but still.
>
> Yes, right, I completely forgot that. IMPORT SCHEMA could be used by a
> non-superuser so controlling only the size of relation name is not enough.
>

I reintroduced PQExecParams, trying to keep it simple enough.

> The new query as you wrote it returns the typname (was cast to regtype
>
> > before)
> > This is not schema qualified, and will fail when importing tables with
> > columns
> > from a type not in search_path.
>
> Hm. The current solution with simply LookupTypeNameOid is more elegant than
> relying on InputFunctionCall to fetch a Datum, that is then converted back
> into TypeName... In all cases I am wondering about the use of regtype where
> the same type name is used across multiple schemas. We should perhaps
> document correctly that search_path influences the custom type chosen when
> rebuilding foreign tables and that postgres_fdw does its best but that it
> may not be compatible with type on foreign server.

I think that it would be better to qualify the type name. The attached patch
does that by fetching the type schema name in another column, and using
LookupTypeNameOid.

>
> > The regression tests don't pass: a user name is hard-coded in the result
> > of
> > DROP USER MAPPING. Should we expect the tests to be run as postgres?
>
> I think that we need a cleaner solution for this test case, I've let it for
> the time being but a make installcheck would let an extra database as it
> cannot be removed in postgres_fdw.sql (an extra test case just to clean up
> would work but I don't think that it is worth the complication). We could
> abuse of search_path not tracking types located on certain schemas to
> trigger this error :)
>
> Want to give a shot on the following things? I wouldn't mind changing back
> the query construction part :)

Also attached in this patch:
- more robust way for cleaning things up in regression tests
- small documentation change for options in the fdwhandler API

> --
> Michael

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
0001-Implement-IMPORT-FOREIGN-SCHEMA-in-core.patch text/x-patch 26.8 KB
0002-Add-support-of-IMPORT-FOREIGN-SCHEMA-in-postgres_fdw.patch text/x-patch 17.4 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-03 07:21:19
Message-ID: CAB7nPqSQRfabxo9YH74u7z1E-Yvd9-YRHDB+oSBHQmtnzEPxBg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jul 3, 2014 at 3:37 PM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com> wrote:
> Also attached in this patch:
> - more robust way for cleaning things up in regression tests
> - small documentation change for options in the fdwhandler API
Those changes look good to me. I have found some minor issues, nothing
really huge so I am updating the patches myself:
- patch 0001 contains the documentation for postgres_fdw: it should be
in 0002, which is missing its docs.
- Some word-smithing as mentioned by Albe previously.
- PQclear before throwing the error "IMPORT of table \"%s\" failed
because of missing type blablah"
With that, I am marking this patch as ready for committer.
--
Michael

Attachment Content-Type Size
0001-Implement-IMPORT-FOREIGN-SCHEMA-in-core.patch text/x-patch 27.4 KB
0002-Add-support-of-IMPORT-FOREIGN-SCHEMA-in-postgres_fdw.patch text/x-patch 19.3 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "Ronan Dunklau *EXTERN*" <ronan(dot)dunklau(at)dalibo(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-07 11:58:33
Message-ID: CA+TgmobuXG2YTmEQdFtZ=60dNBhbNnRsN_7OV2KACdQ5pNxJKg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, May 26, 2014 at 12:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
>> In addition to data type mapping questions (which David already raised)
>> I have one problem when I think of the Oracle FDW:
>
>> Oracle follows the SQL standard in folding table names to upper case.
>> So this would normally result in a lot of PostgreSQL foreign tables
>> with upper case names, which would be odd and unpleasant.
>
>> I cannot see a way out of that, but I thought I should mention it.
>
> It seems like it would often be desirable for the Oracle FDW to smash
> all-upper-case names to all-lower-case while importing, so that no quoting
> is needed on either side. I doubt though that this is so desirable that
> it should happen unconditionally.
>
> Between this and the type-mapping questions, it seems likely that
> we're going to need a way for IMPORT FOREIGN SCHEMA to accept
> user-supplied control options, which would in general be specific
> to the FDW being used. (Another thing the SQL committee failed to
> think about.)

Is this part of the SQL standard? What is it defined to do about
non-table objects?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-07 12:00:27
Message-ID: 4850817.cuqnKY7HUU@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le lundi 7 juillet 2014 07:58:33 Robert Haas a écrit :
> On Mon, May 26, 2014 at 12:17 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> writes:
> >> In addition to data type mapping questions (which David already raised)
> >> I have one problem when I think of the Oracle FDW:
> >>
> >> Oracle follows the SQL standard in folding table names to upper case.
> >> So this would normally result in a lot of PostgreSQL foreign tables
> >> with upper case names, which would be odd and unpleasant.
> >>
> >> I cannot see a way out of that, but I thought I should mention it.
> >
> > It seems like it would often be desirable for the Oracle FDW to smash
> > all-upper-case names to all-lower-case while importing, so that no quoting
> > is needed on either side. I doubt though that this is so desirable that
> > it should happen unconditionally.
> >
> > Between this and the type-mapping questions, it seems likely that
> > we're going to need a way for IMPORT FOREIGN SCHEMA to accept
> > user-supplied control options, which would in general be specific
> > to the FDW being used. (Another thing the SQL committee failed to
> > think about.)
>
> Is this part of the SQL standard? What is it defined to do about
> non-table objects?

The OPTIONS clause is not part of the SQL Standard.

Regarding non-table objects, the standard only talks about tables, and nothing
else.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-09 17:30:40
Message-ID: 18849.1404927040@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> With that, I am marking this patch as ready for committer.

I've started looking at this patch. I wonder whether it's really such
a great idea to expect the FDW to return a list of parsetrees for
CREATE FOREIGN TABLE commands; that seems like a recipe for breakage
anytime we change the parsetree representation, say add a field to
ColumnDef. The alternative I'm thinking about is to have the FDW pass
back a list of strings, which would be textual CREATE FOREIGN TABLE
commands. This seems like it'd be more robust and in most cases not
any harder for the FDW to generate; moreover, it would greatly improve
the quality of error reporting anytime there was anything wrong with
what the FDW did.

As against that, you could point out that we make FDWs deal with
parsetrees when doing planning. But the important difference there
is that they're mostly *reading* the parsetrees, not building new
ones from scratch, so there's much less opportunity for errors of
omission.

Comments?

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-09 18:29:39
Message-ID: 20140709182939.GP6390@eldon.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > With that, I am marking this patch as ready for committer.
>
> I've started looking at this patch. I wonder whether it's really such
> a great idea to expect the FDW to return a list of parsetrees for
> CREATE FOREIGN TABLE commands; that seems like a recipe for breakage
> anytime we change the parsetree representation, say add a field to
> ColumnDef. The alternative I'm thinking about is to have the FDW pass
> back a list of strings, which would be textual CREATE FOREIGN TABLE
> commands.

.oO(json blobs as in the DDL deparse patch ...)

(I don't know if they are really suitable. I have no idea how this
patch works.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-09 23:11:22
Message-ID: CAB7nPqSt8UbU3jzxmKKNBo=M33z0KsaEs4zs_hV+Sp38sD8d8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jul 10, 2014 at 2:30 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > With that, I am marking this patch as ready for committer.
>
> I've started looking at this patch. I wonder whether it's really such
> a great idea to expect the FDW to return a list of parsetrees for
> CREATE FOREIGN TABLE commands; that seems like a recipe for breakage
> anytime we change the parsetree representation, say add a field to
> ColumnDef. The alternative I'm thinking about is to have the FDW pass
> back a list of strings, which would be textual CREATE FOREIGN TABLE
> commands. This seems like it'd be more robust and in most cases not
> any harder for the FDW to generate; moreover, it would greatly improve
> the quality of error reporting anytime there was anything wrong with
> what the FDW did.
>

Agreed. Modifying postgres_fdw portion to do so would not take long.

As against that, you could point out that we make FDWs deal with
> parsetrees when doing planning. But the important difference there
> is that they're mostly *reading* the parsetrees, not building new
> ones from scratch, so there's much less opportunity for errors of
> omission.
>
> Comments?
>
The SQL-MED spec talks only about foreign tables when importing, It would
be good to keep the checks on CreateTableForeignStmt after parsing the
strings, which is what I imagine server would do after taking back the list
of strings from FDW before creating the objects.

Regards,
--
Michael


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-09 23:37:00
Message-ID: 13743.1404949020@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Another thing ...

The SQL-MED standard says that "IMPORT FOREIGN SCHEMA ... LIMIT TO (a,b,c)"
should throw an error if not all of a,b,c exist as tables in the remote
server. It's rather sloppily worded, though, such that it's not clear
whether an error is also expected for "EXCEPT (a,b,c)" when not all of
those names exist. It seems to me that the whole thing is badly thought
out and it would be better not to complain for nonexistent names in either
type of list. The use of "LIMIT" seems to me to imply that the list is
a filter, not an exact set of names that must be present.

Comments?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-10 00:52:23
Message-ID: 16977.1404953543@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> I've started looking at this patch. I wonder whether it's really such
> a great idea to expect the FDW to return a list of parsetrees for
> CREATE FOREIGN TABLE commands; that seems like a recipe for breakage
> anytime we change the parsetree representation, say add a field to
> ColumnDef. The alternative I'm thinking about is to have the FDW pass
> back a list of strings, which would be textual CREATE FOREIGN TABLE
> commands.

Here's a rather-heavily-editorialized draft patch that does it like that.

This patch takes the viewpoint I espoused nearby of allowing names in
the LIMIT TO clause that aren't present on the remote server. If we
decide we want to hew to the letter of the standard on that, I'd be
inclined to enforce this in the core code, not in individual FDWs as
the submitted patch did. (I didn't much like that implementation
anyway, since it didn't tell you which name it was unhappy about.)

regards, tom lane

Attachment Content-Type Size
import-foreign-schema-2.patch text/x-diff 57.6 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-10 01:28:08
Message-ID: CAB7nPqShYqd5OOA185pVe0hZsiEPeu-85GjicXjxkmQZ8jK6ew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jul 10, 2014 at 9:52 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > I've started looking at this patch. I wonder whether it's really such
> > a great idea to expect the FDW to return a list of parsetrees for
> > CREATE FOREIGN TABLE commands; that seems like a recipe for breakage
> > anytime we change the parsetree representation, say add a field to
> > ColumnDef. The alternative I'm thinking about is to have the FDW pass
> > back a list of strings, which would be textual CREATE FOREIGN TABLE
> > commands.
>
> Here's a rather-heavily-editorialized draft patch that does it like that.
>
> This patch takes the viewpoint I espoused nearby of allowing names in
> the LIMIT TO clause that aren't present on the remote server. If we
> decide we want to hew to the letter of the standard on that, I'd be
> inclined to enforce this in the core code, not in individual FDWs as
> the submitted patch did. (I didn't much like that implementation
> anyway, since it didn't tell you which name it was unhappy about.)
>

Woah, thanks. I've just been through this patch and it looks great.

I guess that this implementation is enough as a first shot, particularly
regarding the complexity that default expression import would bring up for
postgres_fdw (point raised during the review, but not really discussed
afterwards).

Regards,
--
Michael


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-10 01:30:05
Message-ID: 17980.1404955805@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> I guess that this implementation is enough as a first shot, particularly
> regarding the complexity that default expression import would bring up for
> postgres_fdw (point raised during the review, but not really discussed
> afterwards).

Yeah. I'm actually more concerned about the lack of collation import,
but that's unfixable unless we can figure out how to identify collations
in a platform-independent way.

regards, tom lane


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-10 07:06:37
Message-ID: 42581191.QguB3CtMD8@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le mercredi 9 juillet 2014 21:30:05 Tom Lane a écrit :
> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
> > I guess that this implementation is enough as a first shot, particularly
> > regarding the complexity that default expression import would bring up for
> > postgres_fdw (point raised during the review, but not really discussed
> > afterwards).
>
> Yeah. I'm actually more concerned about the lack of collation import,
> but that's unfixable unless we can figure out how to identify collations
> in a platform-independent way.
>
> regards, tom lane

Thank you for working on this patch, I'm not really fond of building strings
in a FDW for the core to parse them again but I don't see any other solution
to the problem you mentioned.

Similarly to what we do for the schema, we should also check that the server
in the parsed statement is the one we are importing from.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-10 14:26:33
Message-ID: 4662.1405002393@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> Michael Paquier <michael(dot)paquier(at)gmail(dot)com> writes:
>> I guess that this implementation is enough as a first shot, particularly
>> regarding the complexity that default expression import would bring up for
>> postgres_fdw (point raised during the review, but not really discussed
>> afterwards).

> Yeah. I'm actually more concerned about the lack of collation import,
> but that's unfixable unless we can figure out how to identify collations
> in a platform-independent way.

I had second thoughts about this: now that the code is generating a
textual CREATE FOREIGN TABLE command, it would actually be just about
trivial to pull back the text representation of the remote's default
expression and attempt to apply it to the foreign table. Likewise we
could ignore the platform-dependency issues in collations and just
attempt to apply whatever collation name is reported by the remote.

Arguably both of these things would be too risky to do by default,
but if we make them be controlled by IMPORT options, why not?

In the case of a default expression, AFAICS there are two possible
failure modes. The local server might not have some function or
operator used by the remote; in which case, you'd get a pretty easy
to understand error message, and it would be obvious what to do if
you wanted to fix it. Or we might have a similarly-named function
or operator that behaves differently (an important special case
is where the function is volatile and you'd get a different answer
locally, eg nextval()). Well, that's a problem, and we'd need to
document it, but I don't see that that risk is so bad that we should
refuse to import any default expressions ever. Especially when
common cases like "0" or "now()" can be expected to work fine.

Similarly, in the case of a collation name, we might not have the
same locale name, which would be annoying but not dangerous. Or
we might have a similarly-named locale that actually has sorting
rules different from the remote's. But that would only be a real
risk when pulling from a remote that's on a different OS from the
local server, and in any case whatever discrepancy might exist is
unlikely to be worse than failing to import a collation spec at all.
Not importing the remote's spec is *guaranteed* to be wrong.

So I propose we invent a couple more import options, say
import_default and import_collate, and make the postgres_fdw
code do the obvious thing with them. import_default should
probably default to false, but I'm about halfway tempted to
say that import_collate should default to true --- if you're
importing a collatable column and you don't have a matching
locale locally, it seems like it'd be better if we complain
about that by default.

Comments?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-10 14:37:06
Message-ID: 4919.1405003026@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com> writes:
> Similarly to what we do for the schema, we should also check that the server
> in the parsed statement is the one we are importing from.

Hm. The FDW would really have to go out of its way to put the wrong thing
there, so is this worth the trouble? It's not like the creation schema
where you can easily omit it from the command; the syntax requires it.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-10 19:06:02
Message-ID: 21643.1405019162@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> So I propose we invent a couple more import options, say
> import_default and import_collate, and make the postgres_fdw
> code do the obvious thing with them. import_default should
> probably default to false, but I'm about halfway tempted to
> say that import_collate should default to true --- if you're
> importing a collatable column and you don't have a matching
> locale locally, it seems like it'd be better if we complain
> about that by default.

I've committed this patch with that addition and some minor additional
cleanup.

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-07-10 23:12:15
Message-ID: CAB7nPqR=s4sJ26ZRNTJEqM4etgZUp_sX=ELT00+nwiUxNMNR4Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jul 11, 2014 at 4:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > So I propose we invent a couple more import options, say
> > import_default and import_collate, and make the postgres_fdw
> > code do the obvious thing with them. import_default should
> > probably default to false, but I'm about halfway tempted to
> > say that import_collate should default to true --- if you're
> > importing a collatable column and you don't have a matching
> > locale locally, it seems like it'd be better if we complain
> > about that by default.
>
> I've committed this patch with that addition and some minor additional
> cleanup.
>
Thanks!
--
Michael