Re: Proposal: IMPORT FOREIGN SCHEMA statement.

Lists: pgsql-hackers
From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Proposal: IMPORT FOREIGN SCHEMA statement.
Date: 2014-02-21 10:31:43
Message-ID: 4798120.L0Pg0ZRAhq@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

The SQL-MED specification defines the IMPORT FOREIGN SCHEMA statement.

This adds discoverability to foreign servers. The structure of the statement
as I understand it is simple enough:

IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
EXCEPT) table_list ] INTO local_schema.

Is anyone working on this? I found a reference to this from 2010 in the
archive, stating that work should be focused on core functionality, but
nothing more recent.

This would be very useful for postgres_fdw and other RDBMS-backed fdws, but I
think even file_fdw could benefit from it if it was able to create a foreign
table for every csv-with-header file in a directory.

I can see a simple API working for that. A new function would be added to the
fdw routine, which is responsible for crafting CreateForeignTableStmt. It
could have the following signature:

typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
ImportForeignSchemaStmt * parsetree);

I experimented with this idea, and came up with the attached two patches: one
for the core, and the other for actually implementing the API in postgres_fdw.

Maybe those can serve as a proof-of-concept for discussing the design?

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

Attachment Content-Type Size
import_foreign_schema.patch text/x-patch 11.1 KB
import_foreign_schema_postgres_fdw.patch text/x-patch 7.8 KB

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: IMPORT FOREIGN SCHEMA statement.
Date: 2014-02-21 11:00:30
Message-ID: CAOeZVic8WnfozvpPs2pACLM39XLV69H7b0-LtNSAjLfTFVEvog@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 21, 2014 at 4:01 PM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>wrote:

> Hello,
>
> The SQL-MED specification defines the IMPORT FOREIGN SCHEMA statement.
>
> This adds discoverability to foreign servers. The structure of the
> statement
> as I understand it is simple enough:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> EXCEPT) table_list ] INTO local_schema.
>
> Is anyone working on this? I found a reference to this from 2010 in the
> archive, stating that work should be focused on core functionality, but
> nothing more recent.
>
> This would be very useful for postgres_fdw and other RDBMS-backed fdws,
> but I
> think even file_fdw could benefit from it if it was able to create a
> foreign
> table for every csv-with-header file in a directory.
>
> I can see a simple API working for that. A new function would be added to
> the
> fdw routine, which is responsible for crafting CreateForeignTableStmt. It
> could have the following signature:
>
> typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> ImportForeignSchemaStmt * parsetree);
>
>
> I experimented with this idea, and came up with the attached two patches:
> one
> for the core, and the other for actually implementing the API in
> postgres_fdw.
>
> Maybe those can serve as a proof-of-concept for discussing the design?
>

I havent had a look at the patch yet since I dont have a nice editor right
now, but how do you handle inter operability between datatypes?
Specifically, how do you handle those datatypes which have a different name
from the PostgreSQL name for them and/or are stored in a different manner?

Regards,

Atri


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: IMPORT FOREIGN SCHEMA statement.
Date: 2014-02-21 11:09:44
Message-ID: 2968748.F0FJzsitqg@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I havent had a look at the patch yet since I dont have a nice editor right
> now, but how do you handle inter operability between datatypes?
> Specifically, how do you handle those datatypes which have a different name
> from the PostgreSQL name for them and/or are stored in a different manner?

Do you mean in general, or for the postgres_fdw specifically ?

In general, only valid column types should be accepted in the
CreateForeignTableStmt. The CreateForeignTableStmt is passed through
DefineRelation, which takes care of looking up the actual data types.

For the postgres_fdw POC implementation, this is done by parsing the
attributes type from the query result with the regtype input functions. The
attribute typmod is injected too.

>
> Regards,
>
> Atri

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: IMPORT FOREIGN SCHEMA statement.
Date: 2014-02-21 11:15:20
Message-ID: CAOeZVieqmnH4atF=X_JqnfeMTQZXCW9057jZruu7wgSQxnkq8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 21, 2014 at 4:39 PM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>wrote:

> > I havent had a look at the patch yet since I dont have a nice editor
> right
> > now, but how do you handle inter operability between datatypes?
> > Specifically, how do you handle those datatypes which have a different
> name
> > from the PostgreSQL name for them and/or are stored in a different
> manner?
>
> Do you mean in general, or for the postgres_fdw specifically ?
>
> In general, only valid column types should be accepted in the
> CreateForeignTableStmt. The CreateForeignTableStmt is passed through
> DefineRelation, which takes care of looking up the actual data types.
>
> For the postgres_fdw POC implementation, this is done by parsing the
> attributes type from the query result with the regtype input functions. The
> attribute typmod is injected too.
>

I actually meant in general. Thanks for the reply.

So please help me understand here. How exactly does CreateForeignTableStmt
help in type compatibility? A statement may be valid on a foreign server
but may not be compatible.

What am I missing here naively?

Regards,

Atri

--
Regards,

Atri
*l'apprenant*


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: IMPORT FOREIGN SCHEMA statement.
Date: 2014-02-21 11:26:05
Message-ID: 9175607.p1EUfFB45D@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Le vendredi 21 février 2014 16:45:20 Atri Sharma a écrit :
> On Fri, Feb 21, 2014 at 4:39 PM, Ronan Dunklau
<ronan(dot)dunklau(at)dalibo(dot)com>wrote:
> > > I havent had a look at the patch yet since I dont have a nice editor
> >
> > right
> >
> > > now, but how do you handle inter operability between datatypes?
> > > Specifically, how do you handle those datatypes which have a different
> >
> > name
> >
> > > from the PostgreSQL name for them and/or are stored in a different
> >
> > manner?
> >
> > Do you mean in general, or for the postgres_fdw specifically ?
> >
> > In general, only valid column types should be accepted in the
> > CreateForeignTableStmt. The CreateForeignTableStmt is passed through
> > DefineRelation, which takes care of looking up the actual data types.
> >
> > For the postgres_fdw POC implementation, this is done by parsing the
> > attributes type from the query result with the regtype input functions.
> > The
> > attribute typmod is injected too.
>
> I actually meant in general. Thanks for the reply.
>
> So please help me understand here. How exactly does CreateForeignTableStmt
> help in type compatibility?

I'm not sure I understand your concern. It doesn't help in type compatibility,
it is still the responsibility of the FDW to convert local types to remote
ones.

The CreateForeignTableStmt defines the column, with their types. It is
executed locally to create a new foreign table according to a remote
description of the table. The only difference with a user-written
CreateForeignTable statement is that the structure is crafted by the FDW
instead of the parser.

> A statement may be valid on a foreign server but may not be compatible.

Do you mean the CreateForeignTableStmt ? It has to be valid locally, or it
won't be executed. It is the FDW responsibility to build this statement in
such a way that it is valid locally.

>
> Regards,
>
> Atri

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org


From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: IMPORT FOREIGN SCHEMA statement.
Date: 2014-02-21 11:28:22
Message-ID: CAOeZViedtd0RUHCT9u=wR6URvrjTssPG-Py8s9vdzXT+Jxrb7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 21, 2014 at 4:56 PM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>wrote:

> Le vendredi 21 février 2014 16:45:20 Atri Sharma a écrit :
> > On Fri, Feb 21, 2014 at 4:39 PM, Ronan Dunklau
> <ronan(dot)dunklau(at)dalibo(dot)com>wrote:
> > > > I havent had a look at the patch yet since I dont have a nice editor
> > >
> > > right
> > >
> > > > now, but how do you handle inter operability between datatypes?
> > > > Specifically, how do you handle those datatypes which have a
> different
> > >
> > > name
> > >
> > > > from the PostgreSQL name for them and/or are stored in a different
> > >
> > > manner?
> > >
> > > Do you mean in general, or for the postgres_fdw specifically ?
> > >
> > > In general, only valid column types should be accepted in the
> > > CreateForeignTableStmt. The CreateForeignTableStmt is passed through
> > > DefineRelation, which takes care of looking up the actual data types.
> > >
> > > For the postgres_fdw POC implementation, this is done by parsing the
> > > attributes type from the query result with the regtype input functions.
> > > The
> > > attribute typmod is injected too.
> >
> > I actually meant in general. Thanks for the reply.
> >
> > So please help me understand here. How exactly does
> CreateForeignTableStmt
> > help in type compatibility?
>
> I'm not sure I understand your concern. It doesn't help in type
> compatibility,
> it is still the responsibility of the FDW to convert local types to remote
> ones.
>

Yeah, thats what I wondered. Ok, now I get it. The responsibility of FDW
shall suffice for us.

>
> The CreateForeignTableStmt defines the column, with their types. It is
> executed locally to create a new foreign table according to a remote
> description of the table. The only difference with a user-written
> CreateForeignTable statement is that the structure is crafted by the FDW
> instead of the parser.
>

Got that.

>
> > A statement may be valid on a foreign server but may not be compatible.
>
> Do you mean the CreateForeignTableStmt ? It has to be valid locally, or it
> won't be executed. It is the FDW responsibility to build this statement in
> such a way that it is valid locally.
>

Yes, I understand it now. My concerns are not valid anymore.

Thanks for the detailed description.

Regards,

Atri

--
Regards,

Atri
*l'apprenant*


From: Florian Pflug <fgp(at)phlo(dot)org>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: IMPORT FOREIGN SCHEMA statement.
Date: 2014-02-21 12:15:01
Message-ID: 61E6C22F-ED00-4952-9330-1292EB2DE477@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Feb21, 2014, at 12:09 , Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com> wrote:
>> I havent had a look at the patch yet since I dont have a nice editor right
>> now, but how do you handle inter operability between datatypes?
>> Specifically, how do you handle those datatypes which have a different name
>> from the PostgreSQL name for them and/or are stored in a different manner?
>
> For the postgres_fdw POC implementation, this is done by parsing the
> attributes type from the query result with the regtype input functions. The
> attribute typmod is injected too.

Who says that the OIDs are the same on the local and the remove postgres
instance? For user-defined types, that's certainly not going to be true...

Also, why do you aggregate the lists of columns, types and oids into arrays
when querying them from the remote server? Just producing a query that returns
one row per table column seems much simpler, both conceptually and implementation
wise.

Finally, I think there are a few missing pieces. For example, you quite easily
could copy over not-NULL flags, but you currently don't. Similarly, what about
inheritance relationships between remote tables? There's a patch in the current
CF, I believe, which adds support for inheritance to foreign tables, so all you'd
have to do is to make the foreign table's inheritance structure match the remote
table's.

best regards,
Florian Pflug


From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Florian Pflug <fgp(at)phlo(dot)org>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Subject: Re: Proposal: IMPORT FOREIGN SCHEMA statement.
Date: 2014-02-21 12:36:17
Message-ID: 3184341.7osFGsTT6L@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> Who says that the OIDs are the same on the local and the remove postgres
> instance? For user-defined types, that's certainly not going to be true...

That's why the the result is casted as regtype[], and parsed as such. The oid
is not transmitted over the wire, but set by regtype_in.

>
> Also, why do you aggregate the lists of columns, types and oids into arrays
> when querying them from the remote server? Just producing a query that
> returns one row per table column seems much simpler, both conceptually and
> implementation wise.

As I said, this is a Proof-Of-Concept. It is not meant to be a fully
functional, optimal implementation, but to serve as basis for discussion of
the API and the feature themselves.

The query could indeed be replaced by what you suggest, performing the
grouping locally. I have absolutely no opinion on which implementation is
better, this one seemed the most "natural" to me.

>
> Finally, I think there are a few missing pieces. For example, you quite
> easily could copy over not-NULL flags, but you currently don't. Similarly,
> what about inheritance relationships between remote tables? There's a patch
> in the current CF, I believe, which adds support for inheritance to foreign
> tables, so all you'd have to do is to make the foreign table's inheritance
> structure match the remote table's.

Duly noted, we could probably import NOT NULL flags, and maybe even the
table's inheritance structure. I'll look into that if the feature and the API
are deemed worthy.

Thank you for the review.

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org