Re: IMPORT FOREIGN SCHEMA statement

From: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: David Fetter <david(at)fetter(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-25 21:23:41
Message-ID: 3057545.cu6kBAGtju@ronan.dunklau.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le dimanche 25 mai 2014 12:41:18 David Fetter a écrit :
> On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote:
> > Hello,
> >
> > Since my last proposal didn't get any strong rebuttal, please find
> > attached a more complete version of the IMPORT FOREIGN SCHEMA statement.
>
> Thanks!
>
> Please to send future patches to this thread so people can track them
> in their mail.

I'll do.

I didn't for the previous one because it was a few months ago, and no patch
had been added to the commit fest.

>
> > I tried to follow the SQL-MED specification as closely as possible.
> >
> > This adds discoverability to foreign servers. The structure of the
> > statement as I understand it is simple enough:
> >
> > IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> > EXCEPT) table_list ] INTO local_schema.
> >
> > The import_foreign_schema patch adds the infrastructure, and a new FDW
> > routine:
> >
> > typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> > ImportForeignSchemaStmt * parsetree);
> >
> > This routine must return a list of CreateForeignTableStmt mirroring
> > whatever tables were found on the remote side, which will then be
> > executed.
> >
> > The import_foreign_schema_postgres_fdw patch proposes an implementation of
> > this API for postgres_fdw. It will import a foreign schema using the right
> > types as well as nullable information.
>
> In the case of PostgreSQL, "the right types" are obvious until there's
> a user-defined one. What do you plan to do in that case ?
>

The current implementation fetches the types as regtype, and when receiving a
custom type, two things can happen:

- the type is defined locally: everything will work as expected
- the type is not defined locally: the conversion function will fail, and
raise an error of the form: ERROR: type "schema.typname" does not exist

Should I add that to the regression test suite ?

> > Regarding documentation, I don't really know where it should have been
> > put. If I missed something, let me know and I'll try to correct it.
>
> It's not exactly something you missed, but I need to bring it up
> anyway before we go too far. The standard missed two crucial concepts
> when this part of it was written:
>
> 1. No single per-database-type universal type mapping can be correct.
>
> People will have differing goals for type mapping, and writing a whole
> new FDW for each of those goals is, to put it mildly, wasteful. I
> will illustrate with a concrete and common example.
>
> MySQL's datetime type encourages usages which PostgreSQL's
> corresponding type, timestamptz, simply disallows, namely '0000-00-00
> 00:00:00' as its idea of UNKNOWN or NULL.
>
> One way PostgreSQL's mapping could work is to map it to TEXT, which
> would preserve the strings exactly and be in some sense an identity
> map. It would also make the type somewhat useless in its original
> intended form.
>
> Another one would map the type is to a composite data type
> mysql_datetime(tstz timestamptz, is_wacky boolean) which would
> capture, for example, ('2014-04-01 00:00:00+00', false) for the UTC
> start of April Fools' Day this year, and (NULL, true) for '0000-00-00
> 00:00:00'.
>
> There are doubtless others, and there is no principled way to assign
> any one of them as universally correct.
>
> This brings me to the next crucial concept the standard missed:
>
> 2. The correct mapping may not be the identity, and furthermore, the
> inbound and outbound mappings might in general not be mere inversions
> of each other.
>
> MySQL (no aspersions intended) again provides a concrete example with
> its unsigned integer types. Yes, it's possible to create a domain
> over INT8 which simulates UINT4, a domain over NUMERIC which simulates
> UINT8, etc., but again this process's correctness depends on
> circumstances.
>
> To address these problems, I propose the following:
>
> - We make type mappings settable at the level of:
> - FDW
> - Instance (a.k.a. cluster)
> - Database
> - Schema
> - Table
> - Column
>
> using the existing ALTER command and some way of spelling out how
> a remote type maps to a local type.
>
> This would consist of:
> - The remote type
> - The local type to which it maps
> - The inbound transformation (default identity)
> - The outbound transformation (default identity)
>
> At any given level, the remote type would need to be unique. To
> communicate this to the system, we either invent new syntax, with
> all the hazards attendant thereto, or we could use JSON or similar
> serialization.
>
> ALTER FOREIGN TABLE foo
> ADD TYPE MAPPING
> FROM "datetime"
> TO TEXT
> WITH (
> INBOUND TRANSFORMATION IDENTITY,
> OUTBOUND TRANSFORMATION IDENTITY
> ) /* Ugh!!! */
>
> vs.
>
> ALTER FOREIGN TABLE foo ADD (mapping '{
> "datetime": "text",
> "inbound": "IDENTITY",
> outbound: "IDENTITY"
> }')
>
> Each FDW would have some set of default mappings and some way to
> override them as above.

I understand your points, but I'm not really comfortable with the concept,
unless there is something that I missed.

We can already support this use case through specific-fdw options. Should I
add that to postgres_fdw ?

Additionally, I don't really see how that would be useful in a general case.
With an "in-core" defined meaning of type transformation, any FDW that doesn't
fit exactly into the model would have a hard time. For example, what happens
if an FDW is only ever capable of returning text ? Or if a mapping can only be
set at the server or FDW model because it depends on some connection parameter
? The bulk of the code for managing type mappings would be FDW-specific
anyway. What you're proposing looks like a "universal option", with a specific
syntax, that should therefore be supported by all fdws, with well-defined
semantics.

Moreover, extending the spec seems a bit dangerous to me, since if the spec
decides to address this specific point in the future, there is a risk that our
behavior will not be compatible.

Thank you for your feedback,

--
Ronan Dunklau
http://dalibo.com - http://dalibo.org

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Janes 2014-05-25 21:45:38 Re: 9.4 btree index corruption
Previous Message Tom Lane 2014-05-25 20:58:39 Re: Sending out a request for more buildfarm animals?