Re: IMPORT FOREIGN SCHEMA statement

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-06-17 08:06:55
Message-ID: CAB7nPqSKMOP75HbPEddnypLd5EWz85PfQSBPHY6AaP2O0Xbrew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 17, 2014 at 4:36 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com> wrote:
> Le lundi 16 juin 2014 16:07:51 Michael Paquier a écrit :
>> On Sat, May 24, 2014 at 5:08 AM, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
>> 2) The query I am seeing on this spec offers the possiblitily to query
>> TABLE_NAME LIKE 'pattern'. Is this part of the spec as well? If yes,
>> are you planning to add it as well. I imagine that this could be quite
>> handy for users importing from a foreign schema tables that have the
>> same prefix name for example. ImportForeignSchemaRestrictionType could
>> be extended with an IMPORT_LIKE type. Extending a bit the spec...
>> There could be a list of LIKE patterns, this matches more your code by
>> adding all of them in table_names. I am not voting to add TABLE_NAME
>> in the list of reserved keywords though, so something like "TABLE LIKE
>> 'pattern'" would be nice.
>
> From the spec I have been reading, the import qualification is defined as :
>
> <table name list> ::= <table name> [ { <comma> <table name> }... ]
>
> where table name is defined as a simple table name, without any qualifier.
Indeed. Let's stick to that for simplicity, and use only LIMIT TO/EXCEPT.

>> 3) This has been already raised in this thread, but something missing
>> with this patch is the possiblity to pass options when executing an
>> import. This could allow a FDW to do more fine-grained operations on
>> the underlying objects of the table. There would be two level of
>> options: table level and schema level. For example, with the current
>> feature it is not possible to rename imported tables on-the-fly. I
>> imagine that this would be useful.
>
> The attached patch implements options.
Thanks.

> Now, we should think about what options may be desirable for postgres_fdw.
An option to include triggers in what is fetched? I am sure you know
they can be created on foreign tables now :)

> I don't understand your point about table-level and schema-level options,
> though. There is no concept of "foreign schema" allowing to set options on it,
> AFAIK.
My point is to extend items of the table list to accept options:
[ { LIMIT TO | EXCEPT } ( table_item [ , table_item ] ) ]
where table_item:
table_name [ OPTIONS ( option_item [ , option_item ] ) ]

This would make possible post-operations on the tables fetched before
the FDW create individual CreateForeignTableStmt entries in
ImportForeignSchema. In the case of postgres_fdw, a potential option
would be to allow a renaming of the table after fetching them from
remote.

> Renaming imported tables on the fly should, IMO, be done after the import
> statement. Since it is transactional, nothing prevents you from running the
> ALTER TABLE statements after that.
True as well.

>> 5) The error message returned to user when import fails because of a
>> missing type is rather unfriendly. Let's imagine with two nodes and
>> postgres_fdw... Node 1:
>> create type toto as (a int, b int);
>> create table toto_tab (a toto);
>> Node 2:
>> =# import foreign schema public from server postgres_server into public;
>> ERROR: 42704: type "public.toto" does not exist
>> LOCATION: parseTypeString, parse_type.c:797
>> I would rather imagine something like "IMPORT failed because of
>> missing type defined on remote but not locally".
>
> I simply prepended the suggested error message to the previous one:
>
> # IMPORT FOREIGN SCHEMA import_source FROM SERVER remote1 INTO
> import_destination;
> ERROR: IMPORT of table t1 failed because of missing type defined on remote but
> not locally: type "public.typ1" does not exist
> Time: 11,305 ms
Thanks for the addition.

>> 6) Import does not fail if a table specified in LIMIT TO is not
>> defined on remote-side:
>> =# import foreign schema public from server postgres_server limit to
>> (tab_not_there) into public;
>> IMPORT FOREIGN SCHEMA
>> =# \d
>> No relations found.
>
> Thanks, this is corrected in the attached patch. The error detection code is
> O(n²), but I assumed LIMIT TO clauses should be small enough for it not to be
> a problem.

>> 10) Code has many whitespaces.
>
> I hope this is corrected in the attached patch, if not, could you please point
> me at specific examples ?
>
>> 11) Sometimes the import goes strangely:
>> 11-1) After some testing I noticed that tables with incorrect names
>> were imported when using LIMIT TO. For example on remote a table
>> called "ab" is present, IMPORT SCHEMA LIMIT TO 'other_ab' created a
>> local entry called "other_ab" while it should create nothing
>> 11-2) Tables with empty names can be created locally. On foreign server:
>> =# \d
>> List of relations
>> Schema | Name | Type | Owner
>> --------+----------+-------+--------
>> public | aa | table | ioltas
>> public | toto_tab | table | ioltas
>> (2 rows)
>> On local node:
>> =# import foreign schema public from server postgres_server limit to
>> (toto_tab, "NULL", aa) into public;
>> -- (Forget about NULL here, I could reproduce it without as well)
>> IMPORT FOREIGN SCHEMA
>> Time: 13.589 ms
>> =# \d
>> List of relations
>> Schema | Name | Type | Owner
>> --------+----------+---------------+--------
>> public | | foreign table | ioltas
>> public | toto_tab | foreign table | ioltas
>> (2 rows)
>> Some more testing showed me that as well:
>> =# \d
>>
>> List
>> of relations
>> Schema |
>> Name
>>
>> | Type |
>>
>> Owner
>> --------+-------------------------------------------------------------------
>> ----------------------------------------------------------------------------
>> ----------------------------------------------------------------------------
>> -----------------------------------+---------------+-------- public |
>>
>> | foreign table |
>>
>> ioltas
>> public | toto_tab
>>
>> | foreign table |
>>
>> ioltas
>> public |F\x7F\x7
>> F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7
>> F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7
>> F\x7F\x7F\x7F\x7F\x7F\x7F
>> | foreign table | ioltas
>> \x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7F\x7
>
> This seems to be related to re-using the table-name between invocations. The
> attached patch should fix point 2. As for point 1, I don't know the cause for
> it. Do you have a reproducible test case ?
Sure. I'll try harder when looking in more details at the patch for
postgres_fdw :)

Here are extra comments for the core patch:
- Patch has no whitespaces, compiles without warnings. Regression tests pass.
- indentation is not correct in the documentation, 1 space is missing
for the block <para></para> ;) This block is also strangely formatted.
Looking at the other FDW APIs documented, you could use to <para>
blocks: one for the input data (server name, etc), a second for the
output result (which is the list of CreateForeignTableStmt created).
Please check as well spaces between blocks.
- Instead of remote "schema", let's say simply foreign schema.
Renaming the documentation section to "FDW Routine For Importing
Foreign Schemas" would be more consistent.
- Documentation refers to postgresImportForeignSchema, which is an
incorrect name. It should be ImportForeignSchema. The arguments of the
API listed in the docs are incorrect as well.
- Once exiting from the call of fdw_routine->ImportForeignSchema, it
would be check the types of the items returned. An assertion on
IsA(stmt, CreateForeignTableStmt) would be fine I think.
- Renaming ImportForeignSchemaRestrictionType to
ImportForeignSchemaType would be nice
- Both ImportForeignSchemaRestriction and ImportForeignSchemaStmt can
contain the table name list. Only one is necessary.
- The API of ImportForeignSchema is very rough now, I don't think that
it should expose the parse tree of the IMPORT FOREIGN SCHEMA command
as it is the case now. Exposing the server makes sense as the table(s)
is(are) not created yet though. Necessary information to give to the
FDW is actually:
-- server information
-- Import type (except, limit to, all)
-- list of table names
-- FDW options
-- remote schema
Hence the following API would make more sense IMO:
ImportForeignSchema(ForeignServer *server,
const char *remote_schema,
ImportForeignSchemaRestrictionType importType,
List *table_names,
List *options)

I'll look into the patch for postgres_fdw later.
Regards,
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2014-06-17 08:40:46 Re: How to implement the skip errors for copy from ?
Previous Message Jeevan Chalke 2014-06-17 07:51:34 Re: pg_dump reporing version of server & pg_dump as comments in the output