Re: inherit support for foreign tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: inherit support for foreign tables
Date: 2014-12-10 05:47:54
Message-ID: CAFjFpRf1a=ZqpT69BBhq11gypNEyVqFxhkrCRL=Ry8Xo657MBg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We haven't heard anything from Horiguchi-san and Hanada-san for almost a
week. So, I am fine marking it as "ready for committer". What do you say?

On Wed, Dec 10, 2014 at 8:48 AM, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
wrote:

> Hi Ashutosh,
>
> Thanks for the review!
>
> (2014/11/28 18:14), Ashutosh Bapat wrote:
>
>> On Thu, Nov 27, 2014 at 3:52 PM, Etsuro Fujita
>> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp <mailto:fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>> wrote:
>> (2014/11/17 17:55), Ashutosh Bapat wrote:
>> Here are my review comments for patch fdw-inh-3.patch.
>>
>
> Tests
>> -------
>> 1. It seems like you have copied from testcase inherit.sql to
>> postgres_fdw testcase. That's a good thing, but it makes the
>> test quite
>> long. May be we should have two tests in postgres_fdw contrib
>> module,
>> one for simple cases, and other for inheritance. What do you say?
>>
>
> IMO, the test is not so time-consuming, so it doesn't seem worth
>> splitting it into two.
>>
>
> I am not worried about the timing but I am worried about the length of
>> the file and hence ease of debugging in case we find any issues there.
>> We will leave that for the commiter to decide.
>>
>
> OK
>
>
> Documentation
>> --------------------
>> 1. The change in ddl.sgml
>> - We will refer to the child tables as partitions, though
>> they
>> - are in every way normal <productname>PostgreSQL</>
>> tables.
>> + We will refer to the child tables as partitions, though
>> we assume
>> + that they are normal <productname>PostgreSQL</> tables.
>>
>> adds phrase "we assume that", which confuses the intention
>> behind the
>> sentence. The original text is intended to highlight the
>> equivalence
>> between "partition" and "normal table", where as the addition
>> esp. the
>> word "assume" weakens that equivalence. Instead now we have to
>> highlight
>> the equivalence between "partition" and "normal or foreign
>> table". The
>> wording similar to "though they are either normal or foreign
>> tables"
>> should be used there.
>>
>
> You are right, but I feel that there is something out of place in
>> saying that there (5.10.2. Implementing Partitioning) because the
>> procedure there has been written based on normal tables only. Put
>> another way, if we say that, I think we'd need to add more docs,
>> describing the syntax and/or the corresponding examples for
>> foreign-table cases. But I'd like to leave that for another patch.
>> So, how about the wording "we assume *here* that", instead of "we
>> assume that", as I added the following notice in the previous
>> section (5.10.1. Overview)?
>>
>> @@ -2650,7 +2669,10 @@ VALUES ('Albany', NULL, NULL, 'NY');
>> table of a single parent table. The parent table itself is
>> normally
>> empty; it exists just to represent the entire data set. You
>> should be
>> familiar with inheritance (see <xref linkend="ddl-inherit">)
>> before
>> - attempting to set up partitioning.
>> + attempting to set up partitioning. (The setup and management of
>> + partitioned tables illustrated in this section assume that each
>> + partition is a normal table. However, you can do that in a
>> similar way
>> + for cases where some or all partitions are foreign tables.)
>>
>
> This looks ok, though, I would like to see final version of the
>> document. But I think, we will leave that for committer to handle.
>>
>
> OK
>
> 2. The wording "some kind of optimization" gives vague picture.
>> May be
>> it can be worded as "Since the constraints are assumed to be
>> true, they
>> are used in constraint-based query optimization like constraint
>> exclusion for partitioned tables.".
>> + Those constraints are used in some kind of query
>> optimization such
>> + as constraint exclusion for partitioned tables (see
>> + <xref linkend="ddl-partitioning">).
>>
>
> Will follow your revision.
>>
>
> Done.
>
> Code
>> -------
>> 1. In the following change
>> +/*
>> * acquire_inherited_sample_rows -- acquire sample rows from
>> inheritance tree
>> *
>> * This has the same API as acquire_sample_rows, except that
>> rows are
>> * collected from all inheritance children as well as the
>> specified table.
>> - * We fail and return zero if there are no inheritance children.
>> + * We fail and return zero if there are no inheritance children
>> or
>> there are
>> + * inheritance children that foreign tables.
>>
>> The addition should be "there are inheritance children that *are
>> all
>> *foreign tables. Note the addition "are all".
>>
>
> Sorry, I incorrectly wrote the comment. What I tried to write is
>> "We fail and return zero if there are no inheritance children or if
>> we are not in VAC_MODE_SINGLE case and inheritance tree contains at
>> least one foreign table.".
>>
>
> You might want to use "English" description of VAC_MODE_SINGLE instead
>> of that macro in the comment, so that reader doesn't have to look up
>> VAC_MODE_SINGLE. But I think, we will leave this for the committer.
>>
>
> I corrected the comments and translated the macro into the English
> description.
>
> 2. The function has_foreign() be better named
>> has_foreign_child()? This
>>
>
> How about "has_foreign_table"?
>>
>
> has_foreign_child() would be better, since these are "children" in the
>> inheritance hierarchy and not mere "table"s.
>>
>
> Done. But I renamed it to has_foreign_children() because it sounds more
> natural at least to me.
>
> function loops through all the tableoids passed even after it
>> has found
>> a foreign table. Why can't we return true immediately after
>> finding the
>> first foreign table, unless the side effects of heap_open() on
>> all the
>> table are required. But I don't see that to be the case, since
>> these
>> tables are locked already through a previous call to
>> heap_open(). In the
>>
>
> Good catch! Will fix.
>>
>
> same function instead of argument name parentrelId may be we
>> should use
>> name parent_oid, so that we use same notation for parent and
>> child table
>> OIDs.
>>
>
> Will fix.
>>
>
> Done.
>
> 3. Regarding enum VacuumMode - it's being used only in case of
>> acquire_inherited_sample_rows(__) and that too, to check only a
>> single
>> value of the three defined there. May be we should just infer
>> that value
>> inside acquire_inherited_sample_rows(__) or pass a boolean true
>> or false
>> from do_analyse_rel() based on the VacuumStmt. I do not see need
>> for a
>> separate three value enum of which only one value is used and
>> also to
>> pass it down from vacuum() by changing signatures of the minion
>> functions.
>>
>
> I introduced that for possible future use. See the discussion in [1].
>>
>
> Will leave it for the commiter to decide.
>>
>
> I noticed that the signatures need not to be modified, as you said. Thanks
> for pointing that out! So, I revised the patch not to change the
> signatures, though I left the enum, renaming it to AnalyzeMode. Let's have
> the committer's review.
>
> 4. In postgresGetForeignPlan(), the code added by this patch is
>> required
>> to handle the case when the row mark is placed on a parent table
>> and
>> hence is required to be applied on the child table. We need a
>> comment
>> explaining this. Otherwise, the three step process to get the
>> row mark
>> information isn't clear for a reader.
>>
>
> Will add the comment.
>>
>
> Done.
>
> 5. In expand_inherited_rtentry() why do you need a separate
>> variable
>> hasForeign? Instead of using that variable, you can actually
>> set/reset
>> rte->hasForeign since existence of even a single foreign child
>> would
>> mark that member as true. - After typing this, I got the answer
>> when I
>> looked at the function code. Every child's RTE is initially a
>> copy of
>> parent's RTE and hence hasForeign status would be inherited by
>> every
>> child after the first foreign child. I think, this reasoning
>> should be
>> added as comment before assignment to rte->hasForeign at the end
>> of the
>> function.
>>
>
> As you mentioned, I think we could set rte->hasForeign directly
>> during the scan for the inheritance set, without the separate
>> variable, hasForeign. But ISTM that it'd improve code readability
>> to set rte->hasForeign using the separate variable at the end of the
>> function because rte->hasForeign has its meaning only when rte->inh
>> is true and because we know whether rte->inh is true, at the end of
>> the function.
>>
>
> Fine. Please use "hasForeignChild" instead of just "hasForeign" without
>> a clue as to what is "foreign" here.
>>
>
> Done. But I renamed it to "hasForeignChildren".
>
> 6. The tests in foreign_data.sql are pretty extensive. Thanks
>> for that.
>> I think, we should also check the effect of each of the following
>> command using \d on appropriate tables.
>> +CREATE FOREIGN TABLE ft2 () INHERITS (pt1)
>> + SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted"
>> 'value');
>> +ALTER FOREIGN TABLE ft2 NO INHERIT pt1;
>> +DROP FOREIGN TABLE ft2;
>> +CREATE FOREIGN TABLE ft2 (
>> + c1 integer NOT NULL,
>> + c2 text,
>> + c3 date
>> +) SERVER s0 OPTIONS (delimiter ',', quote '"', "be quoted"
>> 'value');
>> +ALTER FOREIGN TABLE ft2 INHERIT pt1;
>>
>
> Will fix.
>>
>
> Done.
>
> Apart from the above, I noticed that the patch doesn't consider to
>> call ExplainForeignModify during EXPLAIN for an inherited
>> UPDATE/DELETE, as shown below (note that there are no UPDATE remote
>> queries displayed):
>>
>
> Since there seems to be no objections, I updated the patch as proposed
> upthread. Here is an example.
>
> postgres=# explain (format text, verbose) update parent as p set a = a * 2
> returning *;
> QUERY PLAN
> ------------------------------------------------------------
> --------------------
> Update on public.parent p (cost=0.00..202.33 rows=11 width=10)
> Output: p.a
> For public.ft1 p_1
> Remote SQL: UPDATE public.mytable_1 SET a = $2 WHERE ctid = $1
> RETURNING a
> For public.ft2 p_2
> Remote SQL: UPDATE public.mytable_2 SET a = $2 WHERE ctid = $1
> RETURNING a
> -> Seq Scan on public.parent p (cost=0.00..0.00 rows=1 width=10)
> Output: (p.a * 2), p.ctid
> -> Foreign Scan on public.ft1 p_1 (cost=100.00..101.16 rows=5
> width=10)
> Output: (p_1.a * 2), p_1.ctid
> Remote SQL: SELECT a, ctid FROM public.mytable_1 FOR UPDATE
> -> Foreign Scan on public.ft2 p_2 (cost=100.00..101.16 rows=5
> width=10)
> Output: (p_2.a * 2), p_2.ctid
> Remote SQL: SELECT a, ctid FROM public.mytable_2 FOR UPDATE
> (14 rows)
>
> Other changes:
> * revised regression tests for contrib/file_fdw to refer to tableoid.
> * revised docs a bit further.
>
> Attached are updated patches. Patch fdw-inh-5.patch has been created on
> top of patch fdw-chk-5.patch. Patch fdw-chk-5.patch is basically the same
> as the previous one fdw-chk-4.patch, but I slightly modified that one. The
> changes are the following.
> * added to foreign_data.sql more tests for your comments.
> * revised docs on ALTER FOREIGN TABLE a bit further.
>
>
> Thanks,
>
> Best regards,
> Etsuro Fujita
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2014-12-10 06:54:13 Re: On partitioning
Previous Message Tom Lane 2014-12-10 05:17:33 Re: logical column ordering