Re: [v9.5] Custom Plan API

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Stephen Frost" <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-09-11 15:24:07
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8FDA774@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Thu, Sep 4, 2014 at 7:57 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > Regarding to the attached three patches:
> > [1] custom-path and hook
> > It adds register_custom_path_provider() interface for registration
> > of custom-path entrypoint. Callbacks are invoked on
> > set_plain_rel_pathlist to offer alternative scan path on regular
> relations.
> > I may need to explain the terms in use. I calls the path-node
> > custom-path that is the previous step of population of plan-node
> > (like custom-scan and potentially custom-join and so on). The node
> > object created by
> > CreateCustomPlan() is called custom-plan because it is abstraction
> > for all the potential custom-xxx node; custom-scan is the first of all.
>
> I don't think it's a good thing that add_custom_path_type is declared
> as void (*)(void *) rather than having a real type. I suggest we add
> the path-creation callback function to CustomPlanMethods instead, like
> this:
>
> void (*CreateCustomScanPath)(PlannerInfo *root, RelOptInfo *baserel,
> RangeTblEntry *rte);
>
> Then, register_custom_path_provider() can just take CustomPathMethods
> * as an argument; and create_customscan_paths can just walk the list
> of CustomPlanMethods objects and call CreateCustomScanPath for each
> one where that is non-NULL. This conflates the path generation
> mechanism with the type of path getting generated a little bit, but I
> don't see any real downside to that. I don't see a reason why you'd
> ever want two different providers to offer the same type of custompath.
>
It seems to me the design you suggested is smarter than the original one.
The first patch was revised according to the design.

> Don't the changes to src/backend/optimizer/plan/createplan.c belong in
> patch #2?
>
The borderline between #1 and #2 is little bit bogus. So, I moved most of
portion into #1, however, invocation of InitCustomScan (that is a callback
in CustomPlanMethod) in create_custom_plan() is still in #2.

> > [2] custom-scan node
> > It adds custom-scan node support. The custom-scan node is expected
> > to generate contents of a particular relation or sub-plan according
> > to its custom-logic.
> > Custom-scan provider needs to implement callbacks of
> > CustomScanMethods and CustomExecMethods. Once a custom-scan node is
> > populated from custom-path node, the backend calls back these
> > methods in the planning and execution stage.
>
> It looks to me like this patch is full of holdovers from its earlier
> life as a more-generic CustomPlan node. In particular, it contains
> numerous defenses against the case where scanrelid != 0. These are
> confusingly written as scanrelid > 0, but I think really they're just bogus altogether:
> if this is specifically a CustomScan, not a CustomPlan, then the relid
> should always be filled in. Please consider what can be simplified here.
>
OK, I revised. Now custom-scan assumes it has a particular valid relation
to be scanned, so no code path with scanrelid == 0 at this moment.

Let us revisit this scenario when custom-scan replaces relation-joins.
In this case, custom-scan will not be associated with a particular base-
relation, thus it needs to admit a custom-scan node with scanrelid == 0.

> The comment in _copyCustomScan looks bogus to me. I think we should
> *require* a static method table.
>
OK, it was fixed to copy the pointer of function table; not table itself.

> In create_custom_plan, you if (IsA(custom_plan, CustomScan)) { lots of
> stuff; } else elog(ERROR, ...). I think it would be clearer to write
> if (!IsA(custom_plan, CustomScan)) elog(ERROR, ...); lots of stuff;
>
Fixed.

> > Also, regarding to the use-case of multi-exec interface.
> > Below is an EXPLAIN output of PG-Strom. It shows the custom
> > GpuHashJoin has two sub-plans; GpuScan and MultiHash.
> > GpuHashJoin is stacked on the GpuScan. It is a case when these nodes
> > utilize multi-exec interface for more efficient data exchange
> > between
> the nodes.
> > GpuScan already keeps a data structure that is suitable to send
> > to/recv from GPU devices and constructed on the memory segment being
> > DMA
> available.
> > If we have to form a tuple, pass it via row-by-row interface, then
> > deform it, it will become a major performance degradation in this
> > use
> case.
> >
> > postgres=# explain select * from t10 natural join t8 natural join t9
> > where
> x < 10;
> > QUERY PLAN
> >
> ----------------------------------------------------------------------
> > ------------------------- Custom (GpuHashJoin)
> > (cost=10979.56..90064.15 rows=333 width=49)
> > pseudo scan tlist: 1:(t10.bid), 3:(t10.aid), 4:<t10.x>,
> > 2:<t8.data>,
> 5:[t8.aid], 6:[t9.bid]
> > hash clause 1: ((t8.aid = t10.aid) AND (t9.bid = t10.bid))
> > -> Custom (GpuScan) on t10 (cost=10000.00..88831.26
> > rows=3333327
> width=16)
> > Host References: aid, bid, x
> > Device References: x
> > Device Filter: (x < 10::double precision)
> > -> Custom (MultiHash) (cost=464.56..464.56 rows=1000 width=41)
> > hash keys: aid, bid
> > -> Hash Join (cost=60.06..464.56 rows=1000 width=41)
> > Hash Cond: (t9.data = t8.data)
> > -> Index Scan using t9_pkey on t9
> > (cost=0.29..357.29
> rows=10000 width=37)
> > -> Hash (cost=47.27..47.27 rows=1000 width=37)
> > -> Index Scan using t8_pkey on t8
> > (cost=0.28..47.27 rows=1000 width=37) Planning time: 0.810 ms
> > (15 rows)
>
> Why can't the Custom(GpuHashJoin) node build the hash table internally
> instead of using a separate node?
>
It's possible, however, it prevents to check sub-plans using EXPLAIN if we
manage inner-plans internally. So, I'd like to have a separate node being
connected to the inner-plan.

> Also, for this patch we are only considering custom scan. Custom join
> is another patch. We don't need to provide infrastructure for that
> patch in this one.
>
OK, let me revisit it on the next stage, with functionalities above.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Attachment Content-Type Size
pgsql-v9.5-custom-scan.part-3.v9.patch application/octet-stream 45.0 KB
pgsql-v9.5-custom-scan.part-2.v9.patch application/octet-stream 45.2 KB
pgsql-v9.5-custom-scan.part-1.v9.patch application/octet-stream 11.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-09-11 15:28:19 Re: bad estimation together with large work_mem generates terrible slow hash joins
Previous Message Robert Haas 2014-09-11 15:21:53 Re: bad estimation together with large work_mem generates terrible slow hash joins