Re: [v9.5] Custom Plan API

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-11-24 11:57:43
Message-ID: 9A28C8860F777E439AA12E8AEA7694F801087186@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> writes:
> > Let me explain my current idea.
> > The CustomScan node will have a field that holds varnode mapping
> > information, constructed by the custom-scan provider at
> > create_customscan_plan() time if it wants one. It is probably a list
> > of varnodes. If it exists, setrefs.c changes its behavior: it updates
> > the varno/varattno of each varnode according to this mapping, much as
> > set_join_references() does based on an indexed_tlist.
> > To reference ecxt_scantuple, INDEX_VAR is probably the best choice
> > for the varno of these varnodes, with the index into the above mapping
> > list as the varattno. That can also be used to produce EXPLAIN output,
> > instead of the GetSpecialCustomVar hook.
>
> > So, steps to go may be:
> > (1) Add custom_private, custom_exprs, ... instead of self-defined
> > data types based on CustomXXX.
> > (2) Get rid of the SetCustomScanRef and GetSpecialCustomVar hooks for
> > the current custom-"scan" support.
> > (3) Integrate the above varnode mapping feature into the upcoming
> > join replacement by custom-scan support.
>
> Well ... I still do not find this interesting, because I don't believe that
> CustomScan is a solution to anything interesting. It's difficult enough
> to solve problems like expensive-function pushdown within the core code;
> why would we tie one hand behind our backs by insisting that they should
> be solved by extensions? And as I mentioned before, we do need solutions
> to these problems in the core, regardless of CustomScan.
>
I'd like to split the "anything interesting" into two portions.
As you pointed out, the feature to push down complicated expressions
may need a fairly large effort (at least relative to the two remaining
commit-fests). However, what the feature to replace a join with a
custom-scan requires is similar to the job of set_join_references(),
because it never involves translation between varnodes and general
expressions.

Also, from my standpoint, simple join replacement by custom-scan has
the higher priority; join acceleration in v9.5 makes sense even if the
full functionality of pushing down general expressions is not supported
yet.
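
To illustrate the scale of what I have in mind, here is a minimal,
untested sketch of how that remapping might look if it lived in
setrefs.c and reused the existing helpers there (build_tlist_index()
and fix_upper_expr()). The custom_ps_tlist field name is only an
assumption for this example, not a settled interface:

/*
 * Hypothetical sketch, not a patch: remap Vars referenced by a CustomScan
 * onto INDEX_VAR, in the same spirit as set_join_references() does with
 * an indexed_tlist.  "custom_ps_tlist" is an assumed field holding the
 * pseudo-scan target list built by the custom-scan provider at
 * create_customscan_plan() time.
 */
static void
set_customscan_references(PlannerInfo *root, CustomScan *cscan, int rtoffset)
{
    indexed_tlist *pscan_itlist = build_tlist_index(cscan->custom_ps_tlist);

    /* replace matching Vars with (INDEX_VAR, resno) references */
    cscan->scan.plan.targetlist = (List *)
        fix_upper_expr(root,
                       (Node *) cscan->scan.plan.targetlist,
                       pscan_itlist,
                       INDEX_VAR,
                       rtoffset);
    cscan->scan.plan.qual = (List *)
        fix_upper_expr(root,
                       (Node *) cscan->scan.plan.qual,
                       pscan_itlist,
                       INDEX_VAR,
                       rtoffset);
    cscan->custom_exprs = (List *)
        fix_upper_expr(root,
                       (Node *) cscan->custom_exprs,
                       pscan_itlist,
                       INDEX_VAR,
                       rtoffset);

    pfree(pscan_itlist);
}

EXPLAIN could then resolve those INDEX_VAR references against the same
pseudo-scan tlist, which is why I think the GetSpecialCustomVar hook
becomes unnecessary.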

> I think that a useful way to go at this might be to think first about how
> to make use of expensive functions that have been cached in indexes, and
> then see how the solution to that might translate to pushing down expensive
> functions into FDWs and CustomScans. If you start with the CustomScan
> aspect of it then you immediately find yourself trying to design APIs to
> divide up the solution, which is premature when you don't even know what
> the solution is.
>
Yep, it also seems to me that the two remaining commit-fests are too
tight a schedule to reach consensus on the overall design and implement
it. I'd like to focus on the simpler portion first.

> The rough idea I'd had about this is that while canvassing a relation's
> indexes (in get_relation_info), we could create a list of precomputed
> expressions that are available from indexes, then run through the query
> tree and replace any matching subexpressions with some Var-like nodes (or
> maybe better PlaceHolderVar-like nodes) that indicate that "we can get this
> expression for free if we read the right index".
> If we do read the right index, such an expression reduces to a Var in the
> finished plan tree; if not, it reverts to the original expression.
> (Some thought would need to be given to the semantics when the index's table
> is underneath an outer join --- that may just mean that we can't necessarily
> replace every textually-matching subexpression, only those that are not
> above an outer join.) One question mark here is how to do the "replace
> any matching subexpressions" bit without O(lots) processing cost in big
> queries. But that's probably just a SMOP. The bigger issue I fear is that
> the planner is not currently structured to think that evaluation cost of
> expressions in the SELECT list has anything to do with which Path it should
> pick. That is tied to the handwaving I've been doing for awhile now about
> converting all the upper-level planning logic into
> generate-and-compare-Paths style; we certainly cannot ignore tlist eval
> costs while making those decisions. So at least for those upper-level Paths,
> we'd have to have a notion of what tlist we expect that plan level to compute,
> and charge appropriate evaluation costs.
>
Let me investigate the planner code further before commenting on this...

> So there's a lot of work there and I don't find that CustomScan looks like
> a solution to any of it. CustomScan and FDWs could benefit from this work,
> in that we'd now have a way to deal with the concept that expensive functions
> (and aggregates, I hope) might be computed at the bottom scan level. But
> it's folly to suppose that we can make it work just by hacking some
> arms-length extension code without any fundamental planner changes.
>
Indeed, I don't think it is a good idea to start from this harder portion.
Let's focus on just the varno/varattno remapping needed to replace a join
relation with a custom-scan, as the immediate target.
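
For the provider side, under the same assumptions as the sketch above,
building that mapping could be as simple as collecting the Vars that the
replaced join emits into a pseudo-scan target list, whose positions then
serve as the varattno of the INDEX_VAR references. The helper below is
hypothetical and only meant to show the shape of the data:

#include "postgres.h"
#include "nodes/makefuncs.h"
#include "nodes/primnodes.h"

/*
 * Hypothetical helper: build the pseudo-scan target list (varnode
 * mapping) from the target list of the join being replaced.  Each
 * simple Var gets one entry; its resno becomes the varattno of the
 * corresponding INDEX_VAR reference after setrefs.c processing.
 */
static List *
build_pseudo_scan_tlist(List *join_tlist)
{
    List       *ps_tlist = NIL;
    ListCell   *lc;
    AttrNumber  resno = 1;

    foreach(lc, join_tlist)
    {
        TargetEntry *tle = (TargetEntry *) lfirst(lc);

        /* for now, only simple Vars participate in the mapping */
        if (!IsA(tle->expr, Var))
            continue;

        ps_tlist = lappend(ps_tlist,
                           makeTargetEntry((Expr *) copyObject(tle->expr),
                                           resno++,
                                           NULL,
                                           false));
    }
    return ps_tlist;
}

Anything beyond simple Vars would keep going through the ordinary
expression machinery, which is the harder portion I'd like to defer.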

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
