Re: [v9.5] Custom Plan API

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: [v9.5] Custom Plan API
Date: 2014-05-07 01:05:47
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9E7B1@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

Prior to the development cycle towards v9.5, I'd like to reopen
the discussion of the custom-plan interface. Even though we discussed
it at length during the last three commit-fests, several issues
remain unresolved. So, I'd like to settle the direction of
the implementation prior to the first commit-fest.

(1) DDL support and system catalog

Simon suggested that a DDL command should be supported to track the
custom-plan providers being installed, and to avoid pointless hook calls
in cases where obviously no custom-plan provider can help. It also
makes sense to give extensions a chance to be loaded once installed.
(In the previous design, I assumed modules were loaded by the LOAD command
or the *_preload_libraries parameters.)

I tried to implement the following syntax:

CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;

It records a particular function as the entrypoint of a custom-plan
provider; that function is then called when the planner tries to find the
best path to scan or join relations. The function takes one argument (of
INTERNAL type) that packs the information needed to construct and register
an alternative scan/join path, such as PlannerInfo, RelOptInfo and so on.

(*) The data structure below will be supplied in the case of a scan path.
    typedef struct {
        uint32          custom_class;
        PlannerInfo    *root;
        RelOptInfo     *baserel;
        RangeTblEntry  *rte;
    } customScanArg;

This function, usually implemented in C, can construct a custom object
derived from the CustomPath type, which contains a set of function
pointers, including ones that populate further objects derived from
CustomPlan or CustomPlanState, as I did in the patch during the v9.4
development cycle.
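
Just to illustrate the intended calling convention, here is a minimal
sketch of such a handler. PG_GETARG_POINTER, makeNode and add_path are
ordinary backend idioms, but CUSTOM_CLASS_SCAN and the exact CustomPath
fields are placeholders for this example, since they belong to the
proposed patch rather than to released code:

    /* SQL-callable entrypoint registered by CREATE CUSTOM PLAN ... HANDLER */
    Datum
    my_scan_handler(PG_FUNCTION_ARGS)
    {
        customScanArg *arg = (customScanArg *) PG_GETARG_POINTER(0);
        CustomPath    *cpath;

        if (arg->custom_class != CUSTOM_CLASS_SCAN)  /* hypothetical class id */
            PG_RETURN_VOID();

        cpath = makeNode(CustomPath);                /* proposed node type */
        cpath->path.pathtype = T_CustomPlan;         /* proposed node tag */
        cpath->path.parent = arg->baserel;
        /* ... fill in cost estimates and the provider's callbacks here ... */

        add_path(arg->baserel, &cpath->path);        /* register the path */
        PG_RETURN_VOID();
    }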

Properties of individual custom-plan providers are recorded in the
pg_custom_plan system catalog. Right now, its definition is quite simple:
only a superuser can create or drop custom-plan providers, and a provider's
definition does not belong to a particular namespace.
Because of this assumption (only a superuser can touch it), I did not put
a database ACL mechanism here.
What other characteristics should it have?

(2) Static functions to be exported

Tom was concerned that the custom-plan API needs several key functions
to be callable by extensions, even though these are declared as static
functions; exporting them makes them look like part of the interface.
Once people regard them as stable interfaces usable across version
upgrades, they may become a barrier to future improvement of the core
code. Is that the right understanding?

One solution is to write a clear notice, like: "these exported functions
are not stable interfaces, so extensions should not assume they remain
available across future versions".

Nevertheless, more stable functions are kinder to authors of extensions,
so I tried a few approaches.

First of all, we sorted the functions into three categories:
(A) Functions that walk a plan/expression tree recursively.
(B) Functions that modify internal state of the core backend.
(C) Functions that are commonly useful but live in a particular source file.

Although the number of functions is not large, (A) and (B) must be callable
from extensions. If they are unavailable, an extension has to maintain its
own slightly enhanced copy of the code, a burden similar to just forking
the tree.
Examples of (A): create_plan_recurse, set_plan_refs, ...
Examples of (B): fix_expr_common, ...

On the other hand, the (C) functions are helpful if available, but
exporting them is not a mandatory requirement.

Our first trial, following Tom's suggestion, was to investigate a common
walker function over plan trees, like the one we already have for
expression trees. We expected we could hand extensions function pointers
to the key routines, instead of exporting the static functions.
However, it didn't work well, because the existing recursive code performs
a different kind of job for each plan-node type, so it doesn't fit the
structure of a walker function, which applies a uniform operation to each
node.

Note that I assumed a walker function like the following, which applies
plan_walker or expr_walker to the underlying plan/expression trees.

    bool
    plan_tree_walker(Plan *plan,
                     bool (*plan_walker) (),
                     bool (*expr_walker) (),
                     void *context)

Please tell me if this differs from your idea; I'll reconsider it.
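
For instance, I expect extensions would consume it in the usual walker
style, roughly as below; my_plan_walker, my_expr_walker and my_context
are of course made-up names, and only expression_tree_walker is an
existing routine:

    typedef struct { int nnodes; } my_context;

    static bool
    my_expr_walker(Node *node, void *context)
    {
        if (node == NULL)
            return false;
        /* uniform per-expression work goes here */
        return expression_tree_walker(node, my_expr_walker, context);
    }

    static bool
    my_plan_walker(Plan *plan, void *context)
    {
        if (plan == NULL)
            return false;
        ((my_context *) context)->nnodes++;    /* uniform per-node work */
        /* recurse into child plans and expressions, per the
         * expression_tree_walker convention */
        return plan_tree_walker(plan, my_plan_walker, my_expr_walker, context);
    }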

Next, I tried another approach that passes function pointers to the (A)
and (B) functions as part of the custom-plan interface.
It is at least workable; however, it seems to me this interface definition
has little advantage over the original approach.

For example, below is the definition of the callback in setrefs.c.

    + void (*SetCustomPlanRef)(PlannerInfo *root,
    +                          CustomPlan *custom_plan,
    +                          int rtoffset,
    +                          Plan *(*fn_set_plan_refs)(PlannerInfo *root,
    +                                                    Plan *plan,
    +                                                    int rtoffset),
    +                          void (*fn_fix_expr_common)(PlannerInfo *root,
    +                                                     Node *node));

An extension needs set_plan_refs() and fix_expr_common() at least, so I
added function pointers for them. But this definition has to be updated
whenever these functions change in the future. It does not seem to me a
proper way to cushion the impact of future internal changes.
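
For reference, the extension side would then look roughly like the sketch
below. MyCustomPlan and its fields are hypothetical; the point is only
that recursion and backend-state fixups go through the supplied pointers
rather than through the static functions directly:

    static void
    my_set_custom_plan_ref(PlannerInfo *root,
                           CustomPlan *custom_plan,
                           int rtoffset,
                           Plan *(*fn_set_plan_refs)(PlannerInfo *root,
                                                     Plan *plan,
                                                     int rtoffset),
                           void (*fn_fix_expr_common)(PlannerInfo *root,
                                                      Node *node))
    {
        MyCustomPlan *cplan = (MyCustomPlan *) custom_plan;

        /* let the core code recurse into our sub-plan */
        cplan->subplan = fn_set_plan_refs(root, cplan->subplan, rtoffset);

        /* let the core code fix up state for our private expressions */
        fn_fix_expr_common(root, (Node *) cplan->custom_exprs);
    }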

So, I'd like to find good common ground to settle this matter.

One idea is the first, simple solution: core PostgreSQL is developed
independently of out-of-tree modules, so we don't guarantee the stability
of internal function declarations, even when they are exported across
multiple source files. (I believe this is our usual practice.)

Another idea is a refactoring of the core backend to consolidate routines
per plan node, rather than per processing stage. For example, createplan.c
contains most of the code commonly needed to create plans, in addition to
the code for individual plan nodes.
If a function like create_seqscan_plan() were located in a separate source
file, the routines that need to be exported would become clear.
One expected disadvantage is that this refactoring would complicate
back-patching.

Do you have any other ideas for implementing this well?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: Kohei KaiGai [mailto:kaigai(at)kaigai(dot)gr(dot)jp]
> Sent: Tuesday, April 29, 2014 10:07 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Tom Lane; Andres Freund; Robert Haas; Simon Riggs; PgHacker; Stephen
> Frost; Shigeru Hanada; Jim Mlodgenski; Peter Eisentraut
> Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
>
> >> Yeah. I'm still not exactly convinced that custom-scan will ever
> >> allow independent development of new plan types (which, with all due
> >> respect to Robert, is what it was being sold as last year in Ottawa).
> >> But I'm not opposed in principle to committing it, if we can find a
> >> way to have a cleaner API for things like setrefs.c. It seems like
> >> late-stage planner processing in general is an issue for this patch
> >> (createplan.c and subselect.c are also looking messy). EXPLAIN isn't
> too great either.
> >>
> >> I'm not sure exactly what to do about those cases, but I wonder
> >> whether things would get better if we had the equivalent of
> >> expression_tree_walker/mutator capability for plan nodes. The state
> >> of affairs in setrefs and subselect, at least, is a bit reminiscent
> >> of the bad old days when we had lots of different bespoke code for
> >> traversing expression trees.
> >>
> > Hmm. If we have something like expression_tree_walker/mutator for plan
> > nodes, we can pass a walker/mutator function's pointer instead of
> > exposing static functions that takes recursive jobs.
> > If custom-plan provider (that has sub-plans) got a callback with
> > walker/ mutator pointer, all it has to do for sub-plans are calling
> > this new plan-tree walking support routine with supplied walker/mutator.
> > It seems to me more simple design than what I did.
> >
> I tried to code similar walker/mutator functions for plan-node trees;
> however, these routines could not be implemented simply enough, because
> the job of the walker/mutator functions is not uniform, so the caller
> side must also have large switch-case branches.
>
> I picked up setrefs.c for my investigation.
> set_plan_refs() applies fix_scan_list() to the expression trees appearing
> in a plan node if it is derived from Scan; however, it applies
> set_join_references() for subclasses of Join, or
> set_dummy_tlist_references() for some other plan nodes.
> This implies that a walker/mutator for Plan nodes has to apply a
> different operation according to the type of the Plan node. I'm not
> certain how many different forms are needed.
> (In addition, set_plan_refs() usually performs like a walker, but often
> performs as a mutator for trivial subqueries....)
>
> I'm expecting a function like the one below. It calls the plan_walker
> function for each plan node, and the expr_walker function for each
> expression node on the plan node.
>
> bool
> plan_tree_walker(Plan *plan,
>                  bool (*plan_walker) (),
>                  bool (*expr_walker) (),
>                  void *context)
>
> I'd like to hear whether there is some other form in which to implement
> this routine.
>
>
> One alternative idea to give a custom-plan provider a chance to handle
> its subplans is to pass it function pointers (1) to handle recursion of
> the plan tree and (2) to set up the backend's internal state.
> In the case of setrefs.c, set_plan_refs() and fix_expr_common() are the
> minimum necessity for extensions. It also removes the need to export
> static functions.
>
> What are your thoughts?
> --
> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment Content-Type Size
pgsql-v9.5-custom-plan-with-ctidscan.v0.patch application/octet-stream 136.2 KB

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 06:18:51
Message-ID: CA+U5nMKZua6TyNL29RCN0X+isdUiZ9o5BrYW9xaNd3FvWy9bhA@mail.gmail.com
Lists: pgsql-hackers

On 7 May 2014 02:05, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> Prior to the development cycle towards v9.5, I'd like to reopen
> the discussion of the custom-plan interface. Even though we discussed
> it at length during the last three commit-fests, several issues
> remain unresolved. So, I'd like to settle the direction of
> the implementation prior to the first commit-fest.
>
> (1) DDL support and system catalog
>
> Simon suggested that a DDL command should be supported to track the
> custom-plan providers being installed, and to avoid pointless hook calls
> in cases where obviously no custom-plan provider can help. It also
> makes sense to give extensions a chance to be loaded once installed.
> (In the previous design, I assumed modules were loaded by the LOAD command
> or the *_preload_libraries parameters.)
>
> I tried to implement the following syntax:
>
> CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;

Thank you for exploring that thought and leading the way on this
research. I've been thinking about this also.

What I think we need is a declarative form that expresses the linkage
between base table(s) and related data structures that can be used
to optimize a query, while still providing accurate results.

In other DBMS, we have concepts such as a JoinIndex or a MatView which
allow some kind of lookaside behaviour. Just for clarity, a concrete
example is Oracle's Materialized Views which can be set using ENABLE
QUERY REWRITE so that the MatView can be used as an alternative path
for a query. We do already have this concept in PostgreSQL, where an
index can be used to perform an IndexOnlyScan rather than accessing
the heap itself.

We have considerable evidence that the idea of alternate data
structures results in performance gains.
* KaiGai's work - https://wiki.postgresql.org/wiki/PGStrom
* http://www.postgresql.org/message-id/52C59858.9090500@garret.ru
* http://citusdata.github.io/cstore_fdw/
* University of Manchester - exploring GPUs as part of the AXLE project
* Barcelona SuperComputer Centre - exploring FPGAs, as part of the AXLE project
* Some other authors have also cited gains using GPU technology in databases

So I would like to have a mechanism that provides a *generic*
Lookaside for a table or foreign table.

Tom and Kevin have previously expressed that MatViews would represent
a planning problem, in the general case. One way to solve that
planning issue is to link structures directly together, in the same
way that an index and a table are linked. We can then process the
lookaside in the same way we handle a partial index - check
prerequisites and if usable, calculate a cost for the alternate path.
We need not add planning time other than to the tables that might
benefit from that.

Roughly, I'm thinking of this...

CREATE LOOKASIDE ON foo
TO foo_mat_view;

and also this...

CREATE LOOKASIDE ON foo
TO foo_as_a_foreign_table /* e.g. PGStrom */

This would allow the planner to consider alternate plans for foo_mv
during set_plain_rel_pathlist(), similarly to the way it considers
index paths, in the common case where the mat view covers just
one table.
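
To make the analogy concrete, here is a sketch of where that could slot
in; the existing calls are abbreviated from allpaths.c, while
consider_lookaside_paths() is a name invented just for this sketch:

    static void
    set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                           RangeTblEntry *rte)
    {
        /* existing behaviour: seqscan plus applicable index/TID paths */
        add_path(rel, create_seqscan_path(root, rel, NULL));
        create_index_paths(root, rel);
        create_tidscan_paths(root, rel);

        /* new: add paths for any lookaside declared on this relation,
         * validated and costed like a partial index */
        consider_lookaside_paths(root, rel, rte);
    }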

This concept is similar to ENABLE QUERY REWRITE in Oracle, but this
thought goes much further, to include any generic user-defined data
structure or foreign table.

Do we need this? For MVs, we *might* be able to deduce that the MV is
rewritable for "foo", but that is not deducible for Foreign Tables, by
current definition, so I prefer the explicit definition of objects
that are linked - since doing this for indexes is already familiar to
people.

Having an explicit linkage between data structures allows us to
enhance an existing application by transparently adding new
structures, just as we already do with indexes. Specifically, we would
allow more than one lookaside structure on any one table.

Forget the exact name, that's not important. But I think the
requirements here are...

* Explicit definition that we are attaching an alternate path onto a
table (conceptually similar to adding an index)

* Ability to check that the alternate path is viable (similar to the
way we validate use of partial indexes prior to usage):
checks on columns (SELECT), rows (WHERE), aggregations (GROUP)

* Ability to consider access cost for both normal table and alternate
path (like an index) - this allows the alternate path to *not* be
chosen when we are performing some operation that is sub-optimal (for
whatever reason).

* There may be some need to define operator classes that are
implemented via the alternate path

which works for single tables, but a later requirement would then be

* allows the join of one or more tables to be replaced with a single lookaside

Hopefully, we won't need a "Custom Plan" at all, just the ability to
lookaside when useful.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Andres Freund" <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 07:17:02
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9E93E@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> On 7 May 2014 02:05, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > Prior to the development cycle towards v9.5, I'd like to reopen
> > the discussion of the custom-plan interface. Even though we discussed
> > it at length during the last three commit-fests, several issues
> > remain unresolved. So, I'd like to settle the direction of
> > the implementation prior to the first commit-fest.
> >
> > (1) DDL support and system catalog
> >
> > Simon suggested that a DDL command should be supported to track the
> > custom-plan providers being installed, and to avoid pointless hook calls
> > in cases where obviously no custom-plan provider can help. It also
> > makes sense to give extensions a chance to be loaded once installed.
> > (In the previous design, I assumed modules were loaded by the LOAD command
> > or the *_preload_libraries parameters.)
> >
> > I tried to implement the following syntax:
> >
> > CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;
>
> Thank you for exploring that thought and leading the way on this research.
> I've been thinking about this also.
>
> What I think we need is a declarative form that expresses the linkage between
> base table(s) and related data structures that can be used to optimize
> a query, while still providing accurate results.
>
> In other DBMS, we have concepts such as a JoinIndex or a MatView which allow
> some kind of lookaside behaviour. Just for clarity, a concrete example is
> Oracle's Materialized Views which can be set using ENABLE QUERY REWRITE
> so that the MatView can be used as an alternative path for a query. We do
> already have this concept in PostgreSQL, where an index can be used to
> perform an IndexOnlyScan rather than accessing the heap itself.
>
> We have considerable evidence that the idea of alternate data structures
> results in performance gains.
> * KaiGai's work - https://wiki.postgresql.org/wiki/PGStrom
> * http://www.postgresql.org/message-id/52C59858.9090500@garret.ru
> * http://citusdata.github.io/cstore_fdw/
> * University of Manchester - exploring GPUs as part of the AXLE project
> * Barcelona SuperComputer Centre - exploring FPGAs, as part of the AXLE
> project
> * Some other authors have also cited gains using GPU technology in databases
>
> So I would like to have a mechanism that provides a *generic* Lookaside
> for a table or foreign table.
>
> Tom and Kevin have previously expressed that MatViews would represent a
> planning problem, in the general case. One way to solve that planning issue
> is to link structures directly together, in the same way that an index and
> a table are linked. We can then process the lookaside in the same way we
> handle a partial index - check prerequisites and if usable, calculate a
> cost for the alternate path.
> We need not add planning time other than to the tables that might benefit
> from that.
>
> Roughly, I'm thinking of this...
>
> CREATE LOOKASIDE ON foo
> TO foo_mat_view;
>
> and also this...
>
> CREATE LOOKASIDE ON foo
> TO foo_as_a_foreign_table /* e.g. PGStrom */
>
> This would allow the planner to consider alternate plans for foo_mv during
> set_plain_rel_pathlist(), similarly to the way it considers index paths,
> in the common case where the mat view covers just one table.
>
> This concept is similar to ENABLE QUERY REWRITE in Oracle, but this thought
> goes much further, to include any generic user-defined data structure or
> foreign table.
>
Let me clarify: this mechanism allows adding alternative scan/join paths,
including built-in ones, not only custom enhanced plan/exec nodes, doesn't it?
It is probably a variation of the above proposition if we install a handler
function that proposes built-in path nodes in response to a scan/join request.

> Do we need this? For MVs, we *might* be able to deduce that the MV is
> rewritable for "foo", but that is not deducible for Foreign Tables, by
> current definition, so I prefer the explicit definition of objects that
> are linked - since doing this for indexes is already familiar to people.
>
> Having an explicit linkage between data structures allows us to enhance
> an existing application by transparently adding new structures, just as
> we already do with indexes. Specifically, we would allow more than one
> lookaside structure on any one table.
>
Not only alternative data structures: an alternative method of scan/join
against the same data structure is also important, isn't it?

> Forget the exact name, that's not important. But I think the requirements
> here are...
>
> * Explicit definition that we are attaching an alternate path onto a table
> (conceptually similar to adding an index)
>
I think the syntax should allow "tables", not only a particular table.
It would inform the core planner that this lookaside/custom-plan (the name
is not important, anyway) can provide an alternative path for the set of
relations being considered. So, it would reduce the number of function
calls at the planning stage.

> * Ability to check that the alternate path is viable (similar to the way
> we validate use of partial indexes prior to usage)
> Checks on columns(SELECT), rows(WHERE), aggregations(GROUP)
>
I don't deny it... but do you expect this feature in the initial version?

> * Ability to consider access cost for both normal table and alternate path
> (like an index) - this allows the alternate path to *not* be chosen when
> we are performing some operation that is sub-optimal (for whatever reason).
>
That is the usual job of the existing planner, isn't it?

> * There may be some need to define operator classes that are implemented
> via the alternate path
>
> which works for single tables, but a later requirement would then be
>
> * allows the join of one or more tables to be replaced with a single lookaside
>
That's a higher priority for me, and I guess the same applies to MatView usage.

> Hopefully, we won't need a "Custom Plan" at all, just the ability to
> lookaside when useful.
>
Lookaside is probably a special case of what a custom-plan can provide.
I also think it is an attractive use case if we can redirect a particular
complicated join to a MatView reference. So, it makes sense to bundle a
handler function that replaces a join with a matview reference.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 08:01:37
Message-ID: CA+U5nMK4BYRQOmpf-c6es0=NA=75i5x-FvWP_+9fLdcv0gi-UQ@mail.gmail.com
Lists: pgsql-hackers

On 7 May 2014 08:17, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:

> Let me clarify: this mechanism allows adding alternative scan/join paths,
> including built-in ones, not only custom enhanced plan/exec nodes, doesn't it?
> It is probably a variation of the above proposition if we install a handler
> function that proposes built-in path nodes in response to a scan/join request.

Yes, I am looking for a way to give you the full extent of your
requirements, within the Postgres framework. I have time and funding
to assist you in achieving this in a general way that all may make use
of.

> Not only alternative data structures: an alternative method of scan/join
> against the same data structure is also important, isn't it?

Agreed. My proposal is that if the planner allows the lookaside to an
FDW then we pass the query for full execution on the FDW. That means
that the scan, aggregate and join could take place via the FDW. i.e.
"Custom Plan" == lookaside + FDW

Or put another way, if we add Lookaside then we can just plug in the
pgstrom FDW directly and we're done. And everybody else's FDW will
work as well, so Citus etc. will not need to recode.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Andres Freund" <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 09:06:32
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9EA04@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> -----Original Message-----
> From: Simon Riggs [mailto:simon(at)2ndQuadrant(dot)com]
> Sent: Wednesday, May 07, 2014 5:02 PM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Tom Lane; Robert Haas; Andres Freund; PgHacker; Stephen Frost; Shigeru
> Hanada; Jim Mlodgenski; Peter Eisentraut; Kohei KaiGai
> Subject: Re: [v9.5] Custom Plan API
>
> On 7 May 2014 08:17, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>
> > Let me clarify: this mechanism allows adding alternative scan/join
> > paths, including built-in ones, not only custom enhanced plan/exec
> > nodes, doesn't it?
> > It is probably a variation of the above proposition if we install a
> > handler function that proposes built-in path nodes in response to a
> > scan/join request.
>
> Yes, I am looking for a way to give you the full extent of your requirements,
> within the Postgres framework. I have time and funding to assist you in
> achieving this in a general way that all may make use of.
>
> > Not only alternative data structures: an alternative method of
> > scan/join against the same data structure is also important, isn't it?
>
> Agreed. My proposal is that if the planner allows the lookaside to an FDW
> then we pass the query for full execution on the FDW. That means that the
> scan, aggregate and join could take place via the FDW. i.e.
> "Custom Plan" == lookaside + FDW
>
> Or put another way, if we add Lookaside then we can just plug in the pgstrom
> FDW directly and we're done. And everybody else's FDW will work as well,
> so Citus etc. will not need to recode.
>
Hmm. That sounds to me like you intend to make FDW the central facility
for hosting pluggable plan/exec stuff. Even though several things need to
be clarified, I also think it's a direction worth investigating.

Let me list, in no particular order, the things to be clarified / developed.

* Join replacement by FDW: we still don't have consensus about join
replacement by FDW. It will probably be designed primarily for remote-join
implementations; however, the things to do are similar. We may need to
revisit Hanada-san's past proposal.

* Lookaside for ANY relation: I want the planner to try GPU-scan for any
relation once the module is installed, to reduce the user's administration
cost. This needs lookaside to allow specifying a particular foreign server,
not a foreign table, and then to create a ForeignScan node that is not
associated with a particular foreign table.

* ForeignScan node that is not associated with a particular foreign table:
once we try to apply a ForeignScan node instead of Sort or Aggregate, the
existing FDW implementation needs to be improved. These nodes scan a
materialized relation (generated on the fly); however, existing FDW code
assumes a ForeignScan node is always associated with a particular foreign
table. We need to eliminate this restriction.

* FDW method for MultiExec: in cases where we can stack multiple ForeignScan
nodes, it's helpful to support exchanging scanned tuples in their own
data format. Let's assume two ForeignScan nodes are stacked: one performs
like Sort, the other like Scan. If they internally handle a column-oriented
data format, TupleTableSlot is not the best way to exchange data.

* Lookaside on INSERT/UPDATE/DELETE: this can probably be implemented
using the writable FDW feature. Not a big issue, but don't forget it...

What's your opinion?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 10:29:24
Message-ID: CA+U5nM+oD6Lh2A27muuGNiLrkVbEJLjFwnBFttSUJGYmmUf9KA@mail.gmail.com
Lists: pgsql-hackers

On 7 May 2014 10:06, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:

> Let me list, in no particular order, the things to be clarified / developed.
>
> * Join replacement by FDW: we still don't have consensus about join
> replacement by FDW. It will probably be designed primarily for remote-join
> implementations; however, the things to do are similar. We may need to
> revisit Hanada-san's past proposal.

Agreed. We need to push down joins into FDWs and we need to push down
aggregates also, so they can be passed to FDWs. I'm planning to look
at aggregate push down.

> * Lookaside for ANY relation: I want the planner to try GPU-scan for any
> relation once the module is installed, to reduce the user's administration
> cost. This needs lookaside to allow specifying a particular foreign server,
> not a foreign table, and then to create a ForeignScan node that is not
> associated with a particular foreign table.

IMHO we would not want to add indexes to every column, on every table,
nor would we wish to use lookaside for all tables. It is a good thing
to be able to add optimizations for individual tables. GPUs are not
good for everything; it is good to be able to leverage their
strengths, yet avoid their weaknesses.

If you do want that, you can write an Event Trigger that automatically
adds a lookaside for any table.

> * ForeignScan node that is not associated with a particular foreign table:
> once we try to apply a ForeignScan node instead of Sort or Aggregate, the
> existing FDW implementation needs to be improved. These nodes scan a
> materialized relation (generated on the fly); however, existing FDW code
> assumes a ForeignScan node is always associated with a particular foreign
> table. We need to eliminate this restriction.

I don't think we need to do that, given the above.

> * FDW method for MultiExec: in cases where we can stack multiple ForeignScan
> nodes, it's helpful to support exchanging scanned tuples in their own
> data format. Let's assume two ForeignScan nodes are stacked: one performs
> like Sort, the other like Scan. If they internally handle a column-oriented
> data format, TupleTableSlot is not the best way to exchange data.

I agree TupleTableSlot may not be the best way for bulk data movement. We
probably need to look at buffering/bulk movement between executor
nodes in general, which would be of benefit for the FDW case also.
This would be a problem even for Custom Scans as originally presented,
so I don't see much change there.

> * Lookaside on INSERT/UPDATE/DELETE: this can probably be implemented
> using the writable FDW feature. Not a big issue, but don't forget it...

Yes, possible.

I hope these ideas make sense. This is early days and there may be
other ideas and much detail yet to come.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 16:33:17
Message-ID: 20140507163317.GX2556@tamriel.snowman.net
Lists: pgsql-hackers

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> Agreed. My proposal is that if the planner allows the lookaside to an
> FDW then we pass the query for full execution on the FDW. That means
> that the scan, aggregate and join could take place via the FDW. i.e.
> "Custom Plan" == lookaside + FDW

How about we get that working for FDWs to begin with, and then we can
come back to this idea? We're pretty far from join-pushdown or
aggregate-pushdown to FDWs, last I checked, and having those would be a
massive win for everyone using FDWs.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 16:43:19
Message-ID: 20140507164319.GY2556@tamriel.snowman.net
Lists: pgsql-hackers

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> IMHO we would not want to add indexes to every column, on every table,
> nor would we wish to use lookaside for all tables. It is a good thing
> to be able to add optimizations for individual tables. GPUs are not
> good for everything; it is good to be able to leverage their
> strengths, yet avoid their weaknesses.

It's the optimizer's job to figure out which path to pick though, based
on which will have the lowest cost.

> If you do want that, you can write an Event Trigger that automatically
> adds a lookaside for any table.

This sounds terribly ugly and like we're pushing optimization decisions
on to the user instead of just figuring out what the best answer is.

> I agree TupleTableSlot may not be the best way for bulk data movement. We
> probably need to look at buffering/bulk movement between executor
> nodes in general, which would be of benefit for the FDW case also.
> This would be a problem even for Custom Scans as originally presented,
> so I don't see much change there.

Being able to do bulk movement would be useful, but (as I proposed
months ago) being able to do asynchronous returns would be extremely
useful also, when you consider FDWs and Append() - the main point there
being that you want to keep the FDWs busy and working in parallel.

Thanks,

Stephen


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 17:23:48
Message-ID: CA+U5nMJrJ7MWtXZv-PaaeZS+QbEkO_5qSek_FMhQY_3VQ3TUTg@mail.gmail.com
Lists: pgsql-hackers

On 7 May 2014 17:43, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
>> IMHO we would not want to add indexes to every column, on every table,
>> nor would we wish to use lookaside for all tables. It is a good thing
>> to be able to add optimizations for individual tables. GPUs are not
>> good for everything; it is good to be able to leverage their
>> strengths, yet avoid their weaknesses.
>
> It's the optimizer's job to figure out which path to pick though, based
> on which will have the lowest cost.

Of course. I'm not suggesting otherwise.

>> If you do want that, you can write an Event Trigger that automatically
>> adds a lookaside for any table.
>
> This sounds terribly ugly and like we're pushing optimization decisions
> on to the user instead of just figuring out what the best answer is.

I'm proposing that we use a declarative approach, just like we do when
we say CREATE INDEX.

The idea is that we only consider a lookaside when a lookaside has
been declared. Same as when we add an index, the optimizer considers
whether to use that index. What we don't want to happen is that the
optimizer considers a GIN plan, even when a GIN index is not
available.

I'll explain it more at the developer meeting. It probably sounds a
bit weird at first.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-07 17:39:17
Message-ID: 20140507173917.GZ2556@tamriel.snowman.net
Lists: pgsql-hackers

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> On 7 May 2014 17:43, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > It's the optimizer's job to figure out which path to pick though, based
> > on which will have the lowest cost.
>
> Of course. I'm not suggesting otherwise.
>
> >> If you do want that, you can write an Event Trigger that automatically
> >> adds a lookaside for any table.
> >
> > This sounds terribly ugly and like we're pushing optimization decisions
> > on to the user instead of just figuring out what the best answer is.
>
> I'm proposing that we use a declarative approach, just like we do when
> we say CREATE INDEX.

There are quite a few trade-offs when it comes to indexes though. I'm
trying to figure out when you wouldn't want to use a GPU, if it's
available to you and the cost model says it's faster? To me, that's
kind of like saying you want a declarative approach for when to use a
HashJoin.

> The idea is that we only consider a lookaside when a lookaside has
> been declared. Same as when we add an index, the optimizer considers
> whether to use that index. What we don't want to happen is that the
> optimizer considers a GIN plan, even when a GIN index is not
> available.

Yes, I understood your proposal - I just don't agree with it. ;)

For MatViews and/or Indexes, there are trade-offs to be had as it
relates to disk space, insert speed, etc.

Thanks,

Stephen


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Andres Freund" <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 00:49:59
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9F03F@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> > Let me list, in no particular order, the things to be clarified /
> > developed.
> >
> > * Join replacement by FDW: we still don't have consensus about join
> > replacement by FDW. It will probably be designed primarily for
> > remote-join implementations; however, the things to do are similar.
> > We may need to revisit Hanada-san's past proposal.
>
> Agreed. We need to push down joins into FDWs and we need to push down
> aggregates also, so they can be passed to FDWs. I'm planning to look at
> aggregate push down.
>
It's probably a helpful feature.

> > * Lookaside for ANY relation: I want the planner to try GPU-scan for any
> > relation once the module is installed, to reduce the user's
> > administration cost. This needs lookaside to allow specifying a
> > particular foreign server, not a foreign table, and then to create a
> > ForeignScan node that is not associated with a particular foreign table.
>
> IMHO we would not want to add indexes to every column, on every table, nor
> would we wish to use lookaside for all tables. It is a good thing to be
> able to add optimizations for individual tables. GPUs are not good for
> everything; it is good to be able to leverage their strengths, yet avoid
> their weaknesses.
>
> If you do want that, you can write an Event Trigger that automatically adds
> a lookaside for any table.
>
It may be a solution if we try to replace a scan on a relation with a
ForeignScan - in other words, a case where we can describe a 1:1
relationship between a table and the foreign table scanned in its place.

But can it fit the case where a ForeignScan replaces a built-in Join plan?
I don't think it is realistic to assume that lookaside configurations are
set up in advance for every possible combination of joins.

I have an idea: if lookaside accepts a function, a foreign server, or some
other active entity as the provider of the alternative path, it will be able
to create paths on the fly, not just use preconfigured foreign tables.
This idea would take two forms of DDL command:

CREATE LOOKASIDE <name> ON <target relation>
TO <alternative table/matview/foreign table/...>;

CREATE LOOKASIDE <name> ON <target relation>
EXECUTE <path generator function>;

What happens internally is the same: the TO form invokes a built-in routine,
instead of a user-defined function, to add alternative scan/join paths
according to the supplied table/matview/foreign table and so on.

> > * ForeignScan node that is not associated with a particular foreign
> > table: once we try to apply a ForeignScan node instead of Sort or
> > Aggregate, the existing FDW implementation needs to be improved. These
> > nodes scan a materialized relation (generated on the fly); however,
> > existing FDW code assumes a ForeignScan node is always associated with
> > a particular foreign table. We need to eliminate this restriction.
>
> I don't think we need to do that, given the above.
>
It becomes a problem if a ForeignScan is chosen as the alternative path
for a Join.

The target list of a Join node is determined on the fly according to the
form of the query, so we cannot know in advance the particular TupleDesc
that will be returned. Once we try to apply a ForeignScan instead of a
Join node, its TupleDesc has to depend on the set of joined relations.

I think it is a more straightforward approach to allow a ForeignScan that
is not associated with a particular (cataloged) relation.
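
In executor terms, such a node would have to build its scan tuple
descriptor from the plan's target list rather than look it up from a
cataloged relation. A rough sketch, where "node" stands for the scan
state of this hypothetical ForeignScan:

    /* derive the tuple descriptor from the target list on the fly */
    TupleDesc tupdesc = ExecTypeFromTL(node->ss.ps.plan->targetlist, false);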

> > * FDW method for MultiExec: in cases where we can stack multiple
> > ForeignScan nodes, it's helpful to support exchanging scanned tuples in
> > their own data format. Let's assume two ForeignScan nodes are stacked:
> > one performs like Sort, the other like Scan. If they internally handle a
> > column-oriented data format, TupleTableSlot is not the best way to
> > exchange data.
>
> I agree TupleTableSlot may not be the best way for bulk data movement. We
> probably need to look at buffering/bulk movement between executor nodes
> in general, which would be of benefit for the FDW case also.
> This would be a problem even for Custom Scans as originally presented,
> so I don't see much change there.
>
Yes. That is the reason why my Custom Scan proposal supports the MultiExec
method.

> > * Lookaside on INSERT/UPDATE/DELETE: this can probably be implemented
> > using the writable FDW feature. Not a big issue, but don't forget it...
>
> Yes, possible.
>
>
> I hope these ideas make sense. This is early days and there may be other
> ideas and much detail yet to come.
>
I'd like to agree with the general direction. My biggest concern with FDW is
transparency for applications. If lookaside allows redirecting a scan/join
on regular relations to a ForeignScan (as an alternative execution method),
there is no strong reason to stick with custom-plan.

However, the existing ForeignScan node cannot work without a particular
foreign table. That may become a restriction if we try to replace a Join
node with a ForeignScan, and that is my worry.
(It may well be solved during the work on Join replacement by FDW.)

One other point I noticed.

* SubPlan support: if an extension provides its own special logic to join
relations, but doesn't want to support the various methods to scan relations,
it is natural to leverage the built-in scan logic (like SeqScan, ...).
I want ForeignScan to be able to have SubPlans if the FDW driver has that
capability. I believe it can be implemented in the existing manner, but we
need to expose several static functions that handle plan trees recursively.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 02:18:35
Message-ID: CA+U5nMLV8WrsRTeqM6xGBsfddxK+9DF1BB4PnWhPx6m79=qeyA@mail.gmail.com
Lists: pgsql-hackers

On 8 May 2014 01:49, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:

>> > * ForeignScan node that is not associated with a particular foreign
>> > table: once we try to apply a ForeignScan node instead of Sort or
>> > Aggregate, the existing FDW implementation needs to be improved. These
>> > nodes scan a materialized relation (generated on the fly); however,
>> > existing FDW code assumes a ForeignScan node is always associated with
>> > a particular foreign table. We need to eliminate this restriction.
>>
>> I don't think we need to do that, given the above.
>>
> It becomes a problem if a ForeignScan is chosen as the alternative path
> for a Join.
>
> The target list of a Join node is determined on the fly according to the
> form of the query, so we cannot know in advance the particular TupleDesc
> that will be returned. Once we try to apply a ForeignScan instead of a
> Join node, its TupleDesc has to depend on the set of joined relations.
>
> I think it is a more straightforward approach to allow a ForeignScan that
> is not associated with a particular (cataloged) relation.

From your description, my understanding is that you would like to
stream data from 2 standard tables to the GPU, then perform a join on
the GPU itself.

I have been told that is not likely to be useful because of the data
transfer overheads.

Or did I misunderstand, and that this is intended to get around the
current lack of join pushdown into FDWs?

Can you be specific about the actual architecture you wish for, so we
can understand how to generalise that into an API?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 02:22:54
Message-ID: CA+U5nMLLhRz=yN1KCiJEDJBKw_BYCBTKuk6q7wA9xr5L5-urwA@mail.gmail.com
Lists: pgsql-hackers

On 7 May 2014 18:39, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
>> On 7 May 2014 17:43, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > It's the optimizer's job to figure out which path to pick though, based
>> > on which will have the lowest cost.
>>
>> Of course. I'm not suggesting otherwise.
>>
>> >> If you do want that, you can write an Event Trigger that automatically
>> >> adds a lookaside for any table.
>> >
>> > This sounds terribly ugly and like we're pushing optimization decisions
>> > on to the user instead of just figuring out what the best answer is.
>>
>> I'm proposing that we use a declarative approach, just like we do when
>> we say CREATE INDEX.
>
> There are quite a few trade-offs when it comes to indexes though. I'm
> trying to figure out when you wouldn't want to use a GPU, if it's
> available to you and the cost model says it's faster? To me, that's
> kind of like saying you want a declarative approach for when to use a
> HashJoin.

I'm proposing something that is like an index, not like a plan node.

The reason that proposal is being made is that we need to consider
data structure, data location and processing details.

* In the case of Mat Views, if there is no Mat View, then we can't use
it - we can't replace that with just any mat view instead
* GPUs and other special processing units have finite data transfer
rates, so other people have proposed that they retain data on the
GPU/SPU - so we want to do a lookaside only for situations where the
data is already prepared to handle a lookaside.
* The other cases I cited of in-memory data structures are all
pre-arranged items with structures suited to processing particular
types of query

Given that I count 4-5 beneficial use cases for this index-like
lookaside, it seems worth investing time in.

It appears that Kaigai wishes something else in addition to this
concept, so there may be some confusion from that. I'm sure it will
take a while to really understand all the ideas and possibilities.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 02:28:45
Message-ID: 20140508022845.GJ2556@tamriel.snowman.net
Lists: pgsql-hackers

Simon,

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> On 8 May 2014 01:49, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> From your description, my understanding is that you would like to
> stream data from 2 standard tables to the GPU, then perform a join on
> the GPU itself.
>
> I have been told that is not likely to be useful because of the data
> transfer overheads.

That was my original understanding and, I believe, the case at one
point, however...

> Or did I misunderstand, and that this is intended to get around the
> current lack of join pushdown into FDWs?

I believe the issue with the transfer speeds to the GPU has been either
eliminated or at least reduced to the point where it's practical now.
This is all based on prior discussions with KaiGai - I've not done any
testing myself. In any case, this is exactly what they're looking to
do, as I understand it, and to do the same with aggregates that work
well on GPUs.

> Can you be specific about the actual architecture you wish for, so we
> can understand how to generalise that into an API?

It's something that *could* be done with FDWs, once they have the
ability to do join push-down and aggregate push-down, but I (and, as I
understand it, Tom) feel it isn't really the right answer for this because
the actual *data* is completely under PG in this scenario. It's just
in-memory processing that's being done on the GPU and in the GPU's
memory.

KaiGai has speculated about other possibilities (eg: having the GPU's
memory also used as some kind of multi-query cache, which would reduce
the transfer costs, but at a level of complexity regarding that cache
that I'm not sure it'd be sensible to try and do and, in any case, could
be done later and might make sense independently, if we could make it
work for, say, a memcached environment too; I'm thinking it would be
transaction-specific, but even that would be pretty tricky unless we
held locks across every row...).

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 02:36:23
Message-ID: 20140508023623.GK2556@tamriel.snowman.net
Lists: pgsql-hackers

Simon,

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> I'm proposing something that is like an index, not like a plan node.
>
> The reason that proposal is being made is that we need to consider
> data structure, data location and processing details.
>
> * In the case of Mat Views, if there is no Mat View, then we can't use
> it - we can't replace that with just any mat view instead

I agree with you about MatView's. There are clear trade-offs there,
similar to those with indexes.

> * GPUs and other special processing units have finite data transfer
> rates, so other people have proposed that they retain data on the
> GPU/SPU - so we want to do a lookaside only for situations where the
> data is already prepared to handle a lookaside.

I've heard this and I'm utterly unconvinced that it could be made to
work at all - and it's certainly moving the bar of usefulness quite far
away, making the whole thing much less practical. If we can't cost for
this transfer rate and make use of GPUs for medium-to-large size queries
which are only transient, then perhaps shoving all GPU work out across
an FDW is actually the right solution, and make that like some kind of
MatView as you're proposing- but I don't see how you're going to manage
updates and invalidation of that data in a sane way for a multi-user PG
system.

> * The other cases I cited of in-memory data structures are all
> pre-arranged items with structures suited to processing particular
> types of query

If it's transient in-memory work, I'd like to see our generalized
optimizer consider them all, instead of pushing onto the user the job
of deciding when the optimizer should consider certain methods.

> Given that I count 4-5 beneficial use cases for this index-like
> lookaside, it seems worth investing time in.

I'm all for making use of MatViews and GPUs, but there's more than one
way to get there and look-asides feel like pushing the decision,
unnecessarily, on to the user.

Thanks,

Stephen


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 03:02:41
Message-ID: CAEZqfEeCSPcwKtv_pyhKFZX13Rcq5LCivv9EoM+jMLxiToRgYg@mail.gmail.com

2014-05-07 18:06 GMT+09:00 Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>:
> Let me list the things to be clarified / developed, in no particular order.
>
> * Join replacement by FDW; we still don't have consensus about join
>   replacement by FDW. Probably, it will be designed primarily for the
>   remote-join implementation; however, the things to do are similar.
>   We may need to revisit Hanada-san's earlier proposal.

I can't recall the details offhand, but the reason I gave up was the
need to introduce a ForeignJoinPath node, IIRC. I'll revisit the
discussion and my proposal.
--
Shigeru HANADA


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Andres Freund" <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 03:33:57
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9F189@BPXM15GP.gisp.nec.co.jp

> >> > * ForeignScan node that is not associated with a particular
> >> >   foreign-table. Once we try to apply a ForeignScan node instead of
> >> >   Sort or Aggregate, the existing FDW implementation needs to be
> >> >   improved. These nodes scan a materialized relation (generated on
> >> >   the fly); however, existing FDW code assumes a ForeignScan node is
> >> >   always associated with a particular foreign-table. We need to
> >> >   eliminate this restriction.
> >>
> >> I don't think we need to do that, given the above.
> >>
> > It becomes a problem if ForeignScan is chosen as the alternative path
> > for a Join.
> >
> > The target-list of a Join node is determined according to the query
> > form on the fly, so we cannot expect a particular TupleDesc to be
> > fixed beforehand. Once we try to apply ForeignScan instead of a Join
> > node, it has to have a TupleDesc that depends on the set of joined
> > relations.
> >
> > I think it is the more straightforward approach to allow a ForeignScan
> > that is not associated with a particular (cataloged) relation.
>
> From your description, my understanding is that you would like to stream
> data from 2 standard tables to the GPU, then perform a join on the GPU itself.
>
> I have been told that is not likely to be useful because of the data transfer
> overheads.
>
Here are two solutions. One is what I'm currently working on: in the case
where the numbers of rows in the left and right tables are not well
balanced, we can keep a hash table in GPU DRAM, then transfer the data
stream chunk-by-chunk from the other side. Kernel execution and data
transfer can run asynchronously, so the data transfer cost can be hidden
as long as we have enough chunks, like processor pipelining.
The other solution is an "integrated" GPU that eliminates the need for
data transfer, like Intel's Haswell, AMD's Kaveri or NVIDIA's Tegra K1;
all the major vendors are moving in the same direction.
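
Just to illustrate the pipelining, here is a schematic C sketch. Every
function in it is a stand-in for whatever the GPU runtime provides, not
a real API; the point is only that the copy of chunk N+1 can overlap the
kernel execution on chunk N:

typedef struct Chunk Chunk;             /* opaque: one slice of the stream */

void wait_queue(int q);                 /* block until queue q is idle */
void send_chunk_async(int q, Chunk *c); /* enqueue host -> device copy */
void run_join_kernel_async(int q);      /* probe the resident hash table */
void recv_results_async(int q);         /* enqueue device -> host copy */

/* The smaller side's hash table stays resident in GPU DRAM; the larger
 * side streams through in chunks on two alternating queues. */
void
gpu_hash_join_stream(Chunk **chunks, int nchunks)
{
    int     i;

    for (i = 0; i < nchunks; i++)
    {
        int     q = i % 2;              /* alternate between two queues */

        wait_queue(q);                  /* reuse a queue only once idle */
        send_chunk_async(q, chunks[i]);
        run_join_kernel_async(q);
        recv_results_async(q);
    }
    wait_queue(0);                      /* drain both queues at the end */
    wait_queue(1);
}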

> Or did I misunderstand, and this is intended to get around the current
> lack of join pushdown into FDWs?
>
The logic above is obviously executed on the extension side, so it needs
the ForeignScan node to behave like a Join node: reading two input
relation streams and emitting one joined relation stream.

This is quite similar to the expected FDW join-pushdown design. It will
consume two (remote) relations and generate one output stream; it looks
like a scan on a particular relation (but with no catalog definition
here).

It would probably be visible to the local backend as follows
(this is output from my previous prototype based on the custom-plan API):

postgres=# EXPLAIN VERBOSE SELECT count(*) FROM
             pgbench1_branches b JOIN pgbench1_accounts a ON a.bid = b.bid WHERE aid < 100;
                                                                    QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=101.60..101.61 rows=1 width=0)
   Output: count(*)
   ->  Custom Scan (postgres-fdw)  (cost=100.00..101.43 rows=71 width=0)
         Remote SQL: SELECT NULL FROM (public.pgbench_branches r1 JOIN public.pgbench_accounts r2 ON ((r1.bid = r2.bid))) WHERE ((r2.aid < 100))
(4 rows)

The place of the "Custom Scan" node will be taken by ForeignScan once
join push-down is supported. At that point, which relation is this
ForeignScan supposed to scan? That is why I proposed a ForeignScan node
that is not tied to a particular relation.

> Can you be specific about the actual architecture you wish for, so we can
> understand how to generalise that into an API?
>
If we fold the role of the CustomPlan node into ForeignScan, I want to
use that node to acquire control during query planning/execution.

As I did in the custom-plan patch, first of all, I want extensions to
have a chance to add an alternative path for a particular scan/join.
If an extension can take over the execution, it generates a ForeignPath
(or CustomPath) node and then calls add_path(). In the usual manner, the
planner decides whether the alternative path is cheaper than the other
candidates.
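
To make the flow concrete, here is a minimal sketch of such an
entrypoint, assuming a scan-pathlist hook of roughly this shape;
gpu_available() and gpuscan_path_methods are placeholders of my own, and
the cost formula is arbitrary, so please read it as illustration rather
than a committed interface:

static void
gpuscan_add_paths(PlannerInfo *root, RelOptInfo *rel,
                  Index rti, RangeTblEntry *rte)
{
    CustomPath *cpath;

    if (!gpu_available() || rte->rtekind != RTE_RELATION)
        return;                         /* nothing we can offer here */

    cpath = makeNode(CustomPath);
    cpath->path.pathtype = T_CustomScan;
    cpath->path.parent = rel;
    cpath->path.param_info = NULL;
    cpath->path.rows = rel->rows;
    cpath->path.pathkeys = NIL;         /* result is unsorted */

    /* the extension's own cost model goes here */
    cpath->path.startup_cost = 10.0;
    cpath->path.total_cost = 10.0 + cpu_tuple_cost * rel->rows;

    cpath->methods = &gpuscan_path_methods;

    /* the planner keeps this path only if it beats the other candidates */
    add_path(rel, (Path *) cpath);
}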

In the case where it replaces a relation scan with a ForeignScan, it is
almost the same as what the existing API does, except that the underlying
relation is a regular one, not a foreign table.

In the case where it replaces join relations with a ForeignScan, it will
be almost the same as the expected ForeignScan with join push-down.
Unlike a usual table scan, it does not have an actual relation definition
in the catalog, and its result tuple-slot is determined on the fly.
One thing different from the remote-join case is that this ForeignScan
node may have local sub-plans, if the FDW driver (e.g. GPU execution)
only has the capability to handle the Join itself, but not the relation
scan portion.
So, despite its name, I want ForeignScan to be able to have sub-plans if
the FDW driver supports that capability.
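
In plan-tree terms (purely schematic, in the style of EXPLAIN output),
what I have in mind is something like this:

  ForeignScan (gpu-join; result TupleDesc built on the fly, no catalog entry)
    ->  Seq Scan on table_a         (local sub-plan run by the FDW driver)
    ->  Seq Scan on table_b         (local sub-plan run by the FDW driver)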

Does this make it clear? Or does it make you more confused??

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 09:51:17
Message-ID: CA+U5nM+5W3xwqQsaJV=Q25RtdPu3LcBuJp_1B7riUF9==TPVcQ@mail.gmail.com

On 8 May 2014 04:33, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:

>> From your description, my understanding is that you would like to stream
>> data from 2 standard tables to the GPU, then perform a join on the GPU itself.
>>
>> I have been told that is not likely to be useful because of the data transfer
>> overheads.
>>
> Here are two solutions. One is what I'm currently working on: in the case
> where the numbers of rows in the left and right tables are not well
> balanced, we can keep a hash table in GPU DRAM, then transfer the data
> stream chunk-by-chunk from the other side. Kernel execution and data
> transfer can run asynchronously, so the data transfer cost can be hidden
> as long as we have enough chunks, like processor pipelining.

Makes sense to me, thanks for explaining.

The hardware-enhanced hash join sounds like a great idea.

My understanding is we would need

* a custom cost-model
* a custom execution node

The main question seems to be whether doing that would be allowable,
because it's certainly doable.

I'm still looking for a way to avoid adding planning time for all
queries though.

> The other solution is an "integrated" GPU that eliminates the need for
> data transfer, like Intel's Haswell, AMD's Kaveri or NVIDIA's Tegra K1;
> all the major vendors are moving in the same direction.

Sounds useful, but very non-specific, as yet.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 11:02:52
Message-ID: CA+U5nMKLDDWokqhzfHBg=uYWiMpfL9mmA4kUmjkx3C3dzBUwuw@mail.gmail.com

On 8 May 2014 03:36, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

>> Given that I count 4-5 beneficial use cases for this index-like
>> lookaside, it seems worth investing time in.
>
> I'm all for making use of MatViews and GPUs, but there's more than one
> way to get there and look-asides feel like pushing the decision,
> unnecessarily, on to the user.

I'm not sure I understand where most of your comments come from, so
it's clear we're not talking about the same things yet.

We have multiple use cases where an alternate data structure could be
used to speed up queries.

My goal is to use the alternate data structure(s)

1) if the data structure contains matching data for the current query
2) only when the user has explicitly stated it would be correct to do
so, and they wish it to be used
3) transparently to the application, rather than forcing them to recode
4) after fully considering cost-based optimization, which we can only
do if it is transparent

all of which is how mat views work in other DBMSs. My additional requirement is

5) allow this to work with data structures outside the normal
heap/index/block structures, since we have multiple already working
examples of such things and many users wish to leverage those in their
applications

which I now understand is different from the main thrust of Kaigai's
proposal, so I will restate this later on another thread.

The requirement is similar to the idea of running

CREATE MATERIALIZED VIEW foo
BUILD DEFERRED
REFRESH COMPLETE
ON DEMAND
ENABLE QUERY REWRITE
ON PREBUILT TABLE

but expands on that to encompass any external data structure.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 12:16:00
Message-ID: CA+TgmoZFwQnjpyhBwX2P6v++ZdP-G2K_ymXHUYaR+QPpcrcp0Q@mail.gmail.com

On Wed, May 7, 2014 at 4:01 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Agreed. My proposal is that if the planner allows the lookaside to an
> FDW then we pass the query for full execution on the FDW. That means
> that the scan, aggregate and join could take place via the FDW. i.e.
> "Custom Plan" == lookaside + FDW
>
> Or put another way, if we add Lookaside then we can just plug in the
> pgstrom FDW directly and we're done. And everybody else's FDW will
> work as well, so Citus etc will not need to recode.

As Stephen notes downthread, Tom has already expressed opposition to
this idea on other threads, and I tend to agree with him, at least to
some degree. I think the drive to use foreign data wrappers for
PGStrom, CitusDB, and other things that aren't really foreign data
wrappers as originally conceived is a result of the fact that we've
got only one interface in this area that looks remotely like something
pluggable; and so everyone's trying to fit things into the constraints
of that interface whether it's actually a good fit or not.
Unfortunately, I think what CitusDB really wants is pluggable storage,
and what PGStrom really wants is custom paths, and I don't think
either of those things is the same as what FDWs provide.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 12:48:52
Message-ID: 20140508124852.GN2556@tamriel.snowman.net

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> On 8 May 2014 03:36, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I'm all for making use of MatViews and GPUs, but there's more than one
> > way to get there and look-asides feel like pushing the decision,
> > unnecessarily, on to the user.
>
> I'm not sure I understand where most of your comments come from, so
> it's clear we're not talking about the same things yet.
>
> We have multiple use cases where an alternate data structure could be
> used to speed up queries.

I don't view on-GPU memory as being an alternate *permanent* data store.
Perhaps that's the disconnect that we have here, as it was my
understanding that we're talking about using GPUs to make queries run
faster where the data comes from regular tables.

> My goal is to use the alternate data structure(s)

Pluggable storage is certainly interesting, but I view that as
independent of the CustomPlan-related work.

> which I now understand is different from the main thrust of Kaigai's
> proposal, so I will restate this later on another thread.

Sounds good.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 13:01:15
Message-ID: 20140508130115.GO2556@tamriel.snowman.net

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> As Stephen notes downthread, Tom has already expressed opposition to
> this idea on other threads, and I tend to agree with him, at least to
> some degree. I think the drive to use foreign data wrappers for
> PGStrom, CitusDB, and other things that aren't really foreign data
> wrappers as originally conceived is a result of the fact that we've
> got only one interface in this area that looks remotely like something
> pluggable; and so everyone's trying to fit things into the constraints
> of that interface whether it's actually a good fit or not.

Agreed.

> Unfortunately, I think what CitusDB really wants is pluggable storage,
> and what PGStrom really wants is custom paths, and I don't think
> either of those things is the same as what FDWs provide.

I'm not entirely sure that PGStrom even really "wants" custom paths.. I
believe the goal there is to be able to use GPUs to do work for us, and
custom paths / pluggable plan and execution are seen as the way to do
that without depending on libraries which are under the GPL, LGPL or
other licenses which we'd object to depending on from core.

Personally, I'd love to just see CUDA or whatever support in core as a
configure option and be able to detect at start-up when the right
libraries and hardware are available and enable the join types which
could make use of that gear.

I don't like that we're doing all of this because of licenses or
whatever, and I would still hope to figure out a way to address those
issues, but I haven't had time to go research it myself. Evidently
KaiGai and others see the issues there as insurmountable, so they're
trying to work around them by creating a pluggable interface where an
extension could provide those join types.

Thanks,

Stephen


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 13:18:39
Message-ID: CA+U5nMJZ3STn_eF=OYA4t=n3ZpYxxHWKOjKdvRfEpM4hR3M6dg@mail.gmail.com

On 8 May 2014 13:48, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

>> We have multiple use cases where an alternate data structure could be
>> used to speed up queries.
>
> I don't view on-GPU memory as being an alternate *permanent* data store.

As I've said, others have expressed an interest in placing specific
data on specific external resources that we would like to use to speed
up queries. That might be termed a "cache" of various kinds, or it
might simply be an allocation of that resource to a specific
purpose.

If we forget GPUs, that leaves multiple use cases that do fit the description.

> Perhaps that's the disconnect that we have here, as it was my
> understanding that we're talking about using GPUs to make queries run
> faster where the data comes from regular tables.

I'm trying to consider a group of use cases, so we get a generic API
that is useful to many people, not just to one use case. I had
understood the argument to be that there must be multiple potential users
of an API before we allow it.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 13:26:38
Message-ID: CA+U5nMJRoOYQd5FUBj-Dpc0U43qG28Sxt9EQxQvAD4Dhyw9cKQ@mail.gmail.com

On 8 May 2014 04:33, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:

> In the case where it replaces join relations with a ForeignScan, it will
> be almost the same as the expected ForeignScan with join push-down.
> Unlike a usual table scan, it does not have an actual relation definition
> in the catalog, and its result tuple-slot is determined on the fly.
> One thing different from the remote-join case is that this ForeignScan
> node may have local sub-plans, if the FDW driver (e.g. GPU execution)
> only has the capability to handle the Join itself, but not the relation
> scan portion.
> So, despite its name, I want ForeignScan to be able to have sub-plans if
> the FDW driver supports that capability.

From here, it looks exactly like pushing a join into an FDW. If we had
that, we wouldn't need Custom Scan at all.

I may be mistaken and there is a critical difference. Local sub-plans
don't sound like a big difference.

Have we considered having an Optimizer and Executor plugin that does
this without touching core at all?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 13:32:13
Message-ID: 20140508133213.GP2556@tamriel.snowman.net

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> On 8 May 2014 13:48, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > I don't view on-GPU memory as being an alternate *permanent* data store.
>
> As I've said, others have expressed an interest in placing specific
> data on specific external resources that we would like to use to speed
> up queries. That might be termed a "cache" of various kinds, or it
> might simply be an allocation of that resource to a specific
> purpose.

I don't think some generalized structure that addresses the goals of
FDWs, CustomPaths, MatViews and query caching is going to be workable,
and I'm definitely against having to specify at a per-relation level
when I want certain join types to be considered.

> > Perhaps that's the disconnect that we have here, as it was my
> > understanding that we're talking about using GPUs to make queries run
> > faster where the data comes from regular tables.
>
> I'm trying to consider a group of use cases, so we get a generic API
> that is useful to many people, not just to one use case. I had
> understood the argument to be that there must be multiple potential users
> of an API before we allow it.

The API you've outlined requires users to specify on a per-relation
basis what join types are valid. As for CustomPlans, there's
certainly potential for many use-cases there beyond just GPUs. What I'm
unsure about is whether any others would actually need to be implemented
externally, as the GPU-related work seems to need, or whether we would
just implement those other join types in core.

Thanks,

Stephen


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 13:34:34
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9F3BC@BPXM15GP.gisp.nec.co.jp

> On Wed, May 7, 2014 at 4:01 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > Agreed. My proposal is that if the planner allows the lookaside to an
> > FDW then we pass the query for full execution on the FDW. That means
> > that the scan, aggregate and join could take place via the FDW. i.e.
> > "Custom Plan" == lookaside + FDW
> >
> > Or put another way, if we add Lookaside then we can just plug in the
> > pgstrom FDW directly and we're done. And everybody else's FDW will
> > work as well, so Citus etc will not need to recode.
>
> As Stephen notes downthread, Tom has already expressed opposition to this
> idea on other threads, and I tend to agree with him, at least to some degree.
> I think the drive to use foreign data wrappers for PGStrom, CitusDB, and
> other things that aren't really foreign data wrappers as originally
> conceived is a result of the fact that we've got only one interface in this
> area that looks remotely like something pluggable; and so everyone's trying
> to fit things into the constraints of that interface whether it's actually
> a good fit or not.
> Unfortunately, I think what CitusDB really wants is pluggable storage, and
> what PGStrom really wants is custom paths, and I don't think either of those
> things is the same as what FDWs provide.
>
Yes, what PGStrom really needs is custom paths, which allow an extension
to replace some of the built-in nodes according to the extension's
characteristics.
The discussion upthread clarified that FDWs need to be enhanced to
support the functionality that PGStrom wants to provide; however, some of
those enhancements would indeed require a redefinition of what an FDW is.

Umm... I'm now losing my sense of direction towards the goal.
What approach is the best way to glue PostgreSQL and PGStrom?

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 13:40:39
Message-ID: 20140508134039.GQ2556@tamriel.snowman.net

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> From here, it looks exactly like pushing a join into an FDW. If we had
> that, we wouldn't need Custom Scan at all.
>
> I may be mistaken and there is a critical difference. Local sub-plans
> don't sound like a big difference.

Erm. I'm not sure that you're really thinking through what you're
suggesting.

Allow me to re-state your suggestion here:

An FDW is loaded which provides hooks for join push-down (whatever those
end up being).

A query is run which joins *local* table A to *local* table B. Standard
heaps, standard indexes, all local to this PG instance.

The FDW which supports join push-down is then passed this join for
planning, with local sub-plans for the local tables.

> Have we considered having an Optimizer and Executor plugin that does
> this without touching core at all?

Uh, isn't that what we're talking about? The issue is that there's a
bunch of internal functions that such a plugin would need to either have
access to or re-implement, but we'd rather not expose those internal
functions to the whole world because they're, uh, internal helper
routines, essentially, which could disappear in another release.

The point is that there isn't a good API for this today and what's being
proposed isn't a good API, it's just bolted-on to the existing system by
exposing what are rightfully internal routines.

Thanks,

Stephen


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 13:41:39
Message-ID: CA+U5nM+QAT8syBHNdYmG5C5io91nAnRziBMOYF6KKu_zUyk2mg@mail.gmail.com

On 8 May 2014 14:32, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> The API you've outlined requires users to specify on a per-relation
> basis what join types are valid.

No, it doesn't. I've not said or implied that at any point.

If you keep telling me what I mean, rather than asking, we won't get anywhere.

I think that's as far as we'll get on email.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 13:49:04
Message-ID: 20140508134904.GR2556@tamriel.snowman.net

Simon,

Perhaps you've changed your proposal wrt LOOKASIDEs and I've missed it
somewhere in the thread, but this is what I was referring to with my
concerns regarding a per-relation definition of 'LOOKASIDES':

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> Roughly, I'm thinking of this...
>
> CREATE LOOKASIDE ON foo
> TO foo_mat_view;
>
> and also this...
>
> CREATE LOOKASIDE ON foo
> TO foo_as_a_foreign_table /* e.g. PGStrom */

where I took 'foo' to mean 'a relation'.

Your downthread comments on 'CREATE MATERIALIZED VIEW' are in the same
vein, though there I agree that we need it per-relation as there are
other trade-offs to consider (storage costs of the matview, cost to
maintain the matview, etc, similar to indexes).

The PGStrom proposal, aiui, is to add a new join type which supports
using a GPU to answer a query where all the data is in regular PG
tables. I'd like that to "just work" when a GPU is available (perhaps
modulo having to install some extension), for any join which is costed
to be cheaper/faster when done that way.

Thanks,

Stephen


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 14:01:26
Message-ID: CA+U5nMLL9_A0gN18WX1X2F5Fnv8==mVwxVnmsAUJoEemJt62Bw@mail.gmail.com

On 8 May 2014 14:40, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Allow me to re-state your suggestion here:
>
> An FDW is loaded which provides hooks for join push-down (whatever those
> end up being).
>
> A query is run which joins *local* table A to *local* table B. Standard
> heaps, standard indexes, all local to this PG instance.
>
> The FDW which supports join push-down is then passed this join for
> planning, with local sub-plans for the local tables.

Yes that is correct; thank you for confirming your understanding with me.

That also supports a custom join of a local to a non-local table, or a
custom join of two non-local tables.

If we can use interfaces that already exist with efficiency, why
invent a new one?

>> Have we considered having an Optimizer and Executor plugin that does
>> this without touching core at all?
>
> Uh, isn't that what we're talking about?

No. I meant writing this as an extension rather than a patch on core.

> The issue is that there's a
> bunch of internal functions that such a plugin would need to either have
> access to or re-implement, but we'd rather not expose those internal
> functions to the whole world because they're, uh, internal helper
> routines, essentially, which could disappear in another release.
>
> The point is that there isn't a good API for this today and what's being
> proposed isn't a good API, it's just bolted-on to the existing system by
> exposing what are rightfully internal routines.

I think the main point is that people don't want to ask for our
permission before they do what they want to do.

We either help people use Postgres, or they go elsewhere.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 14:03:40
Message-ID: CA+U5nM+ZLPEK3Yem6_NML5bKtOW9oGkKMw6igga=--NSOLCo8A@mail.gmail.com

On 8 May 2014 14:49, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Your downthread comments on 'CREATE MATERIALIZED VIEW' are in the same
> vein, though there I agree that we need it per-relation as there are
> other trade-offs to consider (storage costs of the matview, cost to
> maintain the matview, etc, similar to indexes).
>
> The PGStrom proposal, aiui, is to add a new join type which supports
> using a GPU to answer a query where all the data is in regular PG
> tables. I'd like that to "just work" when a GPU is available (perhaps
> modulo having to install some extension), for any join which is costed
> to be cheaper/faster when done that way.

All correct and agreed. As I explained earlier, let's cover the join
requirement here and we can discuss lookasides to data structures at
PgCon.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 14:25:24
Message-ID: 20140508142524.GT2556@tamriel.snowman.net

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> On 8 May 2014 14:40, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Allow me to re-state your suggestion here:
> >
> > An FDW is loaded which provides hooks for join push-down (whatever those
> > end up being).
> >
> > A query is run which joins *local* table A to *local* table B. Standard
> > heaps, standard indexes, all local to this PG instance.
> >
> > The FDW which supports join push-down is then passed this join for
> > planning, with local sub-plans for the local tables.
>
> Yes that is correct; thank you for confirming your understanding with me.

I guess for my part, that doesn't look like an FDW any more.

> That also supports custom join of local to non-local table, or custom
> join of two non-local tables.

Well, we already support these, technically, but the FDW
doesn't actually implement the join, it's done in core.

> If we can use interfaces that already exist with efficiency, why
> invent a new one?

Perhaps once we have a proposal for FDW join push-down this will make
sense, but I'm not seeing it right now.

> >> Have we considered having an Optimizer and Executor plugin that does
> >> this without touching core at all?
> >
> > Uh, isn't that what we're talking about?
>
> No. I meant writing this as an extension rather than a patch on core.

KaiGai's patches have been some changes to core, plus an extension
which uses those changes. The changes to core include exposing internal
functions for extensions to use, which will undoubtedly end up being a
sore spot and fragile.

Thanks,

Stephen


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 15:46:11
Message-ID: CA+U5nMLq97Xti2nWCumtrgOh7fTpvcrLpt0rj0DgnRmkse_8-A@mail.gmail.com

On 8 May 2014 15:25, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
>> On 8 May 2014 14:40, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > Allow me to re-state your suggestion here:
>> >
>> > An FDW is loaded which provides hooks for join push-down (whatever those
>> > end up being).
>> >
>> > A query is run which joins *local* table A to *local* table B. Standard
>> > heaps, standard indexes, all local to this PG instance.
>> >
>> > The FDW which supports join push-down is then passed this join for
>> > planning, with local sub-plans for the local tables.
>>
>> Yes that is correct; thank you for confirming your understanding with me.
>
> I guess for my part, that doesn't look like an FDW any more.

If it works, it works. If it doesn't, we can act otherwise.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 16:06:48
Message-ID: CA+U5nMJKgMjYCVmnSriRrYzPE9xgP4NhL6qTwLq=9fGPhh3efQ@mail.gmail.com

On 7 May 2014 02:05, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:

> (1) DDL support and system catalog
>
> Simon suggested that DDL command should be supported to track custom-
> plan providers being installed, and to avoid nonsense hook calls
> if it is an obvious case that custom-plan provider can help. It also
> makes sense to give a chance to load extensions once installed.
> (In the previous design, I assumed modules are loaded by LOAD command
> or *_preload_libraries parameters).

I've tried hard to bend my mind to this and it's beginning to sink in.

We've already got pg_am for indexes, and soon to have pg_seqam for sequences.

It would seem normal and natural to have

* pg_joinam catalog table for "join methods" with a join method API
Which would include some way of defining which operators/datatypes we
consider this for, so if PostGIS people come up with some fancy GIS
join thing, we don't invoke it every time even when it's inapplicable.
I would prefer it if PostgreSQL also had some way to control when the
joinam was called, possibly with some kind of table_size_threshold on
the AM tuple, which could be set to >=0 to control when this was even
considered. (A sketch of what such a catalog row might look like
follows below.)

* pg_scanam catalog table for "scan methods" with a scan method API
Again, a list of operators that can be used with it, like indexes and
operator classes
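
Purely to illustrate the shape of the pg_joinam idea (the OID, the
columns and their names here are all invented for discussion, in the
style of the existing catalog headers):

CATALOG(pg_joinam,3999)
{
    NameData    amname;             /* name of the join method */
    regproc     amhandler;          /* function returning the method's API */
    int32       amsizethreshold;    /* consider only above this table size */
    oidvector   amoperators;        /* operators the method can join on */
} FormData_pg_joinam;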

By analogy to existing mechanisms, we would want

* A USERSET mechanism to allow users to turn it off for testing or
otherwise, at user, database level

We would also want

* A startup call that allows us to confirm it is available and working
correctly, possibly with some self-test for hardware, performance
confirmation/derivation of planning parameters

* Some kind of trace mode that would allow people to confirm the
outcome of calls

* Some interface to the stats system so we could track the frequency
of usage of each join/scan type. This would be done within Postgres,
tracking the calls by name, rather than trusting the plugin to do it
for us

> I tried to implement the following syntax:
>
> CREATE CUSTOM PLAN <name> FOR (scan|join|any) HANDLER <func_name>;

Not sure if we need that yet

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 19:10:53
Message-ID: 20140508191052.GB2556@tamriel.snowman.net

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> It would seem normal and natural to have
>
> * pg_joinam catalog table for "join methods" with a join method API
> Which would include some way of defining which operators/datatypes we
> consider this for, so if PostGIS people come up with some fancy GIS
> join thing, we don't invoke it every time even when it's inapplicable.
> I would prefer it if PostgreSQL also had some way to control when the
> joinam was called, possibly with some kind of table_size_threshold on
> the AM tuple, which could be set to >=0 to control when this was even
> considered.

It seems useful to think about how we would redefine our existing join
methods using such a structure. While thinking about that, it seems
like we would worry more about what the operators provide rather than
the specific operators themselves (a la hashing / HashJoin), and I'm not
sure we really care about the data types directly- just about the
operations which we can do on them..

I can see a case for sticking data types into this if we feel that we
have to constrain the path possibilities for some reason, but I'd rather
try and deal with any issues around "it doesn't make sense to do X
because we'll know it'll be really expensive" through the cost model
instead of with a table that defines what's allowed or not allowed.
There may be cases where we get the costing wrong and it's valuable
to be able to tweak cost values on a per-connection basis or for
individual queries.

I don't mean to imply that a 'pg_joinam' table is a bad idea, just that
I'd think of it being defined in terms of what capabilities it requires
of operators and a way for costing to be calculated for it, plus the
actual functions which it provides to implement the join itself (to
include some way to get output suitable for explain, etc.).
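
In other words, something like the following, where every name and
signature is hand-waved on my part; the point is just that the catalog
row would hand the planner a set of capability, costing and execution
callbacks rather than a list of allowed datatypes:

typedef struct JoinMethodRoutine
{
    /* capability: can this method handle these join clauses? */
    bool    (*clauses_supported) (PlannerInfo *root, List *joinclauses);

    /* costing: fill in startup/total cost for a candidate path */
    void    (*estimate_cost) (PlannerInfo *root, CustomPath *cpath);

    /* execution: build and drive the actual join */
    Plan   *(*create_plan) (PlannerInfo *root, CustomPath *cpath);

    /* reporting: produce output suitable for EXPLAIN */
    void    (*explain) (PlanState *node, ExplainState *es);
} JoinMethodRoutine;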

> * pg_scanam catalog table for "scan methods" with a scan method API
> Again, a list of operators that can be used with it, like indexes and
> operator classes

Ditto for this- but there's lots of other things this makes me wonder
about, because it's essentially trying to define a pluggable storage
layer, which is great, but that also requires some way to deal with all
of the things we use our storage system for: caching / shared buffers,
locking, visibility, WAL, unique identifier / ctid (for use in indexes,
etc)...

> By analogy to existing mechanisms, we would want
>
> * A USERSET mechanism to allow users to turn it off for testing or
> otherwise, at user, database level

If we re-implement our existing components through this ("eat our own
dogfood" as it were), I'm not sure that we'd be able to have a way to
turn it on/off.. I realize we wouldn't have to, but then it seems like
we'd have two very different code paths, and likely a different level of
support / capability afforded to "external" storage systems, and then I
wonder if we're not back to just FDWs again..

> We would also want
>
> * A startup call that allows us to confirm it is available and working
> correctly, possibly with some self-test for hardware, performance
> confirmation/derivation of planning parameters

Yeah, we'd need this for anything that supports a GPU, regardless of how
we implement it, I'd think.

> * Some kind of trace mode that would allow people to confirm the
> outcome of calls

Seems like this would be useful independently of the rest..

> * Some interface to the stats system so we could track the frequency
> of usage of each join/scan type. This would be done within Postgres,
> tracking the calls by name, rather than trusting the plugin to do it
> for us

This is definitely something I want for core already...

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 19:40:40
Message-ID: CA+TgmoZz4qePe2TiCMU9GwmH9=bVJdpMfuOZLr87pPuDHa91og@mail.gmail.com

On Thu, May 8, 2014 at 3:10 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
>> It would seem normal and natural to have
>>
>> * pg_joinam catalog table for "join methods" with a join method API
>> Which would include some way of defining which operators/datatypes we
>> consider this for, so if PostGIS people come up with some fancy GIS
>> join thing, we don't invoke it every time even when it's inapplicable.
>> I would prefer it if PostgreSQL also had some way to control when the
>> joinam was called, possibly with some kind of table_size_threshold on
>> the AM tuple, which could be set to >=0 to control when this was even
>> considered.
>
> It seems useful to think about how we would redefine our existing join
> methods using such a structure. While thinking about that, it seems
> like we would worry more about what the operators provide rather than
> the specific operators themselves (a la hashing / HashJoin), and I'm not
> sure we really care about the data types directly- just about the
> operations which we can do on them..

I'm pretty skeptical about this whole line of inquiry. We've only got
three kinds of joins, and each one of them has quite a bit of bespoke
logic, and all of this code is pretty performance-sensitive on large
join nests. If there's a way to make this work for KaiGai's use case
at all, I suspect something really lightweight like a hook, which
should have negligible impact on other workloads, is a better fit than
something involving system catalog access. But I might be wrong.

I also think that there are really two separate problems here: getting
the executor to call a custom scan node when it shows up in the plan
tree; and figuring out how to get it into the plan tree in the first
place. I'm not sure we've properly separated those problems, and I'm
not sure into which category the issues that sunk KaiGai's 9.4 patch
fell. Most of this discussion seems like it's about the latter
problem, but we need to solve both. For my money, we'd be better off
getting some kind of basic custom scan node functionality committed
first, even if the cases where you can actually inject them into real
plans are highly restricted. Then, we could later work on adding more
ways to inject them in more places.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 20:06:33
Message-ID: CA+U5nMJAocp-B9xazu+iGxhK9DfV4m69gxoSvq0bD-nPXtpKgg@mail.gmail.com

On 8 May 2014 20:40, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> For my money, we'd be better off
> getting some kind of basic custom scan node functionality committed
> first, even if the cases where you can actually inject them into real
> plans are highly restricted. Then, we could later work on adding more
> ways to inject them in more places.

We're past the prototyping stage and into productionising what we know
works, AFAIK. If that point is not clear, then we need to discuss that
first.

At the moment the Custom join hook is called every time we attempt to
cost a join, with no restriction.

I would like to restrict this heavily, so that we only consider a
CustomJoin node when we have previously said one might be usable and
the user has requested this (e.g. enable_foojoin = on).

We only consider merge joins if the join uses operators with oprcanmerge=true.
We only consider hash joins if the join uses operators with oprcanhash=true.

So it seems reasonable to have a way to define/declare what is
possible and what is not. But my take is that adding a new column to
pg_operator for every CustomJoin node is probably out of the question,
hence my suggestion to list the operators we know it can work with.
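
For comparison, op_mergejoinable() and op_hashjoinable() are the
existing lsyscache helpers backed by those pg_operator columns; the
declared-operator-list gate I'm suggesting would look something like
this (hypothetical, of course):

static bool
custom_join_applicable(Oid opno, List *method_operators)
{
    /* analogous to op_mergejoinable() / op_hashjoinable(), which test
     * pg_operator.oprcanmerge and pg_operator.oprcanhash */
    return list_member_oid(method_operators, opno);
}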

Given that everything else in Postgres is agnostic and configurable,
I'm looking to do the same here.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 20:13:25
Message-ID: CA+U5nMK3LskgwvJHbLTe+GpGOtq97bYTJMTJC2v8872Ofotygg@mail.gmail.com

On 8 May 2014 20:10, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

>> * A USERSET mechanism to allow users to turn it off for testing or
>> otherwise, at user, database level
>
> If we re-implement our existing components through this ("eat our own
> dogfood" as it were), I'm not sure that we'd be able to have a way to
> turn it on/off.. I realize we wouldn't have to, but then it seems like
> we'd have two very different code paths and likely a different level of
> support / capability afforded to "external" storage systems and then I
> wonder if we're not back to just FDWs again..

We have SET enable_hashjoin = on | off.

I would like a way to do the equivalent of SET enable_mycustomjoin =
off, so that when it starts behaving weirdly in production I can turn
it off to prove that it is not the cause, or keep it turned off if
it's a problem. I don't want to have to call a hook and let the hook
decide whether it can be turned off or not.

Postgres should be in control of the plugin, not give control to the
plugin every time and hope it gives us control back.
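
For what it's worth, an extension can already register such a switch for
itself with the existing GUC machinery; a minimal sketch (the variable
name is of course illustrative):

#include "postgres.h"
#include "fmgr.h"
#include "utils/guc.h"

PG_MODULE_MAGIC;

static bool enable_mycustomjoin = true;

void _PG_init(void);

void
_PG_init(void)
{
    DefineCustomBoolVariable("enable_mycustomjoin",
                             "Consider the custom join method in planning.",
                             NULL,
                             &enable_mycustomjoin,
                             true,          /* default: on */
                             PGC_USERSET,
                             0,
                             NULL, NULL, NULL);
}

But that puts the check inside the plugin's own path-building code. What
I want is for Postgres itself to test the flag and skip the plugin
entirely.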

(I'm trying to take the "FDW isn't the right way" line of thinking to
its logical conclusions, so we can decide).

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 20:43:30
Message-ID: 29287.1399581810@sss.pgh.pa.us

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I'm pretty skeptical about this whole line of inquiry. We've only got
> three kinds of joins, and each one of them has quite a bit of bespoke
> logic, and all of this code is pretty performance-sensitive on large
> join nests. If there's a way to make this work for KaiGai's use case
> at all, I suspect something really lightweight like a hook, which
> should have negligible impact on other workloads, is a better fit than
> something involving system catalog access. But I might be wrong.

We do a great deal of catalog consultation already during planning,
so I think a few more wouldn't be a problem, especially if the planner
is smart enough to touch the catalogs just once (per query?) and cache
the results. However, your point about lots of bespoke logic is dead
on, and I'm afraid it's damn near a fatal objection. As just one example,
if we did not have merge joins then an awful lot of what the planner does
with path keys simply wouldn't exist, or at least would look a lot
different than it does. Without that infrastructure, I can't imagine
that a plugin approach would be able to plan mergejoins anywhere near as
effectively. Maybe there's a way around this issue, but it sure won't
just be a pg_am-like API.

> I also think that there are really two separate problems here: getting
> the executor to call a custom scan node when it shows up in the plan
> tree; and figuring out how to get it into the plan tree in the first
> place. I'm not sure we've properly separated those problems, and I'm
> not sure into which category the issues that sunk KaiGai's 9.4 patch
> fell.

I thought that the executor side of his patch wasn't in bad shape. The
real problems were in the planner, and indeed largely in the "backend"
part of the planner where there's a lot of hard-wired logic for fixing up
low-level details of the constructed plan tree. It seems like in
principle it might be possible to make that logic cleanly extensible,
but it'll likely take a major rewrite. The patch tried to skate by with
just exposing a bunch of internal functions, which I don't think is a
maintainable approach, either for the core or for the extensions using it.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 20:55:11
Message-ID: 29518.1399582511@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On 8 May 2014 20:40, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> For my money, we'd be better off
>> getting some kind of basic custom scan node functionality committed
>> first, even if the cases where you can actually inject them into real
>> plans are highly restricted. Then, we could later work on adding more
>> ways to inject them in more places.

> We're past the prototyping stage and into productionising what we know
> works, AFAIK. If that point is not clear, then we need to discuss that
> first.

OK, I'll bite: what here do we know works? Not a damn thing AFAICS;
it's all speculation that certain hooks might be useful, and speculation
that's not supported by a lot of evidence. If you think this isn't
prototyping, I wonder what you think *is* prototyping.

It seems likely to me that our existing development process is not
terribly well suited to developing a good solution in this area.
We need to be able to try some things and throw away what doesn't
work; but the project's mindset is not conducive to throwing features
away once they've appeared in a shipped release. And the other side
of the coin is that trying these things is not inexpensive: you have
to write some pretty serious code before you have much of a feel for
whether a planner hook API is actually any good. So by the time
you've built something of the complexity of, say, contrib/postgres_fdw,
you don't really want to throw that away in the next major release.
And that's at the bottom end of the scale of the amount of work that'd
be needed to do anything with the sorts of interfaces we're discussing.

So I'm not real sure how we move forward. Maybe something to brainstorm
about in Ottawa.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndQuadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 21:11:05
Message-ID: 29865.1399583465@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> We only consider merge joins if the join uses operators with oprcanmerge=true.
> We only consider hash joins if the join uses operators with oprcanhash=true

> So it seems reasonable to have a way to define/declare what is
> possible and what is not. But my take is that adding a new column to
> pg_operator for every CustomJoin node is probably out of the question,
> hence my suggestion to list the operators we know it can work with.

For what that's worth, I'm not sure that either the oprcanmerge or
oprcanhash columns really pull their weight. We could dispense with both
at the cost of doing some wasted lookups in pg_amop. (Perhaps we should
replace them with a single "oprisequality" column, which would amount to
a hint that it's worth looking for hash or merge properties, or for other
equality-ish properties in future.)

So I think something comparable to an operator class is indeed a better
approach than adding more columns to pg_operator. Other than the
connection to pg_am, you could pretty nearly just use the operator class
infrastructure as-is for a lot of operator-property things like this.
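
For reference, the existing gating boils down to a couple of cheap
catalog-backed property tests (the lsyscache helpers are real; the
wrapper is just a sketch of the shape a custom join type would need an
analogue of):

#include "postgres.h"
#include "utils/lsyscache.h"

/* Sketch: how the planner currently decides whether a built-in join
 * strategy is even worth costing for a given operator and input type. */
bool
builtin_join_strategy_possible(Oid opno, Oid inputtype)
{
    if (op_mergejoinable(opno, inputtype))    /* consults oprcanmerge */
        return true;
    if (op_hashjoinable(opno, inputtype))     /* consults oprcanhash */
        return true;
    return false;
}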

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 21:24:47
Message-ID: CA+U5nM+qmAayK6C5qMD3OrGrY+TACBg6EY36aTOWyXYbFvP5Sg@mail.gmail.com
Lists: pgsql-hackers

On 8 May 2014 21:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> So I'm not real sure how we move forward. Maybe something to brainstorm
> about in Ottawa.

I'm just about to go away for a week, so that's probably the best
place to leave (me out of) the discussion until Ottawa.

I've requested some evidence from my contacts that this hardware route
is worthwhile, so we'll see what we get. Presumably Kaigai has something
to share already also.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-08 21:37:25
Message-ID: CAM3SWZR705rqmbcTZSc4R2aeswKWFnkyy=1pdSW5mFXc9OWXvw@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 8, 2014 at 6:34 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> Umm... I'm now missing the direction towards my goal.
> What approach is the best way to glue PostgreSQL and PGStrom?

I haven't really paid any attention to PGStrom. Perhaps it's just that
I missed it, but I would find it useful if you could direct me towards
a benchmark or something like that, that demonstrates a representative
scenario in which the facilities that PGStrom offers are compelling
compared to traditional strategies already implemented in Postgres and
other systems.

If I wanted to make joins faster, personally, I would look at
opportunities to optimize our existing hash joins to take better
advantage of modern CPU characteristics. A lot of the research
suggests that it may be useful to implement techniques that take
better advantage of available memory bandwidth through techniques like
prefetching and partitioning, perhaps even (counter-intuitively) at
the expense of compute bandwidth. It's possible that it just needs to
be explained to me, but, with respect, intuitively I have a hard time
imagining that offloading joins to the GPU will help much in the
general case. Every paper on joins from the last decade talks a lot
about memory bandwidth and memory latency. Are you concerned with some
specific case that I may have missed? In what scenario might a
cost-based optimizer reasonably prefer a custom join node implemented
by PgStrom, over any of the existing join node types? It's entirely
possible that I simply missed relevant discussions here.
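
(To make the prefetching idea concrete, here is a toy illustration, not
PostgreSQL code: a hash probe loop that issues prefetches a few probes
ahead, trading a little compute for better hiding of memory latency.)

#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint32_t    key;
    uint32_t    payload;
} Entry;

/* Toy illustration: count probes that hit a matching key in an
 * open-addressed table of size (mask + 1), prefetching a fixed
 * distance ahead of the current probe. */
size_t
probe_count_matches(const Entry *table, size_t mask,
                    const uint32_t *hashes, size_t n)
{
    const size_t prefetch_dist = 8;    /* tuned empirically in practice */
    size_t matches = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i + prefetch_dist < n)
            __builtin_prefetch(&table[hashes[i + prefetch_dist] & mask]);

        if (table[hashes[i] & mask].key == hashes[i])
            matches++;
    }
    return matches;
}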

--
Peter Geoghegan


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 00:27:31
Message-ID: CA+TgmoaKM9+G=K9Z56_RDp0ffAGU9iq-j_FLdOZVMK1YS2pjbg@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 8, 2014 at 4:43 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I thought that the executor side of his patch wasn't in bad shape. The
> real problems were in the planner, and indeed largely in the "backend"
> part of the planner where there's a lot of hard-wired logic for fixing up
> low-level details of the constructed plan tree. It seems like in
> principle it might be possible to make that logic cleanly extensible,
> but it'll likely take a major rewrite. The patch tried to skate by with
> just exposing a bunch of internal functions, which I don't think is a
> maintainable approach, either for the core or for the extensions using it.

Well, I consider that somewhat good news, because I think it would be
rather nice if we could get by with solving one problem at a time, and
if the executor part is close to being well-solved, excellent.

My ignorance is probably showing here, but I guess I don't understand
why it's so hard to deal with the planner side of things. My
perhaps-naive impression is that a Seq Scan node, or even an Index
Scan node, is not all that complicated. If we just want to inject
some more things that behave a lot like those into various baserels, I
guess I don't understand why that's especially hard.

Now I do understand that part of what KaiGai wants to do here is
inject custom scan paths as additional paths for *joinrels*. And I
can see why that would be somewhat more complicated. But I also don't
see why that's got to be part of the initial commit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Kohei KaiGai" <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 01:10:37
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9F925@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> > I also think that there are really two separate problems here: getting
> > the executor to call a custom scan node when it shows up in the plan
> > tree; and figuring out how to get it into the plan tree in the first
> > place. I'm not sure we've properly separated those problems, and I'm
> > not sure into which category the issues that sunk KaiGai's 9.4 patch
> > fell.
>
> I thought that the executor side of his patch wasn't in bad shape. The
> real problems were in the planner, and indeed largely in the "backend"
> part of the planner where there's a lot of hard-wired logic for fixing up
> low-level details of the constructed plan tree. It seems like in principle
> it might be possible to make that logic cleanly extensible, but it'll likely
> take a major rewrite. The patch tried to skate by with just exposing a
> bunch of internal functions, which I don't think is a maintainable approach,
> either for the core or for the extensions using it.
>
(I'm now trying to catch up on the discussion from last night...)

I initially intended to allow extensions to add their custom paths based
on their own arbitrary decisions, because the core backend cannot have
any expectations about the behavior of a custom plan.
However, of course, a custom path that replaces built-in paths must
behave compatibly in spite of its different implementation.

So, I'm inclined towards the direction that the custom-plan provider
informs the core backend what it can do, and the planner gives extensions
more practical information to construct the custom path node.
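
As a minimal sketch of that shape for the scan side (assuming the
existing set_rel_pathlist_hook as the entry point; provider_can_handle
is a hypothetical placeholder for however capabilities get declared):

#include "postgres.h"
#include "fmgr.h"
#include "optimizer/paths.h"

PG_MODULE_MAGIC;

static set_rel_pathlist_hook_type prev_set_rel_pathlist = NULL;

/* Hypothetical capability test; cheap, so declining costs almost nothing */
static bool
provider_can_handle(RelOptInfo *rel, RangeTblEntry *rte)
{
    return false;    /* placeholder */
}

static void
my_set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                    Index rti, RangeTblEntry *rte)
{
    if (prev_set_rel_pathlist)
        prev_set_rel_pathlist(root, rel, rti, rte);

    if (!provider_can_handle(rel, rte))
        return;    /* decline early; the planner keeps control */

    /* ... construct a custom scan path here and add_path() it ... */
}

void
_PG_init(void)
{
    prev_set_rel_pathlist = set_rel_pathlist_hook;
    set_rel_pathlist_hook = my_set_rel_pathlist;
}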

Let me investigate how to handle join replacement by custom paths in the
planner stage.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 01:22:05
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9F993@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> On Thu, May 8, 2014 at 6:34 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > Umm... I'm now missing the direction towards my goal.
> > What approach is the best way to glue PostgreSQL and PGStrom?
>
> I haven't really paid any attention to PGStrom. Perhaps it's just that I
> missed it, but I would find it useful if you could direct me towards a
> benchmark or something like that, that demonstrates a representative
> scenario in which the facilities that PGStrom offers are compelling compared
> to traditional strategies already implemented in Postgres and other
> systems.
>
The implementation of hash-join on the GPU side is still under development.

The only use case available right now is an alternative scan path that
replaces a full table scan, for cases where a table contains a massive
number of records and the qualifiers are complicated enough.

The EXPLAIN output below shows a sequential scan on a table that contains
80M records (all of them in memory; no disk accesses during execution).
Nvidia's GT640 has the advantage over a single-threaded Core i5 4570S, at
least.

postgres=# explain (analyze) select count(*) from t1 where sqrt((x-20.0)^2 + (y-20.0)^2) < 10;
                                                                 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=10003175757.67..10003175757.68 rows=1 width=0) (actual time=46648.635..46648.635 rows=1 loops=1)
   ->  Seq Scan on t1  (cost=10000000000.00..10003109091.00 rows=26666667 width=0) (actual time=0.047..46351.567 rows=2513814 loops=1)
         Filter: (sqrt((((x - 20::double precision) ^ 2::double precision) + ((y - 20::double precision) ^ 2::double precision))) < 10::double precision)
         Rows Removed by Filter: 77486186
 Planning time: 0.066 ms
 Total runtime: 46648.668 ms
(6 rows)
postgres=# set pg_strom.enabled = on;
SET
postgres=# explain (analyze) select count(*) from t1 where sqrt((x-20.0)^2 + (y-20.0)^2) < 10;
                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1274424.33..1274424.34 rows=1 width=0) (actual time=1784.729..1784.729 rows=1 loops=1)
   ->  Custom (GpuScan) on t1  (cost=10000.00..1207757.67 rows=26666667 width=0) (actual time=179.748..1567.018 rows=2513699 loops=1)
         Host References:
         Device References: x, y
         Device Filter: (sqrt((((x - 20::double precision) ^ 2::double precision) + ((y - 20::double precision) ^ 2::double precision))) < 10::double precision)
         Total time to load: 0.231 ms
         Avg time in send-mq: 0.027 ms
         Max time to build kernel: 1.064 ms
         Avg time of DMA send: 3.050 ms
         Total time of DMA send: 933.318 ms
         Avg time of kernel exec: 5.117 ms
         Total time of kernel exec: 1565.799 ms
         Avg time of DMA recv: 0.086 ms
         Total time of DMA recv: 26.289 ms
         Avg time in recv-mq: 0.011 ms
 Planning time: 0.094 ms
 Total runtime: 1784.793 ms
(17 rows)

> If I wanted to make joins faster, personally, I would look at opportunities
> to optimize our existing hash joins to take better advantage of modern CPU
> characteristics. A lot of the research suggests that it may be useful to
> implement techniques that take better advantage of available memory
> bandwidth through techniques like prefetching and partitioning, perhaps
> even (counter-intuitively) at the expense of compute bandwidth. It's
> possible that it just needs to be explained to me, but, with respect,
> intuitively I have a hard time imagining that offloading joins to the GPU
> will help much in the general case. Every paper on joins from the last decade
> talks a lot about memory bandwidth and memory latency. Are you concerned
> with some specific case that I may have missed? In what scenario might a
> cost-based optimizer reasonably prefer a custom join node implemented by
> PgStrom, over any of the existing join node types? It's entirely possible
> that I simply missed relevant discussions here.
>
If our purpose were to consume 100% of a GPU device's capacity, memory
bandwidth would be troublesome. But I'm not interested in GPU benchmarking.
What I want to do is accelerate complicated query processing beyond what
existing RDBMSs offer, in a way that is cheap in cost and transparent to
existing applications.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 01:40:19
Message-ID: 20140509014019.GC2556@tamriel.snowman.net
Lists: pgsql-hackers

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> On 8 May 2014 20:40, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > For my money, we'd be better off
> > getting some kind of basic custom scan node functionality committed
> > first, even if the cases where you can actually inject them into real
> > plans are highly restricted. Then, we could later work on adding more
> > ways to inject them in more places.
>
> We're past the prototyping stage and into productionising what we know
> works, AFAIK. If that point is not clear, then we need to discuss that
> first.
>
> At the moment the Custom join hook is called every time we attempt to
> cost a join, with no restriction.
>
> I would like to highly restrict this, so that we only consider a
> CustomJoin node when we have previously said one might be usable and
> the user has requested this (e.g. enable_foojoin = on)

This is part of what I disagree with- I'd rather not require users to
know and understand when they want to do a HashJoin vs. a MergeJoin vs.
a CustomJoinTypeX.

> We only consider merge joins if the join uses operators with oprcanmerge=true.
> We only consider hash joins if the join uses operators with oprcanhash=true

I wouldn't consider those generally "user-facing" options, and the
enable_X counterparts are intended for debugging and not to be used in
production environments. To the point you make in the other thread- I'm
fine w/ having similar cost-based enable_X options, but I think we're
doing our users a disservice if we require that they populate or update
a table. Perhaps an extension needs to do that on installation, but
that would need to enable everything to avoid the user having to mess
around with the table.

> So it seems reasonable to have a way to define/declare what is
> possible and what is not. But my take is that adding a new column to
> pg_operator for every CustomJoin node is probably out of the question,
> hence my suggestion to list the operators we know it can work with.

It does seem like there should be some work done in this area, as Tom
mentioned, to avoid needing to have more columns to track how equality
can be done. I do wonder just how we'd deal with this when it comes to
GPUs as, presumably, the code to implement the equality for various
types would have to be written in CUDA-or-whatever.

> Given that everything else in Postgres is agnostic and configurable,
> I'm looking to do the same here.

It's certainly a neat idea, but I do have concerns (which appear to be
shared by others) about just how practical it'll be and how much rework
it'd take and the question about if it'd really be used in the end..

Thanks,

Stephen


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andres Freund" <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 01:56:29
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9FA11@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> > So it seems reasonable to have a way to define/declare what is
> > possible and what is not. But my take is that adding a new column to
> > pg_operator for every CustomJoin node is probably out of the question,
> > hence my suggestion to list the operators we know it can work with.
>
> It does seem like there should be some work done in this area, as Tom mentioned,
> to avoid needing to have more columns to track how equality can be done.
> I do wonder just how we'd deal with this when it comes to GPUs as, presumably,
> the code to implement the equality for various types would have to be written
> in CUDA-or-whatever.
>
GPUs have workloads they like and dislike. It is a reasonable idea to list
the operators (or something else) that have an advantage when run on a
custom path. For example, numeric calculation on fixed-length variables has
a great advantage on GPUs, but locale-aware text matching is not a workload
suitable for them.
It may be a good hint for the planner when picking up candidate paths to be
considered.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:13:23
Message-ID: 20140509021322.GE2556@tamriel.snowman.net
Lists: pgsql-hackers

* Peter Geoghegan (pg(at)heroku(dot)com) wrote:
> On Thu, May 8, 2014 at 6:34 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > Umm... I'm now missing the direction towards my goal.
> > What approach is the best way to glue PostgreSQL and PGStrom?
>
> I haven't really paid any attention to PGStrom. Perhaps it's just that
> I missed it, but I would find it useful if you could direct me towards
> a benchmark or something like that, that demonstrates a representative
> scenario in which the facilities that PGStrom offers are compelling
> compared to traditional strategies already implemented in Postgres and
> other systems.

I agree that some concrete evidence would be really nice. I
more-or-less took KaiGai's word on it, but having actual benchmarks
would certainly be better.

> If I wanted to make joins faster, personally, I would look at
> opportunities to optimize our existing hash joins to take better
> advantage of modern CPU characteristics.

Yeah, I'm pretty confident we're leaving a fair bit on the table right
there based on my previous investigation into this area. There were
cases which easily showed a 3x improvement, as I recall (the trade-off
being increased memory usage for a larger, sparser hash table). Sadly,
there were also cases which ended up being worse, and it seemed to be
very sensitive to the size of the hash table which ends up being built
and the size of the on-CPU cache.

> A lot of the research
> suggests that it may be useful to implement techniques that take
> better advantage of available memory bandwidth through techniques like
> prefetching and partitioning, perhaps even (counter-intuitively) at
> the expense of compute bandwidth.

While I agree with this, one of the big things about GPUs is that they
operate in a highly parallel fashion and across a different CPU/Memory
architecture than what we're used to (for starters, everything is much
"closer"). In a traditional memory system, there's a lot of back and
forth to memory, but a single memory dump over to the GPU's memory where
everything is processed in a highly parallel way and then shipped back
wholesale to main memory is at least conceivably faster.

Of course, things will change when we are able to parallelize joins
across multiple CPUs ourselves.. In a way, the PGStrom approach gets to
"cheat" us today, since it can parallelize the work where core can't and
that ends up not being an entirely fair comparison.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:16:24
Message-ID: 20140509021624.GF2556@tamriel.snowman.net
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> Well, I consider that somewhat good news, because I think it would be
> rather nice if we could get by with solving one problem at a time, and
> if the executor part is close to being well-solved, excellent.

Sadly, I'm afraid the news really isn't all that good in the end..

> My ignorance is probably showing here, but I guess I don't understand
> why it's so hard to deal with the planner side of things. My
> perhaps-naive impression is that a Seq Scan node, or even an Index
> Scan node, is not all that complicated. If we just want to inject
> some more things that behave a lot like those into various baserels, I
> guess I don't understand why that's especially hard.

That's not what is being asked for here though...

> Now I do understand that part of what KaiGai wants to do here is
> inject custom scan paths as additional paths for *joinrels*. And I
> can see why that would be somewhat more complicated. But I also don't
> see why that's got to be part of the initial commit.

I'd say it's more than "part" of what the goal is here- it's more or
less what everything boils down to. Oh, plus being able to replace
aggregates with a GPU-based operation instead, but that's not a
trivially done thing either, really (if it is, let's get it done for
FDWs already...).

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:18:46
Message-ID: 20140509021846.GG2556@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
> I initially intended to allow extensions to add their custom paths based
> on their own arbitrary decisions, because the core backend cannot have
> any expectations about the behavior of a custom plan.
> However, of course, a custom path that replaces built-in paths must
> behave compatibly in spite of its different implementation.

I didn't ask this before but it's been on my mind for a while- how will
this work for custom data types, ala the 'geometry' type from PostGIS?
There's user-provided code that we have to execute to check equality for
those, but they're not giving us CUDA code to run to perform that
equality...

Thanks,

Stephen


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:26:58
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9FA87@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> * Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
> > I initially intended to allow extensions to add their custom paths
> > based on their own arbitrary decisions, because the core backend
> > cannot have any expectations about the behavior of a custom plan.
> > However, of course, a custom path that replaces built-in paths must
> > behave compatibly in spite of its different implementation.
>
> I didn't ask this before but it's been on my mind for a while- how will
> this work for custom data types, ala the 'geometry' type from PostGIS?
> There's user-provided code that we have to execute to check equality for
> those, but they're not giving us CUDA code to run to perform that equality...
>
If a custom-plan provider supports user-defined data types such as PostGIS,
it will be able to pick up these data types as well, in addition to built-in
ones. It fully depends on the coverage of the extension.
If a data type is not supported, it is not show-time for the GPU.

In PG-Strom's case, if it also has compatible code for these types, runnable
on OpenCL, it will say "yes, I can handle this data type".

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:27:56
Message-ID: 20140509022755.GH2556@tamriel.snowman.net
Lists: pgsql-hackers

* Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
> GPUs have workloads they like and dislike. It is a reasonable idea to list
> the operators (or something else) that have an advantage when run on a
> custom path. For example, numeric calculation on fixed-length variables has
> a great advantage on GPUs, but locale-aware text matching is not a workload
> suitable for them.

Right- but this points out exactly what I was trying to bring up.

Locale-aware text matching requires running libc-provided code, which
isn't going to happen on the GPU (unless we re-implement it...).
Aren't we going to have the same problem with the 'numeric' type? Our
existing functions won't be usable on the GPU and we'd have to
re-implement them and then make darn sure that they produce the same
results...

We'll also have to worry about any cases where we have a libc function
and a CUDA function and convince ourselves that there's no difference
between the two.. Not sure exactly how we'd build this kind of
knowledge into the system through a catalog (I tend to doubt that'd
work, in fact) and trying to make it work from an extension in a way
that it would work with *other* extensions strikes me as highly
unlikely. Perhaps the extension could provide the core types and the
other extensions could provide their own bits to hook into the right
places, but that sure seems fragile.

Thanks,

Stephen


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:29:19
Message-ID: 20140509022919.GI2556@tamriel.snowman.net
Lists: pgsql-hackers

* Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
> > I didn't ask this before but it's been on my mind for a while- how will
> > this work for custom data types, ala the 'geometry' type from PostGIS?
> > There's user-provided code that we have to execute to check equality for
> > those, but they're not giving us CUDA code to run to perform that equality...
> >
> If a custom-plan provider supports user-defined data types such as PostGIS,
> it will be able to pick up these data types as well, in addition to built-in
> ones. It fully depends on the coverage of the extension.
> If a data type is not supported, it is not show-time for the GPU.

So the extension will need to be aware of all custom data types and then
installed *after* all other extensions are installed? That doesn't
strike me as workable...

Thanks,

Stephen


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:33:33
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8F9FAD6@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> * Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
> > > I didn't ask this before but it's been on my mind for a while- how
> > > will this work for custom data types, ala the 'geometry' type from
> PostGIS?
> > > There's user-provided code that we have to execute to check equality
> > > for those, but they're not giving us CUDA code to run to perform that
> equality...
> > >
> > If a custom-plan provider supports user-defined data types such as
> > PostGIS, it will be able to pick up these data types as well, in addition
> > to built-in ones. It fully depends on the coverage of the extension.
> > If a data type is not supported, it is not show-time for the GPU.
>
> So the extension will need to be aware of all custom data types and then
> installed *after* all other extensions are installed? That doesn't strike
> me as workable...
>
I'm not certain why you think an extension will need to support all
the data types.
Even if it works only for a particular set of data types, it makes sense
as long as it covers the data types users are actually using.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:39:23
Message-ID: CAM3SWZScsapa8OAuVopYqXZm=PfG_9MXicCZ88D7XtAiBZNR_A@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 8, 2014 at 7:13 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Of course, things will change when we are able to parallelize joins
> across multiple CPUs ourselves.. In a way, the PGStrom approach gets to
> "cheat" us today, since it can parallelize the work where core can't and
> that ends up not being an entirely fair comparison.

I was thinking of SIMD, along similar lines. We might be able to cheat
our way out of having to solve some of the difficult problems of
parallelism that way. For example, if you can build a SIMD-friendly
bitonic mergesort, and combine that with poor man's normalized keys,
that could make merge joins on text faster. That's pure speculation,
but it seems like an interesting possibility.

--
Peter Geoghegan


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:43:57
Message-ID: 20140509024357.GJ2556@tamriel.snowman.net
Lists: pgsql-hackers

* Kouhei Kaigai (kaigai(at)ak(dot)jp(dot)nec(dot)com) wrote:
> > So the extension will need to be aware of all custom data types and then
> > installed *after* all other extensions are installed? That doesn't strike
> > me as workable...
> >
> I'm not certain why you think an extension will need to support all
> the data types.

Mostly because we have a very nice extension system which quite a few
different extensions make use of and it'd be pretty darn unfortunate if
none of them could take advantage of GPUs because we decided that the
right way to support GPUs was through an extension.

This is an argument which might be familiar to some as it was part of the
reason that json and jsonb were added to core, imv...

> Even if it works only for a particular set of data types, it makes sense
> as long as it covers the data types users are actually using.

I know quite a few users of PostGIS, ip4r, and hstore...

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 02:46:57
Message-ID: CA+TgmobhuF3B4EsbwRNjq40Kiwd_uHAiKdjoY_ec2nCeAf103w@mail.gmail.com
Lists: pgsql-hackers

On Thu, May 8, 2014 at 10:16 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> Well, I consider that somewhat good news, because I think it would be
>> rather nice if we could get by with solving one problem at a time, and
>> if the executor part is close to being well-solved, excellent.
>
> Sadly, I'm afraid the news really isn't all that good in the end..
>
>> My ignorance is probably showing here, but I guess I don't understand
>> why it's so hard to deal with the planner side of things. My
>> perhaps-naive impression is that a Seq Scan node, or even an Index
>> Scan node, is not all that complicated. If we just want to inject
>> some more things that behave a lot like those into various baserels, I
>> guess I don't understand why that's especially hard.
>
> That's not what is being asked for here though...

I am not sure what your point is here. Here's mine: if we can strip
this down to the executor support plus the most minimal planner
support possible, we might be able to get *something* committed. Then
we can extend it in subsequent commits.

You seem to be saying there's no value in getting anything committed
unless it handles the scan-substituting-for-join case. I don't agree.
Incremental commits are good, whether they get you all the way to
where you want to be or not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-09 03:08:10
Message-ID: 20140509030810.GK2556@tamriel.snowman.net
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> I am not sure what your point is here. Here's mine: if we can strip
> this down to the executor support plus the most minimal planner
> support possible, we might be able to get *something* committed. Then
> we can extend it in subsequent commits.

I guess my point is that I see this more-or-less being solved already by
FDWs, but that doesn't address the case when it's a local table, so
perhaps there is something useful out of a commit that allows
replacement of a SeqScan node (which presumably would also be costed
differently).

> You seem to be saying there's no value in getting anything committed
> unless it handles the scan-substituting-for-join case. I don't agree.
> Incremental commits are good, whether they get you all the way to
> where you want to be or not.

To be honest, I think this is really the first proposal to replace
specific Nodes, rather than provide a way for a generic Node to exist
(which could also replace joins). While I do think it's an interesting
idea, and if we could push filters down to this new Node it might even
be worthwhile, I'm not sure that it actually moves us down the path to
supporting Nodes which replace joins.

Still, I'm not against it.

Thanks,

Stephen


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-10 11:21:08
Message-ID: CA+U5nMJFpqJegHTNT+O6BH3QF90swTkPztGS1DkT-DgFojox2g@mail.gmail.com
Lists: pgsql-hackers

On 9 May 2014 02:40, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
>> On 8 May 2014 20:40, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> > For my money, we'd be better off
>> > getting some kind of basic custom scan node functionality committed
>> > first, even if the cases where you can actually inject them into real
>> > plans are highly restricted. Then, we could later work on adding more
>> > ways to inject them in more places.
>>
>> We're past the prototyping stage and into productionising what we know
>> works, AFAIK. If that point is not clear, then we need to discuss that
>> first.
>>
>> At the moment the Custom join hook is called every time we attempt to
>> cost a join, with no restriction.
>>
>> I would like to highly restrict this, so that we only consider a
>> CustomJoin node when we have previously said one might be usable and
>> the user has requested this (e.g. enable_foojoin = on)
>
> This is part of what I disagree with- I'd rather not require users to
> know and understand when they want to do a HashJoin vs. a MergeJoin vs.
> a CustomJoinTypeX.

Again, I have *not* said users should know that.

>> We only consider merge joins if the join uses operators with oprcanmerge=true.
>> We only consider hash joins if the join uses operators with oprcanhash=true
>
> I wouldn't consider those generally "user-facing" options, and the
> enable_X counterparts are intended for debugging and not to be used in
> production environments. To the point you make in the other thread- I'm
> fine w/ having similar cost-based enable_X options, but I think we're
> doing our users a disservice if we require that they populate or update
> a table. Perhaps an extension needs to do that on installation, but
> that would need to enable everything to avoid the user having to mess
> around with the table.

Again, I did *not* say those should be user facing options, nor that
they be set at table-level.

What I have said is that authors of CustomJoins or CustomScans should
declare in advance via system catalogs which operators their new code
works with so that Postgres knows when it is appropriate to call them.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-10 20:02:39
Message-ID: 20140510200239.GT2556@tamriel.snowman.net
Lists: pgsql-hackers

* Simon Riggs (simon(at)2ndQuadrant(dot)com) wrote:
> What I have said is that authors of CustomJoins or CustomScans should
> declare in advance via system catalogs which operators their new code
> works with so that Postgres knows when it is appropriate to call them.

I guess I just took that as given, since the discussion has been about
GPUs and there will have to be new operators, given that there will be
different code (CUDA-or-whatever GPU-language code).

Thanks,

Stephen


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-11 08:22:53
Message-ID: CA+U5nMJx_+HM3DvpYvfb7shAd8pG4UR1_YSx_eYkC8+wU7tYVQ@mail.gmail.com
Lists: pgsql-hackers

On 8 May 2014 22:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>> We're past the prototyping stage and into productionising what we know
>> works, AFAIK. If that point is not clear, then we need to discuss that
>> first.
>
> OK, I'll bite: what here do we know works? Not a damn thing AFAICS;
> it's all speculation that certain hooks might be useful, and speculation
> that's not supported by a lot of evidence. If you think this isn't
> prototyping, I wonder what you think *is* prototyping.

My research contacts advise me of this recent work
http://www.ntu.edu.sg/home/bshe/hashjoinonapu_vldb13.pdf
and also that they expect a prototype to be ready by October, which I
have been told will be open source.

So there are at least two groups looking at this as a serious option
for Postgres (not including the above paper's authors).

That isn't *now*, but it is at least a time scale that fits with
acting on this in the next release, if we can separate out the various
ideas and agree we wish to proceed.

I'll submerge again...

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Stephen Frost" <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Kohei KaiGai" <kaigai(at)kaigai(dot)gr(dot)jp>
Subject: Re: [v9.5] Custom Plan API
Date: 2014-05-12 01:09:17
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8FA075E@BPXM15GP.gisp.nec.co.jp
Lists: pgsql-hackers

> On 8 May 2014 22:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> >> We're past the prototyping stage and into productionising what we
> >> know works, AFAIK. If that point is not clear, then we need to
> >> discuss that first.
> >
> > OK, I'll bite: what here do we know works? Not a damn thing AFAICS;
> > it's all speculation that certain hooks might be useful, and
> > speculation that's not supported by a lot of evidence. If you think
> > this isn't prototyping, I wonder what you think *is* prototyping.
>
> My research contacts advise me of this recent work
> http://www.ntu.edu.sg/home/bshe/hashjoinonapu_vldb13.pdf
> and also that they expect a prototype to be ready by October, which I have
> been told will be open source.
>
> So there are at least two groups looking at this as a serious option for
> Postgres (not including the above paper's authors).
>
> That isn't *now*, but it is at least a time scale that fits with acting
> on this in the next release, if we can separate out the various ideas and
> agree we wish to proceed.
>
> I'll submerge again...
>
Through the discussion last week, our minimum consensus is:
1. An unrestricted enhancement of FDW is not the way to go.
2. A custom path that can replace a built-in scan makes sense as a first
   step towards future enhancement. Its planner integration is simple
   enough to do.
3. A custom path that can replace a built-in join requires investigation
   into how to integrate it with the existing planner structure, to avoid
   (3a) reinventing the whole of join handling on the extension side, and
   (3b) unnecessary extension calls in cases that are obviously
   unsupported.

So, I'd like to start on the (2) portion towards the upcoming 1st
commit-fest of the v9.5 development cycle. We will also be able to
discuss the (3) portion concurrently, probably towards the 2nd
commit-fest.

Unfortunately, I cannot participate in PGCon/Ottawa this year. Please
share the face-to-face discussion there with us here.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>