a funnel by any other name

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: a funnel by any other name
Date: 2015-09-17 02:39:48
Message-ID: CA+TgmoZr8M2XpxfBoebVoEYUig5nbuV8=W6kGve6DqjhooLaaw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I have discovered that I have reinvented the wheel.

In http://www.postgresql.org/message-id/CA+TgmobM7X6jgre442638b+33h1EWa=vcZqnsvzEdX057ZHVuw@mail.gmail.com
I invented an operator called Funnel, whose job it is to fire up a
bunch of workers and run a plan in all of them, so that we can
eventually generate plans like this:

Funnel
Number of Workers: 4
-> Hash Join
-> Partial Seq Scan on a
-> Hash
-> Seq Scan on b

The idea is that each worker will read all of b and build a hash
table, and about 1/4 of a, and do the join of its portion of a to all
of b, and then the funnel will combine the separate flows of tuples
into one. Funny thing though, it turns out that I'm not the first
person to invent this operator. Greenplum has it, and there it's
called "Gather Motion".[1] However, it seems like the most systems
use the name "Exchange" to refer to a more powerful operator which can
do this thing among others.[2,3]

The Volcano paper[3] is particularly interesting for a couple of
reasons. First, it's about 20 years old. Second, it may be the
earliest reference to the exchange operator; the authors speak as if
they invented it. Third, it describes several variants of exchange:

1. One parent, one child (see p. 11, under A. Vertical Parallelism).
The child can execute in parallel with the parent. This is basically
the same as a Funnel if you had only one worker and didn't ever let
the parent execute the stuff under the funnel.

2. One parent, multiple children (see p. 12, last paragraph). Classic funnel.

3. Multiple parents, multiple children, with a copy to each parent
(see p. 13, under "Variants of the Exchange Operator").

4. Exchange-merge (see p. 13, column two). This is like what I
described as a funnel, but instead of returning all of the results in
the order they're returned, it merges a set of streams into a single
stream.

5. Interchange (see end of p. 13, continuing on p. 14). This is an
operator for "repartitioning" or "distributed shuffle" or whatever you
want to call it. The portion of the plan tree below the operator is
run to generate tuples, which are then consumed by the portion of the
plan tree above the operator. But between the two the tuples are
shuffled back and forth between cooperating workers so that each
worker ends up with the tuples that map to some partition. For
example, you can imagine that below the interchange you might have a
parallel sequential scan of a table with a column x. Let h be a hash
function. The interchange operator will move the tuples around so
that, above the interchange, each worker will see all and only those
tuples where h(x) % number_of_workers = my_worker_number.

This paper basically calls all of these things an Exchange, but
they're clearly all somewhat different from each other, so I'm not
sure it's a good idea to use the name Exchange for our node. At the
same time, I'm not sure it's a good idea to use terminology that I
invented completely out of whole cloth in preference to terminology
that's seems to be somewhat standard. One idea is to call them all
Exchange nodes but with a subtype. For example, using the Volcano's
paper's terminology, we could call these:

1. Exchange Bushy
2. Exchange Inter-Operator (this is what's currently implemented)
3. Exchange Replicate
4. Exchange Merge
5. Interchange

Or taking inspiration from Greenplum, we could go with:

1. ?
2. Gather
3. Broadcast (sorta)
4. Gather Merge
5. Redistribute

Or maybe something like this:

1. Parallel Child
2. Parallel Gather
3. Parallel Replicate
4. Parallel Merge
5. Parallel Redistribute

Or, yet another option, we could combine the similar operators under
one umbrella while keeping the things that are more different as
separate nodes:

1, 2. Exchange (or Gather or Funnel)
3, 5. Distribute (or Redistribute or Interchange or Exchange)
4. Exchange Merge (or Gather Merge or Funnel Merge)

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] http://www.ndm.net/datawarehouse/Greenplum/parallel-query-optimizer
[2] http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00743.1570/html/queryprocessing/CHDHHIIF.htm
[3] http://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-09-17 02:42:42 Re: some pg_rewind usability issues
Previous Message Petr Jelinek 2015-09-17 02:36:11 Re: Sequence Access Method WIP