Re: [COMMITTERS] pgsql: If we expect a hash join to be performed in multiple batches,

From: tgl(at)postgresql(dot)org (Tom Lane)
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: If we expect a hash join to be performed in multiple batches,
Date: 2009-03-26 17:15:35
Message-ID: 20090326171535.6B9BF754ADE@cvs.postgresql.org
Lists: pgsql-committers pgsql-hackers

Log Message:
-----------
If we expect a hash join to be performed in multiple batches, suppress
"physical tlist" optimization on the outer relation (ie, force a projection
step to occur in its scan). This avoids storing useless column values when
the outer relation's tuples are written to temporary batch files.

Modified version of a patch by Michael Henderson and Ramon Lawrence.

Modified Files:
--------------
    pgsql/src/backend/nodes:
        outfuncs.c (r1.355 -> r1.356)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/nodes/outfuncs.c?r1=1.355&r2=1.356)
    pgsql/src/backend/optimizer/path:
        costsize.c (r1.205 -> r1.206)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/path/costsize.c?r1=1.205&r2=1.206)
    pgsql/src/backend/optimizer/plan:
        createplan.c (r1.256 -> r1.257)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/plan/createplan.c?r1=1.256&r2=1.257)
    pgsql/src/backend/optimizer/util:
        pathnode.c (r1.150 -> r1.151)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/optimizer/util/pathnode.c?r1=1.150&r2=1.151)
    pgsql/src/include/nodes:
        relation.h (r1.170 -> r1.171)
        (http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/include/nodes/relation.h?r1=1.170&r2=1.171)


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)postgresql(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: If we expect a hash join to be performed in multiple batches,
Date: 2009-04-02 17:11:23
Message-ID: 1238692283.5444.123.camel@ebony.2ndQuadrant
Lists: pgsql-committers pgsql-hackers


On Thu, 2009-03-26 at 17:15 +0000, Tom Lane wrote:
> Log Message:
> -----------
> If we expect a hash join to be performed in multiple batches, suppress
> "physical tlist" optimization on the outer relation (ie, force a projection
> step to occur in its scan). This avoids storing useless column values when
> the outer relation's tuples are written to temporary batch files.

Can we add "batches=N" to the EXPLAIN output for Hash and/or Hash Join?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: If we expect a hash join to be performed in multiple batches,
Date: 2009-04-02 19:49:59
Message-ID: 20202.1238701799@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> Can we add "batches=N" to the EXPLAIN output for Hash and/or Hash Join?

Are you talking about expected batches, or actual batches? If the
former, would it be sufficient to distinguish "1" from "more than 1"?
If so, maybe changing the node title to "Batched Hash" would do.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [COMMITTERS] pgsql: If we expect a hash join to be performed in multiple batches,
Date: 2009-04-02 20:32:56
Message-ID: 1238704376.5444.142.camel@ebony.2ndQuadrant
Lists: pgsql-committers pgsql-hackers


On Thu, 2009-04-02 at 15:49 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > Can we add "batches=N" to the EXPLAIN output for Hash and/or Hash Join?
>
> Are you talking about expected batches, or actual batches?

Expected batches for EXPLAIN, both for EXPLAIN ANALYZE.

> If the
> former, would it be sufficient to distinguish "1" from "more than 1"?
> If so, maybe changing the node title to "Batched Hash" would do.

Hmmm, knowing the number of batches is beneficial since it helps you to
calculate the required memory to get best performance.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
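

[Editorial note, not part of the original thread.] The point about using
the batch count to size memory can be sketched as rough arithmetic. The
sketch assumes that each batch fills at most the current work_mem, so that
nbatch * work_mem is an upper-bound estimate of the memory needed to run
the join in a single batch. The function name and the figures are
hypothetical; nothing here is PostgreSQL code, and the real executor's
accounting is more involved.

```python
def single_batch_work_mem_kb(work_mem_kb: int, nbatch: int) -> int:
    """Estimate the work_mem (in kB) needed to run a hash join in one batch.

    Assumption (not taken from the thread): each of the nbatch batches
    fills at most work_mem_kb, so the product is an upper-bound estimate.
    """
    if work_mem_kb <= 0 or nbatch < 1:
        raise ValueError("work_mem_kb must be positive and nbatch at least 1")
    return work_mem_kb * nbatch


if __name__ == "__main__":
    # Hypothetical figures: work_mem of 1 MB and a reported batch count of 8.
    print(single_batch_work_mem_kb(1024, 8))
```

A title such as "Batched Hash" would only tell you that the estimate
exceeds the current setting, not by how much.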