Re: Block nested loop join

Lists: pgsql-hackers
From: "Bramandia Ramadhana" <bramandia(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Block nested loop join
Date: 2008-10-10 03:27:11
Message-ID: 700260640810092027r3b4f5c0aoe3589fc8d54ed50a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all,

I am new to postgresql. I am currently doing research to optimize the query
performance of RDBMS, specifically postgresql. Hence, I am currently reading
out the code to understand the implementation of various query evaluation
algorithm in postgresql.

Currently, I am investigating the nested loop join algorithm in
nodeNestloop.c. After reading the code, my understanding is that it performs
simple nested loop join (not block nested loop join). Is this true? Does
postgresql support block nested loop join? If it does, where is it? I might
miss some buffering/caching mechanism.

I appreciate any helps/advice.

Regards,

Bramandia R.


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Bramandia Ramadhana <bramandia(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Block nested loop join
Date: 2008-10-10 06:41:47
Message-ID: 48EEF92B.4010405@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bramandia Ramadhana wrote:
> Currently, I am investigating the nested loop join algorithm in
> nodeNestloop.c. After reading the code, my understanding is that it performs
> simple nested loop join (not block nested loop join). Is this true?

Yep.

> Does postgresql support block nested loop join?

Nope.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Bramandia Ramadhana <bramandia(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Block nested loop join
Date: 2008-10-10 07:30:45
Message-ID: 874p3koqje.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:

>> Does postgresql support block nested loop join?
>
> Nope.

We do support Hash Join though so I think the only difference is that we can't
use the hash join for cartesian joins.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: "Bramandia Ramadhana" <bramandia(at)gmail(dot)com>
To: "Gregory Stark" <stark(at)enterprisedb(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Block nested loop join
Date: 2008-10-10 08:45:30
Message-ID: 700260640810100145k14610225k36e668be18ea6225@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thanks for the clarifications.

Just for curiosity, is there any reason of not having block nested-loop join
implementation? Is it rarely useful?

As far as I am aware of, in the case of cross product of two tables, block
nested-loop join is the most efficient algorithm.

Regards,

Bramandia R.

On Fri, Oct 10, 2008 at 3:30 PM, Gregory Stark <stark(at)enterprisedb(dot)com>wrote:

>
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>
> >> Does postgresql support block nested loop join?
> >
> > Nope.
>
> We do support Hash Join though so I think the only difference is that we
> can't
> use the hash join for cartesian joins.
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
> Ask me about EnterpriseDB's PostGIS support!
>


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Bramandia Ramadhana" <bramandia(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Block nested loop join
Date: 2008-10-10 09:27:01
Message-ID: 871vyo4x7e.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Bramandia Ramadhana" <bramandia(at)gmail(dot)com> writes:

> Thanks for the clarifications.
>
> Just for curiosity, is there any reason of not having block nested-loop join
> implementation? Is it rarely useful?

Oh, actually it occurs to me that we do implement something analogous to a
degenerate block nested loop where we materialize one side of the nested loop.

It looks "backward" since the materialized side is the "inner" side of the
loop but it's basically equivalent to a block nested loop with the entire
outer side in a single block.

So the use case of a real block nested loop would be doing a cartesian join of
two large tables where neither fits in RAM. That does seem like it might be
kind of narrow given how large the output would be.

But we won't know unless someone does the experiment.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Bramandia Ramadhana" <bramandia(at)gmail(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Block nested loop join
Date: 2008-10-10 12:19:08
Message-ID: 23390.1223641148@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> So the use case of a real block nested loop would be doing a cartesian join of
> two large tables where neither fits in RAM. That does seem like it might be
> kind of narrow given how large the output would be.

Yeah. If you have a hashable join condition then our existing batched
hash join code should be roughly equivalent to this method. So the use
case is joining a couple of large tables with an un-hashable,
un-indexable join condition (or none at all, ie cross product) and that
just isn't something we hear people wanting to do a lot. I can't really
see why we'd bother maintaining extra code for block nested loop.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Bramandia Ramadhana" <bramandia(at)gmail(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Block nested loop join
Date: 2008-10-10 12:32:05
Message-ID: 878wsw3a2i.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> So the use case of a real block nested loop would be doing a cartesian join of
>> two large tables where neither fits in RAM. That does seem like it might be
>> kind of narrow given how large the output would be.
>
> Yeah. If you have a hashable join condition then our existing batched
> hash join code should be roughly equivalent to this method. So the use
> case is joining a couple of large tables with an un-hashable,
> un-indexable join condition (or none at all, ie cross product) and that
> just isn't something we hear people wanting to do a lot. I can't really
> see why we'd bother maintaining extra code for block nested loop.

Hm, I hadn't thought of other non-hashable join conditions.

I wonder how much code it would be though if we just hacked hash join to
support returning the full cartesian product. Ie, instead of doing a hash
lookup do a full scan of the hash and recheck the join condition if any for
every combination.

That seems like it would be a pretty small change to HashJoin and would
effectively support precisely this join type.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning


From: "Bramandia Ramadhana" <bramandia(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gregory Stark" <stark(at)enterprisedb(dot)com>, "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Block nested loop join
Date: 2008-10-13 09:14:18
Message-ID: 700260640810130214t6b99761et34a471dfc90d13a6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dear All,

I took a look at the source code for hash join this morning and I realized
that the block nested loop join is somewhat similar to that.

Thanks for the discussions.

Bramandia R.

On Fri, Oct 10, 2008 at 8:19 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> > So the use case of a real block nested loop would be doing a cartesian
> join of
> > two large tables where neither fits in RAM. That does seem like it might
> be
> > kind of narrow given how large the output would be.
>
> Yeah. If you have a hashable join condition then our existing batched
> hash join code should be roughly equivalent to this method. So the use
> case is joining a couple of large tables with an un-hashable,
> un-indexable join condition (or none at all, ie cross product) and that
> just isn't something we hear people wanting to do a lot. I can't really
> see why we'd bother maintaining extra code for block nested loop.
>
> regards, tom lane
>