Re: TupleTableSlot abstraction

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: Re: TupleTableSlot abstraction
Date: 2018-10-16 11:04:41
Message-ID: CAJ3gD9eq38XhLsLTie+3NHsCRkLO0xHLA4MQX_3sr6or7xws4Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, 9 Oct 2018 at 20:46, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> There is still one more regression test failure in polygon.sql which I
> am yet to analyze.

Below is a narrowed-down test case that reproduces the failure in polygon.sql:

create table tab (n int, dist int, id integer);
insert into tab values (1, 2, 3);

-- Force a hash join
set enable_nestloop TO f;
set enable_mergejoin TO f;

-- Expected to return a row
SELECT * FROM tab t1 , tab t2 where t1.id = t2.id;
n | dist | id | n | dist | id
---+------+----+---+------+----
(0 rows)

In MultiExecPrivateHash(), the hash table is built by retrieving
tuples one by one from the outer plan state. For each tuple,
ExecHashGetHashValue() is called to get the tuple's join attribute
value. When that attribute is fetched by a JIT-compiled slot-deforming
function built by slot_compile_deform(), the value that comes back is
junk. So the hash join condition fails and the join returns no rows.

Root cause:

In llvm_compile_expr(), in the EEOP_INNER_FETCHSOME switch case,
innerPlanState(parent)->ps_ResultTupleSlot->tts_cb is passed to
slot_compile_deform(). slot_compile_deform() then uses this tts_cb to
decide whether the inner slot is a minimal tuple slot or a buffer/heap
tuple slot, and computes v_tupleheaderp accordingly, using either
StructMinimalTupleTableSlot or StructHeapTupleTableSlot.

In the above hash join scenario, the ps_ResultTupleSlot is a minimal
tuple slot. But at runtime, when
MultiExecPrivateHash()=>ExecHashGetHashValue() is called, the slot
returned by the outer node (SeqScan) is a buffer heap tuple slot. This
is because the seq scan does not return its ps_ResultTupleSlot; since
no projection is needed, it returns its scan slot directly. Hence the
tuple is fetched from the wrong offset inside the tuple table slot,
because the JIT function was compiled on the assumption that it would
be given a minimal tuple slot.
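
To make the mismatch concrete, here is a toy model in plain C. It is
only a sketch, not PostgreSQL or LLVM code: every Toy*/toy_* name is
made up for illustration, and the "compiled" deform function is modeled
as nothing more than a byte offset frozen at compile time. It shows how
an offset chosen for one slot layout fetches an unrelated field when the
slot that arrives at runtime has the other layout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the slot callback tables (tts_cb). */
typedef struct ToyOps { const char *name; } ToyOps;
static const ToyOps toy_heap_ops = { "heap" };
static const ToyOps toy_minimal_ops = { "minimal" };

/* Two slot kinds: same leading ops pointer, but the attribute lives at
 * a different offset in each (assumed layout, for illustration only). */
typedef struct ToyHeapSlot {
    const ToyOps *ops;
    int attr;                   /* the value we want to fetch */
    int other;                  /* unrelated bookkeeping */
} ToyHeapSlot;

typedef struct ToyMinimalSlot {
    const ToyOps *ops;
    int other;
    int attr;
} ToyMinimalSlot;

/* Model of a compiled deform function: the offset is decided once. */
typedef struct ToyDeform { size_t attr_offset; } ToyDeform;

/* "Compile" a deform function for the slot kind we expect to see. */
static ToyDeform toy_compile_deform(const ToyOps *assumed_ops)
{
    ToyDeform d;

    d.attr_offset = (assumed_ops == &toy_minimal_ops)
        ? offsetof(ToyMinimalSlot, attr)
        : offsetof(ToyHeapSlot, attr);
    return d;
}

/* Run it: blindly read an int at the frozen offset. */
static int toy_run_deform(const ToyDeform *d, const void *slot)
{
    int value;

    memcpy(&value, (const char *) slot + d->attr_offset, sizeof(value));
    return value;
}

/* Compiled for a heap slot, given a heap slot: fetches attr. */
static int toy_matched_fetch(void)
{
    ToyHeapSlot scan_slot = { &toy_heap_ops, 3, -1 };
    ToyDeform d = toy_compile_deform(&toy_heap_ops);

    return toy_run_deform(&d, &scan_slot);
}

/* Compiled for a minimal slot, given a heap slot: fetches the bytes
 * that sit where a minimal slot would keep attr, i.e. heap's "other". */
static int toy_mismatched_fetch(void)
{
    ToyHeapSlot scan_slot = { &toy_heap_ops, 3, -1 };
    ToyDeform d = toy_compile_deform(&toy_minimal_ops);

    assert(scan_slot.ops != &toy_minimal_ops);  /* wrong assumption */
    return toy_run_deform(&d, &scan_slot);
}
```

Calling toy_matched_fetch() and toy_mismatched_fetch() from a main()
shows the difference: the first sees the stored attribute, the second
sees the neighbouring field, which corresponds to the junk join
attribute value in the hash join above.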

So, although we can safely use
innerPlanState(parent)->ps_ResultTupleSlot to get the tuple descriptor
for slot_compile_deform(), we should not use that same slot to decide
what kind of tuple slot the compiled function will receive. That can
be known only at runtime.

Possible fix:

I am thinking that slot_compile_deform() needs to emit instructions
that determine the slot type at runtime: add a new
FIELDNO_TUPLETABLESLOT_OPS to fetch TupleTableSlot.tts_cb, and then
compute the tuple offset accordingly. I am not sure whether this will
turn out to have a performance impact on JIT execution, or whether
such a conditional is feasible in LLVM; I am still trying to
understand that.
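
As a rough illustration of the runtime check, here is a plain C sketch.
It is not LLVM IR and not the actual patch: the Toy*/toy_* names and
TOY_OPS_OFFSET are hypothetical, with TOY_OPS_OFFSET playing the role
that the proposed FIELDNO_TUPLETABLESLOT_OPS would play. The deform
function first loads the ops pointer from the slot it is handed, and
only then picks the offset to read from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the slot callback tables (tts_cb). */
typedef struct ToyOps { const char *name; } ToyOps;
static const ToyOps toy_heap_ops = { "heap" };
static const ToyOps toy_minimal_ops = { "minimal" };

/* Assumed layouts: same leading ops pointer, attribute at a different
 * offset in each slot kind. */
typedef struct ToyHeapSlot {
    const ToyOps *ops;
    int attr;
    int other;
} ToyHeapSlot;

typedef struct ToyMinimalSlot {
    const ToyOps *ops;
    int other;
    int attr;
} ToyMinimalSlot;

/* Where the ops pointer lives in any slot (first member in both). */
#define TOY_OPS_OFFSET offsetof(ToyHeapSlot, ops)

/* Deform with the slot kind decided at runtime, not at compile time. */
static int toy_deform_runtime(const void *slot)
{
    const ToyOps *ops;
    size_t attr_offset;
    int value;

    /* Step 1: load the ops pointer from the slot actually passed in. */
    memcpy(&ops, (const char *) slot + TOY_OPS_OFFSET, sizeof(ops));

    /* Step 2: branch on it to choose the layout-specific offset. */
    if (ops == &toy_minimal_ops)
        attr_offset = offsetof(ToyMinimalSlot, attr);
    else
    {
        assert(ops == &toy_heap_ops);   /* only two kinds in this toy */
        attr_offset = offsetof(ToyHeapSlot, attr);
    }

    /* Step 3: fetch the attribute from the chosen offset. */
    memcpy(&value, (const char *) slot + attr_offset, sizeof(value));
    return value;
}

/* The same function handles a heap slot ... */
static int toy_fetch_from_heap(void)
{
    ToyHeapSlot slot = { &toy_heap_ops, 3, -1 };

    return toy_deform_runtime(&slot);
}

/* ... and a minimal slot, without knowing the kind in advance. */
static int toy_fetch_from_minimal(void)
{
    ToyMinimalSlot slot = { &toy_minimal_ops, -1, 3 };

    return toy_deform_runtime(&slot);
}
```

The cost in this toy is one extra load and one compare-and-branch per
call. Whether the equivalent generated code is cheap enough in the real
JIT path is exactly the open question above.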

Comments?

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company
