tuplestore API problem

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: tuplestore API problem
Date: 2009-03-26 16:57:36
Message-ID: 23347.1238086656@sss.pgh.pa.us
Lists: pgsql-hackers

By chance I discovered that this query in the regression tests

SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;

stops working if work_mem is small enough: it either dumps core or
delivers wrong answers depending on platform.

After some tracing I found out the reason. ExecWindowAgg() does this:

    if (!tuplestore_gettupleslot(winstate->buffer, true,
                                 winstate->ss.ss_ScanTupleSlot))
        elog(ERROR, "unexpected end of tuplestore");

and then goes off and calls the window functions (ntile() here), and
expects the ScanTupleSlot to still be valid afterwards. However,
ntile() forces us to read to the end of the input to find out the number
of rows. If work_mem is small enough, that means the tuplestore is
forced into dump-to-disk mode, which means it releases all its in-memory
tuples. And guess what: the ScanTupleSlot is pointing at one of those;
it doesn't have its own copy of the tuple. So we wind up trying to read
from a trashed bit of memory.
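
To make the failure mode concrete, here is a minimal standalone sketch of the
same pattern. It is not PostgreSQL code: ToyStore, ToySlot and every toy_*
function are hypothetical stand-ins invented for illustration. The slot
borrows a pointer from the store, the store then releases its in-memory
tuples (as when spilling to disk), and the slot is left dangling. A
generation counter is used to detect the stale pointer so that the example
never actually dereferences freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 4

/* Hypothetical stand-in for a tuplestore that starts out in memory. */
typedef struct ToyStore {
    char *tuples[TOY_MAX];  /* in-memory tuples, owned by the store */
    int   ntuples;
    int   generation;       /* bumped whenever in-memory tuples are freed */
} ToyStore;

/* Hypothetical stand-in for a slot that only borrows the tuple. */
typedef struct ToySlot {
    const char *tuple;      /* points into the store; not a private copy */
    int         generation; /* store generation when the pointer was taken */
} ToySlot;

char *toy_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char  *p = malloc(n);

    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

void toy_put(ToyStore *st, const char *data)
{
    if (st->ntuples < TOY_MAX)
        st->tuples[st->ntuples++] = toy_strdup(data);
}

/* Hands out a borrowed pointer, like a fetch that does not copy. */
void toy_get(ToyStore *st, int i, ToySlot *slot)
{
    slot->tuple = st->tuples[i];
    slot->generation = st->generation;
}

/* Models the switch to dump-to-disk mode: every in-memory tuple is freed. */
void toy_dump_to_disk(ToyStore *st)
{
    for (int i = 0; i < st->ntuples; i++) {
        free(st->tuples[i]);
        st->tuples[i] = NULL;
    }
    st->generation++;
}

/* A slot is only safe to read if the store has not freed tuples since. */
bool toy_slot_is_valid(const ToyStore *st, const ToySlot *slot)
{
    return slot->generation == st->generation;
}

/* Returns true if the slot was valid after the fetch and stale after the dump. */
bool toy_demo_dangling(void)
{
    ToyStore st = {0};
    ToySlot  slot;

    toy_put(&st, "row 1");
    toy_put(&st, "row 2");
    toy_get(&st, 0, &slot);                 /* slot borrows "row 1" */

    bool valid_before = toy_slot_is_valid(&st, &slot);

    toy_dump_to_disk(&st);                  /* e.g. a reader runs to the end */

    bool valid_after = toy_slot_is_valid(&st, &slot);

    return valid_before && !valid_after;
}
```

In the real case there is no generation check, so the stale pointer is simply
read, which is presumably why the symptom varies by platform.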

A brute-force solution is to change tuplestore_gettupleslot() so that it
always copies the tuple, but this would be wasted cycles for most uses
of tuplestores. I'm thinking of changing tuplestore_gettupleslot's API
to add a bool parameter specifying whether the caller wants to force
a copy.
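
As a rough sketch of what such a flag could look like, again in standalone toy
form: DemoStore, DemoSlot and the demo_* functions below are hypothetical and
are not the actual tuplestore or TupleTableSlot definitions. With copy = true
the slot gets its own allocation and survives the store releasing its memory;
with copy = false the caller keeps the cheap borrowed pointer and accepts the
shorter lifetime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical slot: remembers whether it owns the tuple it points at. */
typedef struct DemoSlot {
    char *tuple;
    bool  should_free;  /* true when the slot holds a private copy */
} DemoSlot;

/* Hypothetical store holding a single in-memory tuple. */
typedef struct DemoStore {
    char *current;      /* owned by the store */
} DemoStore;

char *demo_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char  *p = malloc(n);

    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Drops whatever the slot holds, freeing it only if the slot owns it. */
void demo_slot_clear(DemoSlot *slot)
{
    if (slot->should_free)
        free(slot->tuple);
    slot->tuple = NULL;
    slot->should_free = false;
}

/*
 * Fetch the current tuple into the slot. If copy is true the slot receives
 * its own allocation; otherwise it just borrows the store's pointer.
 * Returns false if there is no tuple or the copy could not be allocated.
 */
bool demo_gettupleslot(DemoStore *st, bool copy, DemoSlot *slot)
{
    demo_slot_clear(slot);
    if (st->current == NULL)
        return false;

    if (copy) {
        slot->tuple = demo_strdup(st->current);
        if (slot->tuple == NULL)
            return false;
        slot->should_free = true;
    } else {
        slot->tuple = st->current;
        slot->should_free = false;
    }
    return true;
}

/* Models the store releasing its in-memory tuple (spill to disk). */
void demo_release_memory(DemoStore *st)
{
    free(st->current);
    st->current = NULL;
}

/* Returns true if a copied tuple is still readable after the release. */
bool demo_copy_survives_release(void)
{
    DemoStore st = { demo_strdup("ten=0 four=0") };
    DemoSlot  slot = { NULL, false };

    bool ok = demo_gettupleslot(&st, true, &slot);  /* force a copy */

    demo_release_memory(&st);                       /* store frees its tuple */
    ok = ok && strcmp(slot.tuple, "ten=0 four=0") == 0;

    demo_slot_clear(&slot);
    return ok;
}
```

The point of making it a parameter is that only callers like ExecWindowAgg(),
which hold the slot across further tuplestore activity, would pay for the
copy.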

Comments, better ideas?

BTW: this tells me that no one has tried to apply window functions
to nontrivial problems yet ... we'll need to encourage beta testers
to stress that code.

regards, tom lane
