Re: Re: bulk_multi_insert infinite loops with large rows and small fill factors

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, David Gould <daveg(at)sonic(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Anand Ranganathan <arangana(at)adobe(dot)com>, Alex Eulenberg <aeulenbe(at)adobe(dot)com>, Ashokraj M <ashokraj(at)adobe(dot)com>, Hari <hari(at)adobe(dot)com>, Elein Mustain <mustain(at)adobe(dot)com>
Subject: Re: Re: bulk_multi_insert infinite loops with large rows and small fill factors
Date: 2012-12-12 11:56:08
Message-ID: 50C870D8.5040002@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12.12.2012 13:27, Andres Freund wrote:
> On 2012-12-12 03:04:19 -0800, David Gould wrote:
>> One more point, in the case where we don't insert any rows, we still do all the
>> dirtying and logging work even though we did not modify the page. I have tried
>> skip all this if no rows are added (nthispage == 0), but my access method foo
>> is sadly out of date, so someone should take a skeptical look at that.
>>
>> A test case and patch against 9.2.2 is attached. It fixes the problem and passes
>> make check. Most of the diff is just indentation changes. Whoever tries this will
>> want to test this on a small partition by itself.
>
> ISTM this would be fixed with a smaller footprint by just making
>
> if (PageGetHeapFreeSpace(page)< MAXALIGN(heaptup->t_len) + saveFreeSpace)
>
> if (!PageIsEmpty(page)&&
> PageGetHeapFreeSpace(page)< MAXALIGN(heaptup->t_len) + saveFreeSpace)
>
> I think that should work?

Yeah, seems that it should, although PageIsEmpty() is no guarantee that
the tuple fits, because even though PageIsEmpty() returns true, there
might be dead line pointers consuming so much space that the tuple at
hand doesn't fit. However, RelationGetBufferForTuple() won't return such
a page, it guarantees that the first tuple does indeed fit on the page
it returns. For the same reason, the later check that at least one tuple
was actually placed on the page is not necessary.

I committed a slightly different version, which unconditionally puts the
first tuple on the page, and only applies the freespace check to the
subsequent tuples. Since RelationGetBufferForTuple() guarantees that the
first tuple fits, we can trust that, like heap_insert does.

--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2172,8 +2172,12 @@ heap_multi_insert(Relation relation, HeapTuple
*tuples, int ntuples,
/* NO EREPORT(ERROR) from here till changes are logged */
START_CRIT_SECTION();

- /* Put as many tuples as fit on this page */
- for (nthispage = 0; ndone + nthispage < ntuples; nthispage++)
+ /*
+ * RelationGetBufferForTuple has ensured that the first tuple fits.
+ * Put that on the page, and then as many other tuples as fit.
+ */
+ RelationPutHeapTuple(relation, buffer, heaptuples[ndone]);
+ for (nthispage = 1; ndone + nthispage < ntuples; nthispage++)
{
HeapTuple heaptup = heaptuples[ndone + nthispage];

Thanks for the report!

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Gould 2012-12-12 12:17:31 Re: Re: bulk_multi_insert infinite loops with large rows and small fill factors
Previous Message Andres Freund 2012-12-12 11:27:11 Re: bulk_multi_insert infinite loops with large rows and small fill factors