Re: Fast insertion indexes: why no developments

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Leonardo Francalanci <m_lists(at)yahoo(dot)it>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast insertion indexes: why no developments
Date: 2013-11-04 20:39:07
Message-ID: CAMkU=1wWsn0Y0Tw5P1Eyjr7dwfz3s2HmDsxFbeth4H2=jnZ7Gw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 4, 2013 at 8:09 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Sat, Nov 2, 2013 at 6:07 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On 29 October 2013 16:10, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> >> On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci <m_lists(at)yahoo(dot)it>
> wrote:
> >>> I don't see much interest in insert-efficient indexes.
> >>
> >> Presumably someone will get around to implementing a btree index
> >> insertion buffer one day. I think that would be a particularly
> >> compelling optimization for us, because we could avoid ever inserting
> >> index tuples that are already dead when the deferred insertion
> >> actually occurs.
> >
> > That's pretty much what the LSM-tree is.
>
> What is pretty cool about this sort of thing is that there's no
> intrinsic reason the insertion buffer needs to be block-structured or
> disk-backed.

How do we commit to not spilling to disk, in the face of an unbounded
number of indexes existing and wanting to use this mechanism
simultaneously? If it routinely needs to spill to disk, that would
probably defeat the purpose of having it in the first place, but committing
to never doing so seems to be extremely restrictive. As you say it is also
freeing, in terms of using pointers and such, but I think the restrictions
would outweigh the freedom.

> In theory, you can structure the in-memory portion of
> the tree any way you like, using pointers and arbitrary-size memory
> allocations and all that fun stuff. You need to log that there's a
> deferred insert (or commit to flushing the insertion buffer before
> every commit, which would seem to miss the point) so that recovery can
> reconstruct the in-memory data structure and flush it, but that's it:
> the WAL format need not know any other details of the in-memory
> portion of the tree. I think that, plus the ability to use pointers
> and so forth, might lead to significant performance gains.
>
> In practice, the topology of our shared memory segment makes this a
> bit tricky. The problem isn't so much that it's fixed size as that it
> lacks a real allocator, and that all the space used for shared_buffers
> is nailed down and can't be borrowed for other purposes.

I think the fixed size is also a real problem, especially given the
ubiquitous advice not to exceed 2 to 8 GB.

Cheers,

Jeff

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kevin Grittner 2013-11-04 20:53:32 Re: [BUGS] BUG #8542: Materialized View with another column_name does not work?
Previous Message Simon Riggs 2013-11-04 20:01:59 Re: Fast insertion indexes: why no developments