Re: Fast insertion indexes: why no developments

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Leonardo Francalanci <m_lists(at)yahoo(dot)it>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast insertion indexes: why no developments
Date: 2013-11-12 00:03:31
Message-ID: CAMkU=1ynOfE6YwC3zsiCNND_toex2DRLJOggjFj1nCvy3qgTxg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 31, 2013 at 12:43 AM, Leonardo Francalanci <m_lists(at)yahoo(dot)it>wrote:

> Jeff Janes wrote
> > True, but that is also true of indexes created in bulk. It all has to
> > reach disk eventually--
> > [...]
> > If the checkpoint interval is as long as the partitioning period, then
> > hopefully the active index buffers get re-dirtied while protected in
> > shared_buffers, and only get written to disk once.
>
> Honestly, I made a lot of tests in the past, and I don't remember if I
> tried
> 15-minute checkpoints + high shared_buffers. That might work. I'm going to
> try it and see what happens.
>

You might want to go even beyond 15 minutes.

>
>
> Jeff Janes wrote
> > If the buffers get read, dirtied, and evicted from a small shared_buffers
> > over and over again
> > then you are almost guaranteed that will get written to disk multiple
> > times
>
> (as I understand, but I might be wrong):
> high shared_buffers don't help because in such a random index writing, lots
> and lots of pages get dirtied, even if the change in the page was minimal.
> So, in the "15-minute" period, you write the same pages over and over
> again.
>

But you write them only if you need to due to a checkpoint, needing new
buffers to read in something else that is not already in shared_buffers, or
because you are using a buffer-access-strategy that uses a ring. If you
make checkpoints longs, it will cut down on the first. If shared_buffers
is large enough to contain the active part of the indexes being updated,
that should cut down on the second. I don't know if the third is a problem
or not--I think copy might try to use a ring-buffer, but I don't if it does
that for indexed table.

> Even if you have high shared_buffers, the same page will get sync-ed to
> disk
> multiple times (at every checkpoint).
>

If the active part of the indexes is much larger than you can could
possibly set shared_buffers to, then there is probably little point in
increasing shared_buffers from, say, 1% of the active index size to 8% of
it. It only makes sense to increase it if you can do so large enough to
cover ~100% of the needed space.

> The idea of those "other" indexes is to avoid the random writing,
> maximizing
> the writing in sequence, even if that means writing more bytes. In other
> words: writing a full 8KB is no different than write 20 bytes in a page, as
> we'll have to sync the whole page anyway...
>

True, but that is the idea here as well. If you can delay writing the page
until 20 bytes of it have been dirtied on 400 different occasions...

I'm not saying we shouldn't think about some kind of insert buffer, but I
really doubt that that is going to happen in 9.4 while increasing
shared_buffers can be done today, if it works and if you can live with the
consequences.

Cheers,

Jeff

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2013-11-12 00:15:36 Re: pg_dump and pg_dumpall in real life
Previous Message Jeffrey Walton 2013-11-12 00:02:50 Re: Clang 3.3 Analyzer Results