Re: Fast insertion indexes: why no developments

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Leonardo Francalanci <m_lists(at)yahoo(dot)it>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast insertion indexes: why no developments
Date: 2013-10-30 18:38:00
Message-ID: 52715208.4010608@archidevsys.co.nz
Lists: pgsql-hackers

On 31/10/13 06:46, Jeff Janes wrote:
> On Wed, Oct 30, 2013 at 9:54 AM, Leonardo Francalanci
> <m_lists(at)yahoo(dot)it> wrote:
>
> Jeff Janes wrote
> > The index insertions should be fast until the size of the active part of
> > the indexes being inserted into exceeds shared_buffers by some amount
> > (what that amount is would depend on how much dirty data the kernel is
> > willing to allow in the page cache before it starts suffering anxiety
> > about it). If you have enough shared_buffers to make that last for 15
> > minutes, then you shouldn't have a problem inserting with live indexes.
>
> Sooner or later you'll have to checkpoint those shared_buffers...
>
>
> True, but that is also true of indexes created in bulk. It all has to
> reach disk eventually--either the checkpointer writes it out and
> fsyncs it, or the background writer or user backends write it out and
> the checkpoint fsyncs it. If bulk creation uses a ring buffer
> strategy (I don't know if it does), then it might kick the buffers to
> the kernel in more or less physical order, which would help the kernel
> get them to disk in long sequential writes. Or not. I think that this is
> where sorted checkpoint could really help.
>
> > and we are talking about GB of data (my understanding is that we change
> > basically every btree page, resulting in re-writing of the whole index).
>
> If the checkpoint interval is as long as the partitioning period, then
> hopefully the active index buffers get re-dirtied while protected in
> shared_buffers, and only get written to disk once. If the buffers get
> read, dirtied, and evicted from a small shared_buffers over and over
> again then you are almost guaranteed that they will get written to disk
> multiple times while they are still hot, unless your kernel is very
> aggressive about caching dirty data (which will cause other problems).
>
> Cheers,
>
> Jeff
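
Jeff's point about a small versus a large shared_buffers can be seen in a
toy model. This is only a sketch under strong assumptions: page access is
uniformly random, the cache is a plain LRU, every touched page is dirty, and
a single checkpoint at the end flushes whatever is left. It is not how the
PostgreSQL buffer manager works, and count_page_writes is a name made up for
this example.

```python
import random
from collections import OrderedDict


def count_page_writes(num_pages, cache_pages, num_inserts, seed=0):
    """Count physical page writes for random index inserts through an LRU cache.

    Toy model (assumptions, not PostgreSQL behaviour): each insert dirties one
    uniformly random page; a dirty page is written when it is evicted; one
    final checkpoint writes every page still in the cache.
    Requires cache_pages >= 1.
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # page id -> True; every cached page is dirty here
    writes = 0
    for _ in range(num_inserts):
        page = rng.randrange(num_pages)
        if page in cache:
            # Page is still cached: re-dirtying it costs no extra write.
            cache.move_to_end(page)
            continue
        if len(cache) >= cache_pages:
            # Cache is full: evict the least recently used page and write it.
            cache.popitem(last=False)
            writes += 1
        cache[page] = True
    # Final checkpoint: flush everything that is still dirty in the cache.
    writes += len(cache)
    return writes


if __name__ == "__main__":
    fits = count_page_writes(num_pages=1000, cache_pages=1000, num_inserts=50000)
    small = count_page_writes(num_pages=1000, cache_pages=100, num_inserts=50000)
    print("cache holds whole index:", fits, "page writes")
    print("cache holds 10% of index:", small, "page writes")
```

When the cache holds the whole active index, each page is written at most
once; when it holds only a tenth of it, the same pages are written many
times over, which is the case Jeff describes.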
How about being able to mark indexes as:
'MEMORY ONLY', so that they never go to disk,
and
'PERSISTENT | TRANSIENT', to say whether they should be recreated on
machine bootup?

Or something similar.
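
Purely to illustrate the idea, here is a sketch in made-up syntax. None of
these keywords exist in PostgreSQL, the table and index names are
hypothetical, and the reading of PERSISTENT as "rebuilt at bootup" and
TRANSIENT as "not rebuilt" is an assumption:

```sql
-- Hypothetical syntax only: MEMORY ONLY, PERSISTENT and TRANSIENT are not
-- PostgreSQL keywords. Table and index names are invented for illustration.

-- Kept in memory, never written to disk, rebuilt at machine bootup.
CREATE INDEX calls_number_idx ON calls (number) MEMORY ONLY PERSISTENT;

-- Kept in memory, never written to disk, not rebuilt at machine bootup.
CREATE INDEX calls_ts_idx ON calls (ts) MEMORY ONLY TRANSIENT;
```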

Cheers,
Gavin