Re: Support for REINDEX CONCURRENTLY

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Support for REINDEX CONCURRENTLY
Date: 2013-09-17 23:04:11
Message-ID: 20130917230411.GC29545@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-09-17 16:34:37 -0400, Robert Haas wrote:
> On Mon, Sep 16, 2013 at 10:38 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > Actually, the shared inval code already has that knowledge, doesn't it?
> > ISTM all we'd need is a queryable "sequence number" for SI entries.
> > Then one can simply wait till all backends have consumed up to that
> > id, which we can determine by keeping track of the furthest-back
> > backend in shmem.
>
> In theory, yes, but in practice, there are a few difficulties.

Agreed ;)

> 1. We're not in a huge hurry to ensure that sinval notifications are
> delivered in a timely fashion. We know that sinval resets are bad, so
> if a backend is getting close to needing a sinval reset, we kick it in
> an attempt to get it to AcceptInvalidationMessages(). But if the
> sinval queue isn't filling up, there's no upper bound on the amount of
> time that can pass before a particular sinval is read. Therefore, the
> amount of time that passes before an idle backend is forced to drain
> the sinval queue can vary widely, from a fraction of a second to
> minutes, hours, or days. So it's kind of unappealing to think about
> making user-visible behavior dependent on how long it ends up taking.

Well, when we're signalling, it's certainly faster than waiting for the
other backends' snapshots to vanish, which can take ages for normal
backends. And we can signal while we wait for consumption without too
many problems.
Also, I think in most of the use cases we can simply not wait for any of
the idle backends; those don't use the old definition anyway.
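
To make the proposal concrete, here is a minimal sketch of the
"sequence number plus wait for consumption" idea, written in Python
rather than PostgreSQL's C. Everything in it is an assumption for
illustration: `Backend`, `SInvalQueue`, `laggards`, and
`wait_for_consumption` are hypothetical names, not existing sinval
functions, and the `signal` callback merely stands in for kicking a
backend so that it calls AcceptInvalidationMessages().

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical per-backend state kept in shared memory."""
    name: str
    consumed: int = 0   # sequence number of the last SI entry processed
    idle: bool = False  # an idle backend is not using any old definition


class SInvalQueue:
    """Toy model of an invalidation queue with queryable sequence numbers."""

    def __init__(self):
        self.next_seq = 1
        self.backends = []

    def register(self, backend):
        # A newly attached backend has nothing older to catch up on.
        backend.consumed = self.next_seq - 1
        self.backends.append(backend)

    def send(self):
        """Append one invalidation entry and return its sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def consume_all(self, backend):
        """Model a backend draining the queue up to the newest entry."""
        backend.consumed = self.next_seq - 1

    def laggards(self, target_seq, skip_idle=True):
        """Backends that have not yet consumed up to target_seq."""
        return [
            b for b in self.backends
            if b.consumed < target_seq and not (skip_idle and b.idle)
        ]

    def wait_for_consumption(self, target_seq, signal,
                             skip_idle=True, max_rounds=100):
        """Signal lagging backends until all have consumed target_seq.

        Returns True once nobody is behind, False if max_rounds is hit.
        """
        for _ in range(max_rounds):
            behind = self.laggards(target_seq, skip_idle)
            if not behind:
                return True
            for b in behind:
                signal(b)
        return False
```

A waiter would remember the value returned by `send()` for the
invalidation it issues anyway and then call `wait_for_consumption()`
with it. Skipping idle backends by default mirrors the point above
that they don't use the old definition; a real implementation would
block or sleep between rounds instead of looping.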

> 2. Every time we add a new kind of sinval message, we increase the
> frequency of sinval resets, and those are bad. So any notifications
> that we choose to send this way had better be pretty low-volume.

In pretty much all the cases where I can see the need for something like
that, we already send sinval messages, so we should be able to
piggyback on those.

> Considering the foregoing points, it's unclear to me whether we should
> try to improve sinval incrementally or replace it with something
> completely new. I'm sure that the above-mentioned problems are
> solvable, but I'm not sure how hairy it will be. On the other hand,
> designing something new could be pretty hairy, too.

I am pretty sure there's quite a bit to improve around sinvals but I
think any replacement would look surprisingly similar to what we
have. So I think doing it incrementally is more realistic.
And I am certainly scared by the thought of having to replace it without
breaking corner cases all over.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
