Re: Changeset Extraction v7.0 (was logical changeset generation)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Changeset Extraction v7.0 (was logical changeset generation)
Date: 2014-01-16 14:34:51
Message-ID: CA+TgmoZt2DhWKWwgbqdUpKJ-tH9V=fD8k9uHZYKVokrUR=Me_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 15, 2014 at 3:39 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-01-15 13:28:25 -0500, Robert Haas wrote:
>> - I think you should just regard ReplicationSlotCtlLock as protecting
>> the "name" and "active" flags of every slot. ReplicationSlotCreate()
>> would then not need to mess with the spinlocks at all, and
>> ReplicationSlotAcquire and ReplicationSlotDrop get a bit simpler too I
>> think. Functions like ReplicationSlotsComputeRequiredXmin() can
>> acquire this lock in shared mode and only lock the slots that are
>> actually active, instead of all of them.
>
> I first thought you meant that we should get rid of the spinlock, but
> after rereading I think you just mean that ->name, ->active, ->in_use
> are only allowed to change while holding the lwlock exclusively so we
> don't need to spinlock in those cases? If so, yes, that works for me.

Yeah, that's about what I had in mind.

>> - If you address /* FIXME: apply sanity checking to slot name */, then
>> I think that also addresses /* XXX: do we want to use truncate
>> identifier instead? */. In other words, let's just error out if the
>> name is too long. I'm not sure what other sanity checking is needed
>> here; maybe just restrict it to 7-bit characters to avoid problems
>> with encodings for individual databases varying.
>
> Yea, erroring out seems like a good idea. But I think we need to
> restrict slot names a bit more than that, given they are used as
> filenames... We could instead name the files using the slot's offset,
> but I'd prefer to not go that route.

OK. Well, add some code, then. :-)

>> - ReplicationSlotAcquire probably needs to ignore slots that are not active.
> Not sure what you mean? If the slot isn't in_use we'll skip it in the loop.

active != in_use.

I suppose your point is that the slot can't be in_use if it's not also
active. Maybe it would be better to get rid of active/in_use and have
three states: REPLSLOT_CONNECTED, REPLSLOT_NOT_CONNECTED,
REPLSLOT_FREE. Or something like that.

>> - If there's a coding rule that slot->database can't be changed while
>> the slot is active, then the check to make sure that the user isn't
>> trying to bind to a slot with a mis-matching database could be done
>> before the code described in the previous point, avoiding the need to
>> go back and release the resource.
>
> I don't think slot->database should be allowed to change at all...

Well, it can if the slot is dropped and a new one created.

>> - I think the critical section in ReplicationSlotDrop is bogus. If
>> DeleteSlot() fails, we scarcely need to PANIC. The slot just isn't
>> gone.
>
> Well, if delete slot fails, we don't really know at which point it
> failed which means that the on-disk state might not correspond to the
> in-memory state. I don't see a point in adding code trying to handle
> that case correctly...

Deleting the slot should be an atomic operation. There's some
critical point before which the slot will be picked up by recovery and
after which it won't. You either did that operation, or not, and can
adjust the in-memory state accordingly.

>> - cancel_before_shmem_exit is only guaranteed to remove the
>> most-recently-added callback.
>
> Yea :(. I think that's safe for the current usages but seems mighty
> fragile... Not sure what to do about it. Just register KillSlot once and
> keep it registered?

Yep. Use a module-private flag to decide whether it needs to do anything.

>> - A comment in KillSlot wonders whether locking is required. I say
>> yes. It's safe to take lwlocks and spinlocks during shmem exit, and
>> failing to do so seems like a recipe for subtle corner-case bugs.
>
> I agree that it's safe to use spinlocks, but lwlocks? What if we are
> erroring out while holding an lwlock? Code that runs inside a
> TransactionCommand is protected against that, but if not ProcKill()
> which invokes LWLockReleaseAll() runs pretty late in the teardown
> process...

Hmm. I guess it'd be fine to decide that a connected slot can be
marked not-connected without the lock. I think you'd want a rule that
a slot can't be freed except when it's not-connected; otherwise, you
might end up marking the slot not-connected after someone else had
already recycled it for an unrelated purpose (drop slot, create new
slot).

>> - There seems to be no interface to acquire or release slots from
>> either SQL or the replication protocol, nor any way for a client of
>> this code to update its slot details.
>
> I don't think either ever needs to do that - START_TRANSACTION SLOT slot
> ...; and decoding_slot_*changes will acquire/release for them while
> active. What would the usecase be to allow them to be acquired from SQL?

My point isn't so much about SQL as that with just this patch I don't
see any way for anyone to ever acquire a slot for anything, ever. So
I think there's a piece missing, or three.

> The slot details get updates by the respective replication code. For
> streaming rep, that should happen via reply and feedback messages. For
> changeset extraction it happens when LogicalConfirmReceivedLocation() is
> called; the walsender interface does that using reply messages, the SQL
> interface calls it when finished (unless you use the _peek_ functions).

Right, but where is this code? I don't see this updating the reply
and feedback message processing code to touch slots. Did I miss that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-01-16 14:37:32 Re: Performance Improvement by reducing WAL for Update Operation
Previous Message Andres Freund 2014-01-16 14:30:53 Re: [PATCH] Relocation of tablespaces in pg_basebackup