Re: Online enabling of checksums

From: Andres Freund <andres(at)anarazel(dot)de>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Online enabling of checksums
Date: 2018-04-06 22:56:09
Message-ID: 20180406225609.4jvciiceims5xll7@alap3.anarazel.de
Lists: pgsql-hackers

On 2018-04-06 14:33:48 -0700, Andres Freund wrote:
> On 2018-04-06 02:28:17 +0200, Daniel Gustafsson wrote:
> > Looking into the isolationtester failure on piculet, which builds using
> > --disable-atomics, and locust which doesn’t have atomics, the code for
> > pg_atomic_test_set_flag seems a bit odd.
> >
> > TAS() is defined to return zero if successful, and pg_atomic_test_set_flag()
> > defined to return True if it could set. When running without atomics, don’t we
> > need to do something like the below diff to make these APIs match? :
> >
> > --- a/src/backend/port/atomics.c
> > +++ b/src/backend/port/atomics.c
> > @@ -73,7 +73,7 @@ pg_atomic_init_flag_impl(volatile pg_atomic_flag *ptr)
> > bool
> > pg_atomic_test_set_flag_impl(volatile pg_atomic_flag *ptr)
> > {
> > - return TAS((slock_t *) &ptr->sema);
> > + return TAS((slock_t *) &ptr->sema) == 0;
> > }
>
> Yes, this looks wrong.

And the reason the tests fail reliably after that fix is that the locking
model around ChecksumHelperShmem->launcher_started is arguably broken:

    /* If the launcher isn't started, there is nothing to shut down */
    if (pg_atomic_unlocked_test_flag(&ChecksumHelperShmem->launcher_started))
        return;

This uses a primitive that is not concurrency-safe, and the early return
then triggers spuriously because of this fallback:

#define PG_HAVE_ATOMIC_UNLOCKED_TEST_FLAG
static inline bool
pg_atomic_unlocked_test_flag_impl(volatile pg_atomic_flag *ptr)
{
    /*
     * Can't do this efficiently in the semaphore based implementation - we'd
     * have to try to acquire the semaphore - so always return true. That's
     * correct, because this is only an unlocked test anyway. Do this in the
     * header so compilers can optimize the test away.
     */
    return true;
}

Now, one can quibble with the rationale that this is ok (I'll
post a patch cleaning up the atomics simulation of flags in a bit), but
this is certainly not a correct locking strategy.
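
To make the two issues concrete, here is a minimal single-threaded sketch. It is not the PostgreSQL code: every toy_* name is hypothetical, and toy_tas() is a plain non-atomic stand-in that only mimics the TAS() convention of returning zero on success. It illustrates why passing the TAS() result through unchanged inverts the meaning of test_set_flag, and why an unlocked test that always returns true makes a "not started, so return" check fire every time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for pg_atomic_flag in a build without atomics. */
typedef struct
{
    int locked;
} toy_flag;

/*
 * Toy TAS(): returns 0 if the flag was clear and is now set ("acquired"),
 * nonzero if it was already set. Deliberately NOT atomic; this sketch is
 * single-threaded and only models the return convention.
 */
static int
toy_tas(toy_flag *f)
{
    int old = f->locked;

    f->locked = 1;
    return old;
}

/* Passes the TAS() result through unchanged: true means "already set". */
static bool
toy_test_set_flag_raw(toy_flag *f)
{
    return toy_tas(f);
}

/* Compares against zero: true means "we set the flag". */
static bool
toy_test_set_flag_fixed(toy_flag *f)
{
    return toy_tas(f) == 0;
}

/* Models the fallback quoted above: always reports the flag as clear. */
static bool
toy_unlocked_test_flag_fallback(toy_flag *f)
{
    (void) f;
    return true;
}

/* Same shape as the shutdown check; returns true if shutdown proceeds. */
static bool
toy_shutdown_proceeds(toy_flag *launcher_started)
{
    /* "If the launcher isn't started, there is nothing to shut down" */
    if (toy_unlocked_test_flag_fallback(launcher_started))
        return false;
    return true;
}

/* Walks through both problems on fresh flags. */
static void
toy_demo(void)
{
    toy_flag raw = {0};
    toy_flag fixed = {0};
    toy_flag started = {1};     /* pretend the launcher is running */

    assert(!toy_test_set_flag_raw(&raw));       /* set it, yet reports false */
    assert(toy_test_set_flag_fixed(&fixed));    /* set it, reports true */
    assert(!toy_test_set_flag_fixed(&fixed));   /* already set, reports false */
    assert(!toy_shutdown_proceeds(&started));   /* early return despite flag */
}
```

Calling toy_demo() exercises each case. The last assertion is the one that matters for the shutdown path: in this sketch the flag is set, yet the check still takes the early return, which is why the unlocked test cannot serve as the sole guard.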

Greetings,

Andres Freund
