removing volatile qualifiers from lwlock.c

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: removing volatile qualifiers from lwlock.c
Date: 2014-09-10 18:53:07
Message-ID: CA+TgmoYWgqtcpOkz6KRfnb8r-mwt=BhqiNGJPe6Tm_BQ=bdxMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

As discussed on the thread "Spinlocks and compiler/memory barriers",
now that we've made the spinlock primitives function as compiler
barriers (we think), it should be possible to remove volatile
qualifiers from many places in the source code. The attached patch
does this in lwlock.c. If the changes in commit
0709b7ee72e4bc71ad07b7120acd117265ab51d0 (and follow-on commits) are
correct and complete, applying this shouldn't break anything, while
possibly giving the compiler room to optimize things better than it
does today.

However, demonstrating the necessity of that commit for these changes
seems to be non-trivial. I tried applying this patch and reverting
commits 5b26278822c69dd76ef89fd50ecc7cdba9c3f035,
b4c28d1b92c81941e4fc124884e51a7c110316bf, and
0709b7ee72e4bc71ad07b7120acd117265ab51d0 on a PPC64 POWER8 box with a
whopping 192 hardware threads (thanks, IBM!). I then ran the
regression tests repeatedly, and I ran several long pgbench runs with
as many as 350 concurrent clients. No failures.

So I'm posting this patch in the hope that others can help. The
relevant tests are:

1. If you apply this patch to master and run tests of whatever kind
strikes your fancy, does anything break under high concurrency? If it
does, then the above commits weren't enough to make this safe on your
platform.

2. If you apply this patch to master, revert the commits mentioned
above, and again run tests, does anything now break? If it does (but
the first tests were OK), then that shows that those commits did
something useful on your platform.

The change to make spinlocks act as compiler barriers provides the
foundation for, hopefully, a cleaner code base and easier future work
on scalabilty and performance projects, so help is much appreciated.

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
lwlock-remove-volatile.patch text/x-patch 11.1 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: removing volatile qualifiers from lwlock.c
Date: 2014-09-17 11:45:58
Message-ID: 20140917114558.GR25887@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2014-09-10 14:53:07 -0400, Robert Haas wrote:
> As discussed on the thread "Spinlocks and compiler/memory barriers",
> now that we've made the spinlock primitives function as compiler
> barriers (we think), it should be possible to remove volatile
> qualifiers from many places in the source code. The attached patch
> does this in lwlock.c. If the changes in commit
> 0709b7ee72e4bc71ad07b7120acd117265ab51d0 (and follow-on commits) are
> correct and complete, applying this shouldn't break anything, while
> possibly giving the compiler room to optimize things better than it
> does today.
>
> However, demonstrating the necessity of that commit for these changes
> seems to be non-trivial. I tried applying this patch and reverting
> commits 5b26278822c69dd76ef89fd50ecc7cdba9c3f035,
> b4c28d1b92c81941e4fc124884e51a7c110316bf, and
> 0709b7ee72e4bc71ad07b7120acd117265ab51d0 on a PPC64 POWER8 box with a
> whopping 192 hardware threads (thanks, IBM!). I then ran the
> regression tests repeatedly, and I ran several long pgbench runs with
> as many as 350 concurrent clients. No failures.

There's actually one more commit to revert. What I used was:
git revert 5b26278822c69dd76ef89fd50ecc7cdba9c3f035 \
b4c28d1b92c81941e4fc124884e51a7c110316bf \
68e66923ff629c324e219090860dc9e0e0a6f5d6 \
0709b7ee72e4bc71ad07b7120acd117265ab51d0

> So I'm posting this patch in the hope that others can help. The
> relevant tests are:
>
> 1. If you apply this patch to master and run tests of whatever kind
> strikes your fancy, does anything break under high concurrency? If it
> does, then the above commits weren't enough to make this safe on your
> platform.
>
> 2. If you apply this patch to master, revert the commits mentioned
> above, and again run tests, does anything now break? If it does (but
> the first tests were OK), then that shows that those commits did
> something useful on your platform.

I just tried this on my normal x86 workstation. I applied your lwlock
patch and ontop I removed most volatiles (there's a couple still
required) from xlog.c. Works for 100 seconds. Then I reverted the above
commits. Breaks within seconds:
master:
LOG: request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60
standby:
LOG: record with incorrect prev-link 4/684C3108 at 4/684C3108
and similar.

So at least for x86 the compiler barriers are obviously required and
seemingly working.

I've attached the very quickly written xlog.c de-volatizing patch.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-xlog.c-remove-volatile.patch text/x-patch 30.9 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: removing volatile qualifiers from lwlock.c
Date: 2014-09-19 17:58:17
Message-ID: CA+TgmoYmrmhOwZVBqpq9YLKKExRks4EOk_aFjP1UvgD7EE_fDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Sep 17, 2014 at 7:45 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I just tried this on my normal x86 workstation. I applied your lwlock
> patch and ontop I removed most volatiles (there's a couple still
> required) from xlog.c. Works for 100 seconds. Then I reverted the above
> commits. Breaks within seconds:
> master:
> LOG: request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60
> standby:
> LOG: record with incorrect prev-link 4/684C3108 at 4/684C3108
> and similar.
>
> So at least for x86 the compiler barriers are obviously required and
> seemingly working.

Oh, that's great. In that case I think I should go ahead and apply
that patch in the hopes of turning up any systems where the barriers
aren't working properly yet. Although it would be nice to know
whether it breaks with *only* the lwlock.c patch.

> I've attached the very quickly written xlog.c de-volatizing patch.

I don't have time to go through this in detail, but I don't object to
you applying it if you're confident you've done it carefully enough.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: removing volatile qualifiers from lwlock.c
Date: 2014-09-19 18:05:14
Message-ID: 20140919180514.GC13527@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-09-19 13:58:17 -0400, Robert Haas wrote:
> On Wed, Sep 17, 2014 at 7:45 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > I just tried this on my normal x86 workstation. I applied your lwlock
> > patch and ontop I removed most volatiles (there's a couple still
> > required) from xlog.c. Works for 100 seconds. Then I reverted the above
> > commits. Breaks within seconds:
> > master:
> > LOG: request to flush past end of generated WAL; request 2/E5EC3DE0, currpos 2/E5EC1E60
> > standby:
> > LOG: record with incorrect prev-link 4/684C3108 at 4/684C3108
> > and similar.
> >
> > So at least for x86 the compiler barriers are obviously required and
> > seemingly working.
>
> Oh, that's great. In that case I think I should go ahead and apply
> that patch in the hopes of turning up any systems where the barriers
> aren't working properly yet.

Agreed.

> Although it would be nice to know whether it breaks with *only* the lwlock.c patch.

It didn't, at least not visibly within the 1000s I let pgbench run.

> > I've attached the very quickly written xlog.c de-volatizing patch.
>
> I don't have time to go through this in detail, but I don't object to
> you applying it if you're confident you've done it carefully enough.

It's definitely not yet carefully enough checked. I wanted to get it
break fast and only made one pass through the file. But I think it
should be easy enough to get it into shape for that.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services