SSI non-serializable UPDATE performance (was: getting to beta)

Lists:	pgsql-hackers

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	getting to beta
Date:	2011-04-06 13:21:18
Message-ID:	BANLkTimVE2_rTeD70dEZT10Kw0nyaYowLQ@mail.gmail.com
Lists:	pgsql-hackers

A quick review of the open items list suggests that we have three main
areas that need attention before we can declare ourselves ready for
beta.

In no particular order:

1. There are a bunch of small, outstanding SSI patches.
2. Bugs - plural - related to pg_upgrade & typed tables.
3. Assorted collation issues.

There are a couple of smaller items, too, but those are the big ones.
Per previous discussion, the viable dates for code freeze for beta1
appear to be April 14th and April 28th. If we want to hit the earlier
of those dates, which in my opinion would be a great goal to have,
then we need to get all of the above issues resolved in the next 8
days, and I think we're going to need to kick it up a notch if we want
that to happen.

Most urgently, I believe we need a bit more committer bandwidth. I
believe that I could tackle either the SSI patches or the pg_upgrade &
typed tables issue, or I could try to make a dent in the collation
stuff, but I don't think I can cover two of those areas and I
definitely can't cover all three. Especially in the area of SSI, and
to some extent as regards typed tables, the patches are written, but
we have to get them reviewed and committed. Is anyone available to
help with this?

There are also a few issues where we need a patch and don't have one.
In those cases the patches could be written by either a committer or a
non-committer, but we need to make sure we know who is doing it so
that everything gets covered. In particular:

- SSI needs a patch for the issue "SSI: three different HTABs contend
for shared memory in a free-for-all"
- typed tables needs a patch to allow an existing table to be made
into a typed table, and pg_dump --binary-upgrade needs to be made to
use that feature
- the open collation issues all lack any associated code (but maybe
Tom is planning to do this himself?)

The other minor issues are:

- do latches have memory ordering problems? I think the consensus is
that they work OK the way we're using them right now, so maybe we can
just drop this item, unless someone wants to pontificate further on
it.
- sync rep & smart shutdown - someone needs to review & apply Fujii
Masao's proposed patch
- generate_series boundary issue - I think this isn't a new regression
so it's probably not a blocker for beta1, but we might still want to
try to fix it. I seem to remember thinking that the prototype patch
looked like it needed pretty significant cleanup, but I haven't looked
at it in a while so I might be all wet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 13:42:16
Message-ID:	11500.1302097336@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> ... Most urgently, I believe we need a bit more committer bandwidth. I
> believe that I could tackle either the SSI patches or the pg_upgrade &
> typed tables issue, or I could try to make a dent in the collation
> stuff, but I don't think I can cover two of those areas and I
> definitely can't cover all three.

I intend to return to the collations issues as soon as I've knocked off
the GUC assign-hooks patch. That's taking longer than I thought (there
are a *lot* of assign hooks) but I think I'll be able to finish it today
or tomorrow. I have yet to read any of the SSI code, so I can't offer
much help in that area.

> The other minor issues are:

> - do latches have memory ordering problems? I think the consensus is
> that they work OK the way we're using them right now, so maybe we can
> just drop this item, unless someone wants to pontificate further on
> it.

I think this can be left as an open issue for now, to remind us that
some harder stress-testing on affected platforms would be a good thing.

> - generate_series boundary issue - I think this isn't a new regression
> so it's probably not a blocker for beta1, but we might still want to
> try to fix it.

Again, there's no reason that can't stay on the open items list past
beta1. We may or may not choose to fix it for 9.1, but it's not a beta
blocker.

regards, tom lane

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 14:56:48
Message-ID:	BANLkTimeW+iEa-Qyg8ieirBghu2UbetpRw@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Apr 6, 2011 at 9:42 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ... Most urgently, I believe we need a bit more committer bandwidth. I
>> believe that I could tackle either the SSI patches or the pg_upgrade &
>> typed tables issue, or I could try to make a dent in the collation
>> stuff, but I don't think I can cover two of those areas and I
>> definitely can't cover all three.
>
> I intend to return to the collations issues as soon as I've knocked off
> the GUC assign-hooks patch. That's taking longer than I thought (there
> are a *lot* of assign hooks) but I think I'll be able to finish it today
> or tomorrow. I have yet to read any of the SSI code, so I can't offer
> much help in that area.
>
>> The other minor issues are:
>
>> - do latches have memory ordering problems? I think the consensus is
>> that they work OK the way we're using them right now, so maybe we can
>> just drop this item, unless someone wants to pontificate further on
>> it.
>
> I think this can be left as an open issue for now, to remind us that
> some harder stress-testing on affected platforms would be a good thing.

OK, fair enough.

>> - generate_series boundary issue - I think this isn't a new regression
>> so it's probably not a blocker for beta1, but we might still want to
>> try to fix it.
>
> Again, there's no reason that can't stay on the open items list past
> beta1. We may or may not choose to fix it for 9.1, but it's not a beta
> blocker.

I agree. But again, that's not really what I'm focusing on - the
collations stuff, the typed tables patch, and SSI all need serious
looking at, and I'm not sure who is going to pick all that up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 15:02:21
Message-ID:	13212.1302102141@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I agree. But again, that's not really what I'm focusing on - the
> collations stuff, the typed tables patch, and SSI all need serious
> looking at, and I'm not sure who is going to pick all that up.

Well, I'll take responsibility for collations. If I get done with that
before the 14th, I can see what's up with typed tables. I'm not willing
to do anything with SSI at this stage.

regards, tom lane

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 16:06:56
Message-ID:	4D9C8FA0.1030602@enterprisedb.com
Lists:	pgsql-hackers

On 06.04.2011 18:02, Tom Lane wrote:
> Robert Haas<robertmhaas(at)gmail(dot)com> writes:
>> I agree. But again, that's not really what I'm focusing on - the
>> collations stuff, the typed tables patch, and SSI all need serious
>> looking at, and I'm not sure who is going to pick all that up.
>
> Well, I'll take responsibility for collations. If I get done with that
> before the 14th, I can see what's up with typed tables. I'm not willing
> to do anything with SSI at this stage.

I can look at the SSI patches, but not until next week, I'm afraid.
Robert, would you like to pick that up before then? Kevin & Dan have
done all the heavy lifting, but it's nevertheless pretty complicated
code to review.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 16:16:17
Message-ID:	BANLkTi=8nAPcGebuCeU15jPe+N_++8_2jA@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Apr 6, 2011 at 12:06 PM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 06.04.2011 18:02, Tom Lane wrote:
>>> I agree. But again, that's not really what I'm focusing on - the
>>> collations stuff, the typed tables patch, and SSI all need serious
>>> looking at, and I'm not sure who is going to pick all that up.
>>
>> Well, I'll take responsibility for collations. If I get done with that
>> before the 14th, I can see what's up with typed tables. I'm not willing
>> to do anything with SSI at this stage.
>
> I can look at the SSI patches, but not until next week, I'm afraid. Robert,
> would you like to pick that up before then? Kevin & Dan have done all the
> heavy lifting, but it's nevertheless pretty complicated code to review.

I'll try, and see how far I get with it. If you can pick up whatever
I don't get to by early next week, that would be a big help. I am
going to be in Santa Clara next week for the MySQL conference (don't
worry, I'll be talking about PostgreSQL!) and that's going to cut into
my time quite a bit. The one I'm most worried about is "SSI: three
different HTABs contend for shared memory in a free-for-all" - because
there's no patch for that yet, and I am wary of breaking something
mucking around with it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc:	"Dan Ports" <drkp(at)csail(dot)mit(dot)edu>,<pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: getting to beta
Date:	2011-04-06 16:27:24
Message-ID:	4D9C4E1C020000250003C3B1@gw.wicourts.gov
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

>> I can look at the SSI patches, but not until next week, I'm
>> afraid. Robert, would you like to pick that up before then? Kevin
>> & Dan have done all the heavy lifting, but it's nevertheless
>> pretty complicated code to review.
>
> I'll try, and see how far I get with it. If you can pick up
> whatever I don't get to by early next week, that would be a big
> help. I am going to be in Santa Clara next week for the MySQL
> conference (don't worry, I'll be talking about PostgreSQL!) and
> that's going to cut into my time quite a bit. The one I'm most
> worried about is "SSI: three different HTABs contend for shared
> memory in a free-for-all" - because there's no patch for that yet,
> and I am wary of breaking something mucking around with it.

I haven't seen any objection to Heikki's suggestion for how to
handle the shared memory free-for-all:

http://archives.postgresql.org/message-id/4D94C889.3050607@enterprisedb.com

Either Dan or I will put something together along those lines before
next week.

-Kevin

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	"Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Dan Ports" <drkp(at)csail(dot)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 16:46:23
Message-ID:	29933.1302108383@sss.pgh.pa.us
Lists:	pgsql-hackers

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> ... The one I'm most
>> worried about is "SSI: three different HTABs contend for shared
>> memory in a free-for-all" - because there's no patch for that yet,
>> and I am wary of breaking something mucking around with it.

> I haven't seen any objection to Heikki's suggestion for how to
> handle the shared memory free-for-all:

I confess to not having been reading the discussions about SSI very
much, but ... do we actually care whether there's a free-for-all?
What's the downside to letting the remaining shmem get claimed by
whichever table uses it first?

regards, tom lane

From:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dan Ports <drkp(at)csail(dot)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 16:57:41
Message-ID:	4D9C9B85.2070808@enterprisedb.com
Lists:	pgsql-hackers

On 06.04.2011 17:46, Tom Lane wrote:
> "Kevin Grittner"<Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> Robert Haas<robertmhaas(at)gmail(dot)com> wrote:
>>> ... The one I'm most
>>> worried about is "SSI: three different HTABs contend for shared
>>> memory in a free-for-all" - because there's no patch for that yet,
>>> and I am wary of breaking something mucking around with it.
>
>> I haven't seen any objection to Heikki's suggestion for how to
>> handle the shared memory free-for-all:
>
> I confess to not having been reading the discussions about SSI very
> much, but ... do we actually care whether there's a free-for-all?
> What's the downside to letting the remaining shmem get claimed by
> whichever table uses it first?

It leads to odd behavior. You start the database, and your application
runs fine. Then you restart the database, and now you get "out of shared
memory" errors from transactions that used to work.

It's not the end of the world, but I'd prefer stable, repeatable
behavior, even though having the slack shared memory be grabbed by
whoever needs it first might in theory lead to better utilization of
resources.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
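
The restart scenario described above can be sketched with a toy model. This is only an illustration, not PostgreSQL code: the class names, the table names, the pool size of 40 entries and the demand figures are all made up for the example. Each table reserves part of a fixed pool up front and then grows by taking slack that is never handed back, so which table sees "out of shared memory" depends only on which one hit its peak first.

```python
class SharedPool:
    """Toy fixed-size shared memory pool; units are hash table entries."""

    def __init__(self, total):
        self.free = total

    def take(self, n=1):
        if n > self.free:
            raise MemoryError("out of shared memory")
        self.free -= n


class Table:
    """Toy hash table: `prealloc` entries are reserved at startup and any
    further growth comes out of whatever slack the pool still has."""

    def __init__(self, pool, prealloc):
        self.pool = pool
        pool.take(prealloc)
        self.capacity = prealloc
        self.used = 0

    def insert(self):
        if self.used == self.capacity:
            self.pool.take(1)   # grab slack; it is never given back
            self.capacity += 1
        self.used += 1


# Hypothetical peak demand per table, identical in every run.
DEMAND = {"locks": 18, "pred_locks": 18, "pred_targets": 5}


def first_failure(order):
    """Replay the same peak demands in `order` after a fresh start and
    return the name of the first table that runs out, or None."""
    pool = SharedPool(total=40)                       # 30 reserved + 10 slack
    tables = {name: Table(pool, prealloc=10) for name in DEMAND}
    for name in order:
        try:
            for _ in range(DEMAND[name]):
                tables[name].insert()
        except MemoryError:
            return name
    return None


print(first_failure(["locks", "pred_locks", "pred_targets"]))
print(first_failure(["pred_locks", "locks", "pred_targets"]))
```

With the same total demand, the first ordering fails in pred_locks and the second fails in locks. The model ignores entries being freed and reused inside a table, which is harmless here because the point is that slack claimed at one table's peak stays with that table.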

From:	Thom Brown <thom(at)linux(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dan Ports <drkp(at)csail(dot)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 17:01:11
Message-ID:	BANLkTikbcRgumwsRxB3U1DHeF+rCBdXyog@mail.gmail.com
Lists:	pgsql-hackers

On 6 April 2011 17:57, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 06.04.2011 17:46, Tom Lane wrote:
>>
>> "Kevin Grittner"<Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>>>
>>> Robert Haas<robertmhaas(at)gmail(dot)com> wrote:
>>>>
>>>> ... The one I'm most
>>>> worried about is "SSI: three different HTABs contend for shared
>>>> memory in a free-for-all" - because there's no patch for that yet,
>>>> and I am wary of breaking something mucking around with it.
>>
>>> I haven't seen any objection to Heikki's suggestion for how to
>>> handle the shared memory free-for-all:
>>
>> I confess to not having been reading the discussions about SSI very
>> much, but ... do we actually care whether there's a free-for-all?
>> What's the downside to letting the remaining shmem get claimed by
>> whichever table uses it first?
>
> It leads to odd behavior. You start the database, and your application
> runs fine. Then you restart the database, and now you get "out of shared
> memory" errors from transactions that used to work.
>
> It's not the end of the world, but I'd prefer stable, repeatable behavior,
> even though having the slack shared memory be grabbed by whoever needs it
> first might in theory lead to better utilization of resources.

It sounds a bit apocalyptic to me, if that really is happening.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dan Ports <drkp(at)csail(dot)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 17:08:52
Message-ID:	5382.1302109732@sss.pgh.pa.us
Lists:	pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> On 06.04.2011 17:46, Tom Lane wrote:
>> I confess to not having been reading the discussions about SSI very
>> much, but ... do we actually care whether there's a free-for-all?
>> What's the downside to letting the remaining shmem get claimed by
>> whichever table uses it first?

> It leads to odd behavior. You start the database, and your application
> runs fine. Then you restart the database, and now you get "out of shared
> memory" errors from transactions that used to work.

If you get "out of shared memory" at all due to SSI, I'd say that that's
the problem, not exactly when it happens. I thought that the patch
included provisions for falling back to coarser-grained locks whenever
it was short of resources.

regards, tom lane

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Dan Ports" <drkp(at)csail(dot)mit(dot)edu>, "Robert Haas" <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: getting to beta
Date:	2011-04-06 17:25:26
Message-ID:	4D9C5BB6020000250003C3B9@gw.wicourts.gov
Lists:	pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> If you get "out of shared memory" at all due to SSI, I'd say that
> that's the problem, not exactly when it happens. I thought that
> the patch included provisions for falling back to coarser-grained
> locks whenever it was short of resources.

When one of the tests was getting out of memory errors we were
initially having trouble telling where the memory was actually
consumed, because it wasn't necessarily due to the type of object
being allocated at the point of failure. That was the motivation
for my attempt to log when an HTAB grew past its "maximum". The
problem turned out to be a field which wasn't properly initialized
in certain corner cases, making the cleanup phase fail to clear them
when appropriate. There is a patch to fix that bug, but the issue
raised in the early phase of investigation is what, if anything, we
should do about the free-for-all allocation.

If we want to call that a feature and take it off the 9.1 list,
that's OK with me. It's a new issue with 9.1 in the sense that
there used to be only one HTAB which could grab the slack space, and
only generate its out of memory error once that slack space was
exhausted. Now that there are three, things are a bit less
predictable.

By the way, the problem with SSI potentially running out of shared
memory is rather parallel to how heavyweight locks can run out of
shared memory. The SLRU prevents the number of transactions from
being limited in that way, and multiple locks per table escalate
granularity, but with a strange enough workload (for example,
accessing hundreds of tables per transaction) one might need to
boost max_pred_locks_per_transaction above the default to avoid
shared memory exhaustion.

-Kevin
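
As a rough aid for the tuning point above, the arithmetic can be sketched as follows. It assumes the sizing rule given in the PostgreSQL documentation for max_pred_locks_per_transaction, namely that the shared predicate lock table tracks max_pred_locks_per_transaction * (max_connections + max_prepared_transactions) objects. The function names and the workload numbers are hypothetical, and the sketch ignores granularity promotion, so treat the result as a starting point rather than a recommendation.

```python
def pred_lock_budget(max_pred_locks_per_transaction,
                     max_connections,
                     max_prepared_transactions=0):
    """Number of distinct objects the shared predicate lock table is sized
    to track, per the documented formula (an assumption of this sketch)."""
    return max_pred_locks_per_transaction * (
        max_connections + max_prepared_transactions)


def min_setting_for(objects_needed,
                    max_connections,
                    max_prepared_transactions=0):
    """Smallest max_pred_locks_per_transaction whose budget covers
    `objects_needed` distinct locked objects (ceiling division)."""
    slots = max_connections + max_prepared_transactions
    return -(-objects_needed // slots)


# Illustrative values only: 64 locks per transaction, 100 connections.
print(pred_lock_budget(64, 100))

# Hypothetical workload: 20 concurrent transactions touching 500 tables each.
print(min_setting_for(20 * 500, 100))
```

The budget is shared across all sessions, so a few transactions that each touch hundreds of tables can use it up even when most sessions are idle.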

From:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 19:27:55
Message-ID:	20110406192755.GB33037@csail.mit.edu
Lists:	pgsql-hackers

On Wed, Apr 06, 2011 at 12:25:26PM -0500, Kevin Grittner wrote:
> By the way, the problem with SSI potentially running out of shared
> memory is rather parallel to how heavyweight locks can run out of
> shared memory. The SLRU prevents the number of transactions from
> being limited in that way, and multiple locks per table escalate
> granularity, but with a strange enough workload (for example,
> accessing hundreds of tables per transaction) one might need to
> boost max_pred_locks_per_transaction above the default to avoid
> shared memory exhaustion.

In fact, it's exactly the same: if a backend wants to acquire many
heavyweight locks, it doesn't stop at max_locks_per_xact, it just
keeps allocating them until shmem is exhausted.

So it's possible, if less likely, to have the same problem with regular
locks causing the system to run out of shared memory. Which sounds to
me like a good reason to address both problems in one place.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-06 20:58:34
Message-ID:	BANLkTimcQtdVkvLG8BTSG+i5YHMzXcRYfg@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Apr 6, 2011 at 3:27 PM, Dan Ports <drkp(at)csail(dot)mit(dot)edu> wrote:
> On Wed, Apr 06, 2011 at 12:25:26PM -0500, Kevin Grittner wrote:
>> By the way, the problem with SSI potentially running out of shared
>> memory is rather parallel to how heavyweight locks can run out of
>> shared memory. The SLRU prevents the number of transactions from
>> being limited in that way, and multiple locks per table escalate
>> granularity, but with a strange enough workload (for example,
>> accessing hundreds of tables per transaction) one might need to
>> boost max_pred_locks_per_transaction above the default to avoid
>> shared memory exhaustion.
>
> In fact, it's exactly the same: if a backend wants to acquire many
> heavyweight locks, it doesn't stop at max_locks_per_xact, it just
> keeps allocating them until shmem is exhausted.
>
> So it's possible, if less likely, to have the same problem with regular
> locks causing the system to run out of shared memory. Which sounds to
> me like a good reason to address both problems in one place.

The real fix for this problem is probably to have the ability to
actually return memory to the shared pool, rather than having everyone
grab as they need it until there's no more and never give back. But
that's not going to happen in 9.1, so the question is whether this is
a sufficiently serious problem that we ought to impose the proposed
stopgap fix between now and whenever we do that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Dan Ports" <drkp(at)csail(dot)mit(dot)edu>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc:	"Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, <pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: getting to beta
Date:	2011-04-06 22:32:15
Message-ID:	4D9CA39F020000250003C48C@gw.wicourts.gov
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> The real fix for this problem is probably to have the ability to
> actually return memory to the shared pool, rather than having
> everyone grab as they need it until there's no more and never give
> back. But that's not going to happen in 9.1, so the question is
> whether this is a sufficiently serious problem that we ought to
> impose the proposed stopgap fix between now and whenever we do
> that.

There is a middle course between leaving the current approach of
preallocating half the maximum size and leaving the other half up
for grabs and the course Heikki proposes of making the maximum a
hard limit. I submitted a patch to preallocate the maximum, so a
request for a particular HTAB object will never get "out of shared
memory" unless it is past its maximum:

http://archives.postgresql.org/message-id/4D948066020000250003C00B@gw.wicourts.gov

That would leave some extra which is factored into the calculations
up for grabs, but each table would be guaranteed at least its
maximum number of entries. This seems pretty safe to me, and not
very invasive. We could always revisit this in 9.2 if that's not
good enough.

-Kevin
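
The three options in this message (preallocate half, preallocate the maximum, or treat the maximum as a hard limit) can be compared with a small model. This is a sketch under stated assumptions, not the behavior of any actual patch: the policy labels "half", "max" and "hard", the table names, and the numbers are invented, and demands are served in arrival order with slack never returned.

```python
def grant(policy, maxima, slack, demands):
    """Return {table: entries obtained} when `demands` (a list of
    (table, wanted) pairs) arrive in order under the given policy.

    policy "half": reserve half of each maximum, the rest is up for grabs
    policy "max":  reserve each full maximum, only the slack is up for grabs
    policy "hard": reserve each full maximum and never grow past it
    """
    total = sum(maxima.values()) + slack
    if policy == "half":
        reserved = {name: m // 2 for name, m in maxima.items()}
    else:
        reserved = dict(maxima)
    shared = total - sum(reserved.values())

    granted = {}
    for name, want in demands:
        got = min(want, reserved[name])
        if policy != "hard":
            extra = min(want - got, shared)   # first come, first served
            shared -= extra
            got += extra
        granted[name] = got
    return granted


maxima = {"locks": 100, "pred_locks": 100, "pred_targets": 100}
demands = [("pred_locks", 250), ("locks", 100), ("pred_targets", 100)]

for policy in ("half", "max", "hard"):
    print(policy, grant(policy, maxima, 30, demands))
```

In this example a burst on pred_locks under "half" leaves locks with only 50 entries, half of its nominal maximum. Under "max" every table still reaches 100 and only the 30 entries of slack are contested, and under "hard" no table exceeds 100 at all.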

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: getting to beta
Date:	2011-04-07 03:52:16
Message-ID:	BANLkTimVuicyZG4j3F427BgfA2iYP8Od_Q@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Apr 6, 2011 at 6:32 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> The real fix for this problem is probably to have the ability to
>> actually return memory to the shared pool, rather than having
>> everyone grab as they need it until there's no more and never give
>> back. But that's not going to happen in 9.1, so the question is
>> whether this is a sufficiently serious problem that we ought to
>> impose the proposed stopgap fix between now and whenever we do
>> that.
>
> There is a middle course between leaving the current approach of
> preallocating half the maximum size and leaving the other half up
> for grabs and the course Heikki proposes of making the maximum a
> hard limit. I submitted a patch to preallocate the maximum, so a
> request for a particular HTAB object will never get "out of shared
> memory" unless it is past its maximum:
>
> http://archives.postgresql.org/message-id/4D948066020000250003C00B@gw.wicourts.gov
>
> That would leave some extra which is factored into the calculations
> up for grabs, but each table would be guaranteed at least its
> maximum number of entries. This seems pretty safe to me, and not
> very invasive. We could always revisit this in 9.2 if that's not
> good enough.

OK, I agree. We certainly can't have a temporary demand for predicate
locks starve out heavyweight locks for the rest of the postmaster
lifetime, or vice versa. So we need to do at least that much.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-08 21:12:59
Message-ID:	BANLkTike_uCidznehYyiO+Pkv_Sx73Y9yA@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Apr 6, 2011 at 12:16 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Apr 6, 2011 at 12:06 PM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> On 06.04.2011 18:02, Tom Lane wrote:
>>>> I agree. But again, that's not really what I'm focusing on - the
>>>> collations stuff, the typed tables patch, and SSI all need serious
>>>> looking at, and I'm not sure who is going to pick all that up.
>>>
>>> Well, I'll take responsibility for collations. If I get done with that
>>> before the 14th, I can see what's up with typed tables. I'm not willing
>>> to do anything with SSI at this stage.
>>
>> I can look at the SSI patches, but not until next week, I'm afraid. Robert,
>> would you like to pick that up before then? Kevin & Dan have done all the
>> heavy lifting, but it's nevertheless pretty complicated code to review.
>
> I'll try, and see how far I get with it. If you can pick up whatever
> I don't get to by early next week, that would be a big help. I am
> going to be in Santa Clara next week for the MySQL conference (don't
> worry, I'll be talking about PostgreSQL!) and that's going to cut into
> my time quite a bit.

I think I've cleared out most of the small stuff. The two SSI related
issues still on the open items list are:

* SSI: failure to clean up some SLRU-summarized locks
* SSI: three different HTABs contend for shared memory in a free-for-all

If you can pick those two up, that would be very helpful; I suspect
you can work your way through them faster and with fewer mistakes than
I would be able to manage.

The other two items are:

* Typed-tables patch broke pg_upgrade
* assorted collation issues

Tom said he'd take care of the collation issues. Peter Eisentraut,
Noah Misch, and I have been exchanging emails on the typed tables
problems, of which there appear to be several, but it's not real clear
to me that we're converging on a comprehensive solution.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc:	<pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: getting to beta
Date:	2011-04-08 21:54:51
Message-ID:	4D9F3DDB020000250003C5C6@gw.wicourts.gov
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I think I've cleared out most of the small stuff.

Thanks!

> The two SSI related issues still on the open items list are:
>
> * SSI: failure to clean up some SLRU-summarized locks

This one is very important. Not only could it lead to unnecessary
false positive serialization failures, but (more importantly) it
leaks shared memory by not clearing some locks, leading to potential
"out of shared memory" errors.

While this isn't as small as most of the SSI patches, I'm going to
point out (to reassure those who haven't been reading the patches)
that this one modifies two lines, adds six Assert statements which
Dan found useful in debugging the issue, and adds (if you ignore
white space and braces) four lines of code. "Big" is a relative
term here. The problem is that the code in which these tweaks fall
is hard to get one's head around.

> * SSI: three different HTABs contend for shared memory in a
> free-for-all

I think we're pretty much agreed that something should be done about
this, but the main issue here is that if either heavyweight locks or
SIRead predicate locks exhaust memory, the other might be unlucky
enough to get the error, making it harder to identify the cause.
Without the above bug or an unusual workload, it would tend not to
make a difference.

If things come down to the wire and this is the only thing holding
up the beta release, I'd suggest going ahead and cutting the beta.

-Kevin

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-19 22:18:23
Message-ID:	BANLkTikjtOD1ow30vDSBF5-XY4LYpVAE5Q@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Apr 6, 2011 at 9:21 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> A quick review of the open items list suggests that we have three main
> areas that need attention before we can declare ourselves ready for
> beta.
>
> In no particular order:
>
> 1. There are a bunch of small, outstanding SSI patches.
> 2. Bugs - plural - related to pg_upgrade & typed tables.
> 3. Assorted collation issues.

Since we're targeting code freeze for beta1 for approximately now + 1
week, it's probably about time to take stock of where we are.

1. All of the SSI patches have been dealt with.
2. The typed tables stuff vs. pg_upgrade still needs work. I would be
just as happy if Tom or Peter wanted to fix this, mostly for fear of
getting flak over the details of the fixes, but if not I will do it.
3. The collation issues that have been discussed on-list have, I
*think*, mostly been dealt with. But maybe there are some broken
things that haven't been discussed yet?

New things:

- There is an outstanding bug-fix patch for PL/python tracebacks,
proof that no patch is too small to require multiple rounds of bug
fixing.
- There are some minor infelicities in the handling of permissions for
foreign tables. Since I committed a chunk of that stuff, I think it
probably falls to me to clean this up, unless someone else wants to
volunteer.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-19 23:03:27
Message-ID:	11619.1303254207@sss.pgh.pa.us
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Since we're targeting code freeze for beta1 for approximately now + 1
> week, it's probably about time to take stock of where we are.

> 3. The collation issues that have been discussed on-list have, I
> *think*, mostly been dealt with. But maybe there are some broken
> things that haven't been discussed yet?

I have no open items for collations right now, but feel a need to
re-read the original patch in toto before signing off on it.
I'll try to get that done in the next day or two.

BTW, I'm not sure if this was mentioned on-list previously, but
we are thinking of wrapping the beta the evening of Wednesday 27th,
not Thursday 28th as the traditional release scheduling would have it.
(It seems our British contingent is planning to take the Friday off
for some wedding or other, so there's no hope of getting Windows
installers built on-time otherwise.) So that's one less day than
you might have been thinking. I see no reason we can't make it
though. It's past time to get this puppy out the door.

regards, tom lane

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-20 00:20:46
Message-ID:	BANLkTinZejqir-6qMP0OY9MsDipuc-bGdw@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Apr 19, 2011 at 7:03 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Since we're targeting code freeze for beta1 for approximately now + 1
>> week, it's probably about time to take stock of where we are.
>
>> 3. The collation issues that have been discussed on-list have, I
>> *think*, mostly been dealt with. But maybe there are some broken
>> things that haven't been discussed yet?
>
> I have no open items for collations right now, but feel a need to
> re-read the original patch in toto before signing off on it.
> I'll try to get that done in the next day or two.
>
>
> BTW, I'm not sure if this was mentioned on-list previously, but
> we are thinking of wrapping the beta the evening of Wednesday 27th,
> not Thursday 28th as the traditional release scheduling would have it.
> (It seems our British contingent is planning to take the Friday off
> for some wedding or other, so there's no hope of getting Windows
> installers built on-time otherwise.) So that's one less day than
> you might have been thinking. I see no reason we can't make it
> though. It's past time to get this puppy out the door.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-21 15:38:21
Message-ID:	1303400301.9126.10.camel@vanquo.pezone.net
Lists:	pgsql-hackers

On Tue, 2011-04-19 at 18:18 -0400, Robert Haas wrote:
> 2. The typed tables stuff vs. pg_upgrade still needs work. I would be
> just as happy if Tom or Peter wanted to fix this, mostly for fear of
> getting flak over the details of the fixes, but if not I will do it.

Noah Misch is hot on the trail of that one.

> - There is an outstanding bug-fix patch for PL/python tracebacks,

That has been addressed.

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-21 15:46:30
Message-ID:	BANLkTinPNQ7SFQ6EJGyUKBBeKBQZOjj3uQ@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Apr 21, 2011 at 11:38 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On Tue, 2011-04-19 at 18:18 -0400, Robert Haas wrote:
>> 2. The typed tables stuff vs. pg_upgrade still needs work. I would be
>> just as happy if Tom or Peter wanted to fix this, mostly for fear of
>> getting flak over the details of the fixes, but if not I will do it.
>
> Noah Misch is hot on the trail of that one.

Yes, but inasmuch as he is not a committer, someone who is will need
to be involved. I dealt with the prerequisite ALTER TABLE .. OF/NOT
OF patch last night, but the related pg_dump patch that actually fixes
the problem still needs to be looked at, and the earliest (and
probably only) time that I can potentially do that is Monday. So it
would be great if you or someone else could pick it up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To:	"Robert Haas" <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Cc:	"Dan Ports" <drkp(at)csail(dot)mit(dot)edu>
Subject:	Re: getting to beta
Date:	2011-04-21 16:32:03
Message-ID:	4DB015B3020000250003CB51@gw.wicourts.gov
Lists:	pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> 1. All of the SSI patches have been dealt with.

I'll add the non-serializable UPDATE performance issue. Dan has
been benchmarking to try to find a worst case; I don't want to speak
for him too much, but as he was headed off to lecture a class he
sent me results so far, and with beta so close I figure I should
pass along a rough outline. The worst case he has been able to
construct so far was running 32 active processes on a 16 processor
machine in an update-mostly mix against a database on tmpfs (so no
disk writes) on a dataset which fits inside shared_buffers. This was
able to generate enough contention on an exclusive LW lock to cause
a 0.7% slowdown.

Speaking for myself, I believe we'll be able to provide a very small
patch to eliminate this. Probably today or tomorrow. While in a
less extreme runtime environment it would probably be hard to pick
out a performance impact in the normal noise, I expect the fix to be
small and safe enough to be worth doing.

I do feel that it would be good to apply the one-line fix Heikki
posted, which is orthogonal and needed in any event. That would
give a little time for others to easily test it before beta.

-Kevin

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc:	pgsql-hackers(at)postgresql(dot)org, Dan Ports <drkp(at)csail(dot)mit(dot)edu>
Subject:	Re: getting to beta
Date:	2011-04-21 16:41:05
Message-ID:	BANLkTi=nYmmCA9e0n_sk3a6OvKNGzEDq-A@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Apr 21, 2011 at 12:32 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> 1. All of the SSI patches have been dealt with.
>
> I'll add the non-serializable UPDATE performance issue. Dan has
> been benchmarking to try to find a worst case; I don't want to speak
> for him too much, but as he was headed off to lecture a class he
> sent me results so far, and with beta so close I figure I should
> pass along a rough outline. The worst case he has been able to
> construct so far was running 32 active processes on a 16 processor
> machine in an update-mostly mix against a database on tmpfs (so no
> disk writes) on a dataset which fits inside shared_buffers. This was
> able to generate enough contention on an exclusive LW lock to cause
> a 0.7% slowdown.
>
> Speaking for myself, I believe we'll be able to provide a very small
> patch to eliminate this. Probably today or tomorrow. While in a
> less extreme runtime environment it would probably be hard to pick
> out a performance impact in the normal noise, I expect the fix to be
> small and safe enough to be worth doing.
>
> I do feel that it would be good to apply the one-line fix Heikki
> posted, which is orthogonal and needed in any event. That would
> give a little time for others to easily test it before beta.

Please add that patch to the open items list if it is not there already.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From:	Dan Ports <drkp(at)csail(dot)mit(dot)edu>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	SSI non-serializable UPDATE performance (was: getting to beta)
Date:	2011-04-22 22:07:34
Message-ID:	20110422220734.GG57793@csail.mit.edu
Lists:	pgsql-hackers

For background, the issue here is that there are three SSI calls that
get invoked even on non-serializable transactions:
- PredicateLockPageSplit and PredicateLockPageCombine, during B-tree
  page splits and combines
- PredicateLockTupleRowVersionLink, from heap_update

These have to update any matching SIREAD locks to match the new lock
target. If there aren't any serializable transactions, there won't be
any such locks, but the code still has to check, and that requires
taking an LWLock. Every other SSI function checks XactIsoLevel and
bails out immediately if the transaction is non-serializable.

Like Kevin said, I tested this by removing these three calls and
comparing under what I see as worst-case conditions. I used pgbench, an
update-mostly workload, in read committed mode. The database (scale
factor 100) fit in shared_buffers and was backed by tmpfs so disk
accesses didn't enter the picture anywhere. I ran it on a 16-core
machine to stress lock contention.

Even under these conditions I couldn't reliably see a slowdown. My
latest batch of results (16 backends, median of three 10-minute runs)
shows a difference well below 1%. In a couple of cases I saw the code
with the SSI checks running faster than with them removed, so this
difference seems to be in the noise.

Given that result, and considering it's a pretty extreme condition, it
probably isn't worth worrying about this too much, but...

There's a quick fix: we might as well bail out of these functions early
if there are no serializable transactions running. Kevin points out we
can do this by checking if PredXact->SxactGlobalXmin is invalid. I
would add that we can do this safely without taking any locks, even on
weak-memory-ordering machines. Even if a new serializable transaction
starts concurrently, we have the appropriate buffer page locked, so
it's not able to take any relevant SIREAD locks.
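
A minimal sketch of that early exit, written as standalone C so it
compiles outside the tree. Only PredXact->SxactGlobalXmin and the
TransactionIdIsValid test come from the discussion above; the trimmed
struct, the lock counter, and both function names are stand-ins made
up for illustration, and this is not the actual predicate.c code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the PostgreSQL definitions. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define TransactionIdIsValid(xid) ((xid) != InvalidTransactionId)

/* Hypothetical, trimmed-down version of the shared SSI control struct. */
typedef struct PredXactListData
{
    /* Invalid while no serializable transaction is active. */
    TransactionId SxactGlobalXmin;
} PredXactListData;

static PredXactListData PredXactData = { InvalidTransactionId };
static PredXactListData *PredXact = &PredXactData;

/* Counts how often the (pretend) exclusive LWLock would be taken. */
static int lock_acquisitions = 0;

/* Slow path: stands in for taking the lock and moving SIREAD locks. */
static void
transfer_siread_locks(void)
{
    /* Only reached when some serializable transaction exists. */
    assert(TransactionIdIsValid(PredXact->SxactGlobalXmin));
    lock_acquisitions++;
    /* ... look up SIREAD locks on the old target, move them ... */
}

/*
 * Hypothetical shape of a page-split hook with the quick fix applied.
 * Returns true if the slow path ran, false if it bailed out early.
 */
static bool
predicate_lock_page_split_sketch(void)
{
    /* Unlocked read: no serializable transactions, nothing to move. */
    if (!TransactionIdIsValid(PredXact->SxactGlobalXmin))
        return false;

    transfer_siread_locks();
    return true;
}
```

The point of the sketch is only that the check comes before any lock
is taken, so a read committed workload never touches the LWLock.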

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: getting to beta
Date:	2011-04-25 23:34:27
Message-ID:	BANLkTinTF_zxXE0T5dygAbsPVyKYwXO7Jw@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Apr 21, 2011 at 11:46 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Apr 21, 2011 at 11:38 AM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> On Tue, 2011-04-19 at 18:18 -0400, Robert Haas wrote:
>>> 2. The typed tables stuff vs. pg_upgrade still needs work. I would be
>>> just as happy if Tom or Peter wanted to fix this, mostly for fear of
>>> getting flak over the details of the fixes, but if not I will do it.
>>
>> Noah Misch is hot on the trail of that one.
>
> Yes, but inasmuch as he is not a committer, someone who is will need
> to be involved. I dealt with the prerequisite ALTER TABLE .. OF/NOT
> OF patch last night, but the related pg_dump patch that actually fixes
> the problem still needs to be looked at, and the earliest (and
> probably only) time that I can potentially do that is Monday. So it
> would be great if you or someone else could pick it up.

Well, I addressed most of the remaining open items today, but not this
one. Hopefully, someone else can pick it up, because I'm leaving end
of day tomorrow for a week's vacation in Germany.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company