Re: Serializable Snapshot Isolation

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: <drkp(at)csail(dot)mit(dot)edu>,<pgsql-hackers(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Serializable Snapshot Isolation
Date: 2010-09-24 16:17:55
Message-ID: 4C9C88E30200002500035CF0@gw.wicourts.gov
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

> My aim is still to put an upper bound on the amount of shared
> memory required, regardless of the number of committed but still
> interesting transactions.

> That maps nicely to a SLRU table

Well, that didn't take as long to get my head around as I feared.

I think SLRU would totally tank performance if used for this, and
would really not put much of a cap on the memory taken out of
circulation for purposes of caching. Transactions are not
referenced more heavily at the front of the list nor are they
necessarily discarded more or less in order of acquisition. In
transaction mixes where all transactions last about the same length
of time, the upper limit of interesting transactions is about twice
the number of active transactions, so memory demands are pretty
light. The problems come in where you have at least one long-lived
transaction and a lot of concurrent short-lived transactions. Since
all transactions are scanned for cleanup every time a transaction
completes, either they would all be taking up cache space or
performance would drop to completely abysmal levels as it pounded
disk. So SLRU in this case would be a sneaky way of allocating
shared memory dynamically, but about two orders of magnitude
slower, at best.
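
To make the access pattern concrete, here is a rough sketch (invented
type and helper names, not actual patch code; TransactionId and
TransactionIdPrecedes() are the only real PostgreSQL names) of the
kind of cleanup scan that runs at every transaction completion:

typedef struct SerXact
{
    TransactionId   finishedBefore;  /* no xact older than this cares */
    struct SerXact *next;
} SerXact;

static void
CleanUpFinishedXacts(SerXact **head, TransactionId globalXmin)
{
    SerXact **prev = head;
    SerXact  *cur;

    while ((cur = *prev) != NULL)
    {
        if (TransactionIdPrecedes(cur->finishedBefore, globalXmin))
        {
            *prev = cur->next;      /* unlink: nobody cares anymore */
            ReleaseSerXact(cur);    /* hypothetical free routine */
        }
        else
            prev = &cur->next;      /* still interesting; keep going */
    }
}

Because every entry is visited on every completion, there is no cold
tail for SLRU to evict; the whole list is hot all the time.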

Here are the things which I think might be done, in some
combination, to address your concern without killing performance:

(1) Mitigate memory demand through more aggressive cleanup. As an
example, a transaction which is READ ONLY (or which hasn't written
to a relevant table as tracked by a flag in the transaction
structure) is not of interest after commit, and can be immediately
cleaned up, unless there is an overlapping non-read-only transaction
which also overlaps a committed transaction that wrote data. This is
clearly not a solution to your concern in itself, but it combines
with the other suggestions to make them more effective.
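
In code, the commit-time decision might look something like this
sketch (the flag and the helpers are invented names, not anything in
the patch, and it assumes a "flags" field in the transaction
structure):

static void
ReleaseAtCommit(SerXact *sxact)
{
    bool    wrote = (sxact->flags & SXACT_FLAG_DID_WRITE) != 0;

    /*
     * A READ ONLY transaction, or one which never wrote a relevant
     * table, stops being interesting at commit -- unless some
     * overlapping non-read-only transaction also overlaps a
     * committed writer, in which case it must be kept around to
     * detect the dangerous structure.
     */
    if (!wrote && !OverlapsCommittedWriter(sxact))
        ReleaseSerXact(sxact);          /* clean up immediately */
    else
        AppendToFinishedList(sxact);    /* keep for later checks */
}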

(2) Similar to SLRU, allocate pages from shared buffers for lists,
but pin them in memory without ever writing them to disk. A buffer
could be freed when the last list item in it was freed and the
buffer count for the list was above some minimum. This could deal
with the episodic need for larger than typical amounts of RAM
without permanently taking large quantities out of circulation.
Obviously, we would still need some absolute cap, so this by itself
doesn't answer your concern, either -- it just allows the impact to
scale to the need dynamically and within bounds. It has the same
effective impact on memory usage as SLRU for this application
without the same performance penalty.
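
The buffer handling for (2) might be roughly this shape (all names
here are hypothetical, and the real buffer manager details are
glossed over):

static ListPage *
AllocListPage(ConflictList *list)
{
    Buffer  buf = GrabFreeSharedBuffer();   /* hypothetical */

    PinListBuffer(buf);         /* pinned: never written to disk */
    list->npages++;
    return (ListPage *) ListBufferGetPage(buf);
}

static void
MaybeFreeListPage(ConflictList *list, ListPage *page)
{
    /*
     * Free a page when its last item is released, but keep a small
     * floor of pages so a typical workload never churns buffers.
     */
    if (page->nitems == 0 && list->npages > MIN_LIST_PAGES)
    {
        UnpinAndFreeListBuffer(page);       /* hypothetical */
        list->npages--;
    }
}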

(3) Here's the meat of it. When the lists hit their maximum, have
some way to gracefully degrade the accuracy of the conflict
tracking. This is similar to your initial suggestion that once a
transaction committed we would not track it in detail, but
implemented "at need" when memory resources for tracking the detail
become exhausted. I haven't worked out all the details, but I have
a rough outline in my head. I wanted to run this set of ideas past
you before I put the work in to fully develop it. This would be an
alternative to just canceling the oldest running serializable
transaction, which is the solution we could use right now to live
within some set limit, possibly with (1) or (2) to help push back
the point at which that's necessary. Rather than deterministically
canceling the oldest active transaction, it would increase the
probability of transactions being canceled because of false
positives, with the chance we'd get through the peak without any
such cancellations.
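
The allocation path for (3) might be shaped something like this
sketch (entirely hypothetical names; the summary representation is
the part I still need to design):

static SerXact *
AcquireTrackingSlot(void)
{
    SerXact *sxact = TryAllocSerXact();     /* hypothetical */

    if (sxact == NULL)
    {
        /*
         * Out of detailed-tracking slots: fold the oldest committed
         * transaction's conflict data into a coarse summary.  Checks
         * against the summary assume a conflict when unsure, so this
         * trades a higher false-positive cancellation rate for
         * staying within the memory cap.
         */
        SummarizeOldestCommitted();         /* hypothetical */
        sxact = TryAllocSerXact();          /* should succeed now */
    }
    return sxact;
}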

Thoughts?

-Kevin
