From: | Greg Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov> |
Cc: | "<Markus Wanner" <markus(at)bluegap(dot)ch>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: User-facing aspects of serializable transactions |
Date: | 2009-06-01 23:46:08 |
Message-ID: | 4136ffa0906011646n2ab749bdk7a9a316b2692725a@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Jun 1, 2009 at 11:07 PM, Kevin Grittner
<Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:
> Greg Stark <stark(at)enterprisedb(dot)com> wrote:
>
>> No, I'm not. I'm questioning whether a serializable transaction
>> isolation level that makes no guarantee that it won't fire
>> spuriously is useful.
>
> Well, the technique I'm advocating virtually guarantees that there
> will be false positives, since it looks only for the "dangerous
> structure" of two adjacent read-write dependences rather than building
> a rigorous read-write dependency graph for every serializable
> transaction. Even if you use very fine-grained locks (i.e., what
> *columns* were modified in what rows) and had totally accurate
> predicate locking, you would still get spurious rollbacks with this
> technique.
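
To make the false-positive point concrete, here is a minimal sketch of the "dangerous structure" test next to an exact cycle check. It is a toy model, not the proposed implementation: the edge list, `find_pivots`, and `has_cycle` are hypothetical names, and only read-write dependencies are modeled (a real dependency graph would also carry write-read and write-write edges).

```python
# Toy model (assumption): each edge (reader, writer) is a read-write dependency
# between concurrent transactions, meaning `reader` read data `writer` then changed.

def find_pivots(rw_edges):
    """Return transactions with both an incoming and an outgoing rw-dependency."""
    has_incoming = {writer for _, writer in rw_edges}
    has_outgoing = {reader for reader, _ in rw_edges}
    return has_incoming & has_outgoing


def has_cycle(rw_edges):
    """Exact check: does the graph built from these edges contain a cycle?"""
    graph = {}
    for src, dst in rw_edges:
        graph.setdefault(src, set()).add(dst)

    def reachable(start, target, seen):
        for nxt in graph.get(start, ()):
            if nxt == target:
                return True
            if nxt not in seen and reachable(nxt, target, seen | {nxt}):
                return True
        return False

    return any(reachable(node, node, {node}) for node in graph)


# T1 -> T2 -> T3 is a chain, not a cycle, yet T2 is still flagged as a pivot.
chain = [("T1", "T2"), ("T2", "T3")]
# T1 -> T2 -> T1 is a genuine cycle, and both checks agree on it.
cycle = [("T1", "T2"), ("T2", "T1")]

print(find_pivots(chain), has_cycle(chain))
print(find_pivots(cycle), has_cycle(cycle))
```

In this model the chain case is the spurious rollback: the cheap pivot test fires even though no cycle exists among the modeled edges.
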
Yeah, I'm ok compromising on things like having updates on other
columns or even no-op updates trigger serialization failures. For one
thing they do currently, but more importantly from my point of view
they can be explained in documentation and make sense from a user's
point of view.

More generally any time you have a set of transactions that are
touching and selecting from the same set of records, I think it's
obvious to a user that a serialization failure might be possible.

I'm not happy having things like "where x = 5 and y = 5" randomly
choose either to lock all records in one or the other index range (or
the whole table) when only the intersection is really interesting to
the plan. That leaves a careful programmer no way to tell which of his
transactions might conflict.

And I'm *really* unhappy with having the decision on which range to
lock depend on the planner's decision. That means at some point
(inevitably in the middle of the night) the database will suddenly
start getting serialization failures on transactions that never did
before (inevitably critical batch jobs) because the planner switched
plans.
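
The plan-dependence concern can be sketched with a toy model. Everything here is an assumption made for illustration: the three-row table, the plan names, and the idea that a plan's read lock covers exactly the rows in the index range it scans. It is not how any existing PostgreSQL release behaves.

```python
# Toy table (assumption): three rows, queried with WHERE x = 5 AND y = 5.
ROWS = [
    {"id": 1, "x": 5, "y": 5},
    {"id": 2, "x": 5, "y": 7},
    {"id": 3, "x": 9, "y": 5},
]


def locked_ids(plan):
    """Row ids covered by the read lock under each hypothetical plan."""
    covers = {
        "index_on_x": lambda r: r["x"] == 5,
        "index_on_y": lambda r: r["y"] == 5,
        "intersection": lambda r: r["x"] == 5 and r["y"] == 5,
    }[plan]
    return {r["id"] for r in ROWS if covers(r)}


def conflicts(plan, written_id):
    """Would a concurrent write to `written_id` hit the reader's lock?"""
    return written_id in locked_ids(plan)


# A concurrent update to row 2 (x = 5, y = 7) never matches the predicate,
# but whether it conflicts depends on which plan was chosen.
for plan in ("index_on_x", "index_on_y", "intersection"):
    print(plan, conflicts(plan, written_id=2))
```

The same pair of transactions conflicts under one plan and not under another, which is the behavior a programmer reading only the SQL could not predict.
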
> In spite of that, I believe that it will run faster than traditional
> serializable transactions, and in one benchmark it ran faster than
> snapshot isolation -- apparently because it rolled back conflicting
> transactions before they did updates and hit the update conflict
> detection phase.
"I can get the answer infinitely fast if it doesn't have to be right"

I know a serialization failure isn't a fatal error and the application
has to be prepared to retry. And I agree that some compromises are
reasonable: "serialization failure" doesn't have to mean "the database
ran a theorem prover and proved that it was impossible to serialize
these transactions". But I think a programmer has to be able to look
at the set of transactions and say "yeah I can see these transactions
all depend on the same records".
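
For reference, the retry obligation mentioned above usually looks something like the following sketch. `SerializationFailure` is a hypothetical stand-in for whatever exception a client driver raises for SQLSTATE 40001, and `run_with_retry` and `flaky_transfer` are made-up names; a real application would rerun the whole transaction, from its first statement, inside the callable.

```python
class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""


def run_with_retry(transaction, max_attempts=5):
    """Call `transaction` until it succeeds or the attempts are used up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and let the caller see the failure


# Simulated transaction that fails twice with a serialization error, then commits.
attempts = []


def flaky_transfer():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"
```

Calling `run_with_retry(flaky_transfer)` returns `"committed"` on the third attempt. The loop handles the failure once it happens, but it says nothing about which transactions are likely to need it, which is the predictability question raised here.
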
>> Postgres doesn't take block level locks or table level locks to do
>> row-level operations. You can write code and know that it's safe
>> from deadlocks.
>
> Who's talking about deadlocks? If you're speaking more broadly of all
> serialization failures, you can certainly get them in PostgreSQL. So
> one of us is not understanding the other here. To clarify what I'm
> talking about -- this technique introduces no blocking and cannot
> cause a deadlock.
Sorry, I meant to type a second paragraph there to draw the analogy.
Just as SQL code can be carefully written to avoid deadlocks, I
would expect to be able to look at SQL code and know it's safe from
serialization failures, or at least know where they might occur.
--
greg