Re: Table Lock issue

Lists: pgsql-general
From: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Table Lock issue
Date: 2004-04-15 18:35:28
Message-ID: 200404151135.28579.uwe@oss4u.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

maybe my mind is stuck, but here's something strange.
This is the classic "counter" thing, where you can't / won't use sequences.
Basically I need to assemble an identifier like

AAA-000012

where AAA- is the prefix and the number is sequencially counted up.

The table is

CREATE TABLE ib_counter (
name character varying(64) NOT NULL,
prefix character varying(64) NOT NULL,
last_value integer NOT NULL,
display_length integer DEFAULT 0,
renewal_prefix character varying(64),
renewal_extension boolean,
display_initially boolean,
renewal_start integer
);

I use a stored proc to get the next identifier:

CREATE OR REPLACE FUNCTION ib_nextval (character varying) RETURNS character
varying
AS 'DECLARE
countername ALIAS FOR $1;
cprefix varchar;
counter integer;
dlen integer;
complete varchar;
format varchar;

BEGIN
LOCK TABLE ib_counter IN ACCESS EXCLUSIVE MODE;
UPDATE ib_counter SET last_value=last_value+1 WHERE name=countername;
SELECT INTO cprefix,counter,dlen prefix,last_value,display_length FROM
ib_counter WHERE name=countername;
format := ''FM'';
FOR i IN 1..dlen LOOP
format := format || ''0'';
END LOOP;
complete := cprefix || to_char(counter,format);
RAISE NOTICE ''result is %,%,%,%'',complete,cprefix,counter,dlen;
RETURN complete;
END;

Here the point. I can create duplicate identifiers. The stored proc is called
within a transaction like (pseudocode)

Begin
value=ib_nextval('mycounter')
do something with value
commit

My assumption would be that if I do an exclusive lock on the table I can't do
the update or a second exclusive lock, so the stored proc should block (or
fail).
Obviously it doesn't work that way, since as said I get duplicates.

Any ideas anyone ?

Thanks

UC

- --
Open Source Solutions 4U, LLC 2570 Fleetwood Drive
Phone: +1 650 872 2425 San Bruno, CA 94066
Cell: +1 650 302 2405 United States
Fax: +1 650 872 2417
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFAftXwjqGXBvRToM4RAiZEAKDRlceKo84vzQZ82iT4R45+gYPamgCfbQYT
9cqaTBxsn1aiPni9+X4j1MM=
=2tXJ
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Table Lock issue
Date: 2004-04-15 23:20:14
Message-ID: 22140.1082071214@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

"Uwe C. Schroeder" <uwe(at)oss4u(dot)com> writes:
> I use a stored proc to get the next identifier:

> CREATE OR REPLACE FUNCTION ib_nextval (character varying) RETURNS character
> varying
> ...
> BEGIN
> LOCK TABLE ib_counter IN ACCESS EXCLUSIVE MODE;
> UPDATE ib_counter SET last_value=last_value+1 WHERE name=countername;
> SELECT INTO cprefix,counter,dlen prefix,last_value,display_length FROM ib_counter WHERE name=countername;

> My assumption would be that if I do an exclusive lock on the table I
> can't do the update or a second exclusive lock, so the stored proc
> should block (or fail).

It does block, and it does do the update correctly (at least if you're
not doing this in serializable mode). The problem is that the SELECT
doesn't get the right result. The SELECT actually sees two row versions
as being valid: the one you just created by UPDATE, and whichever one
was current when the outer transaction started. But SELECT INTO will
return at most one row, so it's roll-of-the-dice which one you get.
You can avoid this by attaching FOR UPDATE to the SELECT.

There have been discussions about this effect in the past (try searching
the pghackers archives for mentions of SetQuerySnapshot). In this
particular example it definitely seems like a bug, but if we fix it by
performing SetQuerySnapshot between statements of a plpgsql function,
we may break existing applications that aren't expecting that to happen.
So far there's not been a consensus to change the behavior.

BTW, I'd lose the LOCK if I were you; it doesn't do anything for you
except prevent concurrent updates of different counters. The row lock
obtained by the UPDATE is sufficient.

regards, tom lane


From: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Table Lock issue
Date: 2004-04-16 00:44:28
Message-ID: 200404151744.28837.uwe@oss4u.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday 15 April 2004 04:20 pm, Tom Lane wrote:
> "Uwe C. Schroeder" <uwe(at)oss4u(dot)com> writes:
> > I use a stored proc to get the next identifier:
> >
> > CREATE OR REPLACE FUNCTION ib_nextval (character varying) RETURNS
> > character varying
> > ...
> > BEGIN
> > LOCK TABLE ib_counter IN ACCESS EXCLUSIVE MODE;
> > UPDATE ib_counter SET last_value=last_value+1 WHERE
> > name=countername; SELECT INTO cprefix,counter,dlen
> > prefix,last_value,display_length FROM ib_counter WHERE name=countername;
> >
> > My assumption would be that if I do an exclusive lock on the table I
> > can't do the update or a second exclusive lock, so the stored proc
> > should block (or fail).
>
> It does block, and it does do the update correctly (at least if you're
> not doing this in serializable mode). The problem is that the SELECT
> doesn't get the right result. The SELECT actually sees two row versions
> as being valid: the one you just created by UPDATE, and whichever one
> was current when the outer transaction started.

One question to "was current when the outer transaction started". Does that
mean that if I have a long running transaction, all selects inside that
transaction will only see what was committed as of the start of that
transaction ? So if I do a "update xxx set ...." outside of the transaction,
nothing inside the transaction will ever see that change, although it's
committed ?
This is "read committed" isolation level, where I would expect the selects
inside the transaction see anything that is committed, not what WAS committed
at the start of the transaction.

> But SELECT INTO will
> return at most one row, so it's roll-of-the-dice which one you get.
> You can avoid this by attaching FOR UPDATE to the SELECT.
>
> There have been discussions about this effect in the past (try searching
> the pghackers archives for mentions of SetQuerySnapshot). In this
> particular example it definitely seems like a bug, but if we fix it by
> performing SetQuerySnapshot between statements of a plpgsql function,
> we may break existing applications that aren't expecting that to happen.
> So far there's not been a consensus to change the behavior.
>
> BTW, I'd lose the LOCK if I were you; it doesn't do anything for you
> except prevent concurrent updates of different counters. The row lock
> obtained by the UPDATE is sufficient.
>
> regards, tom lane

- --
UC

- --
Open Source Solutions 4U, LLC 2570 Fleetwood Drive
Phone: +1 650 872 2425 San Bruno, CA 94066
Cell: +1 650 302 2405 United States
Fax: +1 650 872 2417
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFAfyxsjqGXBvRToM4RAs0pAJ0cwAE/BdrLL/lq3Y2jBnmnW7rMFwCg0mXN
6EXDA/UH1kBRdnz0sm+NgSE=
=hT3X
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Table Lock issue
Date: 2004-04-16 04:17:32
Message-ID: 24436.1082089052@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

"Uwe C. Schroeder" <uwe(at)oss4u(dot)com> writes:
> On Thursday 15 April 2004 04:20 pm, Tom Lane wrote:
>> ... The problem is that the SELECT
>> doesn't get the right result. The SELECT actually sees two row versions
>> as being valid: the one you just created by UPDATE, and whichever one
>> was current when the outer transaction started.

> One question to "was current when the outer transaction started".

I knew I wasn't going to get away with that explanation ;-). Okay,
here's the long version. Let's imagine this situation: initially
there's one row with last_value 1. Transaction A comes and does

begin;
select ib_nextval('mycounter');

but doesn't commit yet. Now there are two rows: one with last_value 1,
which is marked as created by some past committed transaction, and as
deleted by transaction A which is as yet uncommitted. There is also
a row with last_value 2, which is marked as created by transaction A and
deleted by nobody.

Now transaction B comes and does

select ib_nextval('mycounter');

It's going to block until A commits --- in your version, it blocks at
the LOCK-table-exclusively command, in mine at the UPDATE because the
UPDATE sees the row lock on the updated row. But in either case it
waits. Once xact A commits, B proceeds to mark the row with last_value
2 as deleted by itself, and creates a row with last_value 3, created by
itself and deleted by nobody. (This row will have last_value 3, not
something less, because the UPDATE will use the latest available
committed row as the starting point for its "last_value+1" computation.)

Now we come to the tricky part: transaction B does its SELECT. Which of
the three available rows will it consider valid? Because this SELECT is
inside a function, and we don't advance the QuerySnapshot inside a
function, the SELECT will be applying MVCC rules with respect to a
snapshot that was taken when the outer "select ib_nextval()" began ---
in other words, before transaction A committed. So the validity checks
stack up like this:

* original row with last_value 1: created by a long-since-committed
transaction, and deleted by a transaction (xact A) that had not
committed at the time of the snapshot. Ergo, good.

* second row with last_value 2: created by a not-yet-committed xact (A)
and deleted by my own transaction. Loses on either count; not good.

* third row with last_value 3: created by my own transaction and deleted
by nobody. Ergo, good.

So both last_value 1 and last_value 3 are visible to the SELECT, and
it's a crapshoot which will come up first in SELECT INTO.

If we were to advance the QuerySnaphot between statements of a plpgsql
function, the problem would go away because the SELECT would see
transaction A as already committed, making the original row not-good.

Now in this situation it is good to recognize the effects of other
transactions between statements of a plpgsql function, but it's not hard
to think up cases in which plpgsql functions would break if the visible
database state changes between statements. So it's a bit of a tough
choice what to do. I'm personally starting to think that we *should*
advance the QuerySnapshot, but as I said there's not yet a consensus
about it.

Oh, one other point: SELECT FOR UPDATE fixes this because it has
different visibility rules. Like UPDATE, it will *never* consider good
a row version that is marked as deleted by any committed transaction.

regards, tom lane


From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Table Lock issue
Date: 2004-04-16 17:32:42
Message-ID: 1082136762.23008.878.camel@camel
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Fri, 2004-04-16 at 00:17, Tom Lane wrote:
> "Uwe C. Schroeder" <uwe(at)oss4u(dot)com> writes:
> > On Thursday 15 April 2004 04:20 pm, Tom Lane wrote:
<snip most excellent summary of how this problem manifests itself>
>
> If we were to advance the QuerySnaphot between statements of a plpgsql
> function, the problem would go away because the SELECT would see
> transaction A as already committed, making the original row not-good.
>
> Now in this situation it is good to recognize the effects of other
> transactions between statements of a plpgsql function, but it's not hard
> to think up cases in which plpgsql functions would break if the visible
> database state changes between statements. So it's a bit of a tough
> choice what to do. I'm personally starting to think that we *should*
> advance the QuerySnapshot, but as I said there's not yet a consensus
> about it.
>

The problem is that, while the people for changing this behavior keep
stacking up, theres no way to quantify how many people it would cause
trouble for... really we're only going on the theory that it could cause
trouble for people; i don't recall anyone posting a real world example
that requires the current semantics.

> Oh, one other point: SELECT FOR UPDATE fixes this because it has
> different visibility rules. Like UPDATE, it will *never* consider good
> a row version that is marked as deleted by any committed transaction.
>

The proposal to update the query snapshot inside plpgsql whenever a lock
table in exclusive access is issued follows along this line and would
seem like one way to help most people get around this problem since it's
hard to imagine any real world scenario where one would want to lock a
table exclusively and still see rows that are modified by other
transactions.

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Table Lock issue
Date: 2004-04-16 17:52:07
Message-ID: 1204.1082137927@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> writes:
> On Fri, 2004-04-16 at 00:17, Tom Lane wrote:
>> Now in this situation it is good to recognize the effects of other
>> transactions between statements of a plpgsql function, but it's not hard
>> to think up cases in which plpgsql functions would break if the visible
>> database state changes between statements. So it's a bit of a tough
>> choice what to do. I'm personally starting to think that we *should*
>> advance the QuerySnapshot, but as I said there's not yet a consensus
>> about it.

> The problem is that, while the people for changing this behavior keep
> stacking up, theres no way to quantify how many people it would cause
> trouble for... really we're only going on the theory that it could cause
> trouble for people; i don't recall anyone posting a real world example
> that requires the current semantics.

Well, that's because the people who need the current behavior (if any)
haven't had reason to complain ;-). In general however it seems like
one would expect a series of statements issued inside a plpgsql function
to behave the same as if they were issued interactively (inside a
transaction block of course). Right now that is true for SERIALIZABLE
mode but not true for READ COMMITTED mode. So I'm starting to lean to
the idea that we should change it and document it as a potential
incompatibility.

>> Oh, one other point: SELECT FOR UPDATE fixes this because it has
>> different visibility rules. Like UPDATE, it will *never* consider good
>> a row version that is marked as deleted by any committed transaction.

> The proposal to update the query snapshot inside plpgsql whenever a lock
> table in exclusive access is issued follows along this line and would
> seem like one way to help most people get around this problem since it's
> hard to imagine any real world scenario where one would want to lock a
> table exclusively and still see rows that are modified by other
> transactions.

Hm, I don't recall having heard that proposal in this context, and I
can't say that I like it. In the first place we don't really want to be
encouraging people to lock tables exclusively, and in the second place
I'm not sure plpgsql can detect all such cases (what if the LOCK is
issued inside another function, or via EXECUTE?). In the third place
it would create yet another subtle discrepancy in execution behavior
between functions and interactive commands, which seems like the wrong
direction to be going in.

regards, tom lane