Re: Unique index: update error

Lists: pgsql-hackers
From: "Golden Liu" <goldenliu(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Unique index: update error
Date: 2006-09-15 01:16:33
Message-ID: 2d3034200609141816p1555f556ve0bc2633691a4a0a@mail.gmail.com

Suppose there are two tuples in a table t:
id
---
1
2

and there is a unique index on id. Now we do an update on table t
update t set id=id+1

Since PG executes the update one tuple at a time, it updates tuple "1"
to "2" and inserts it into the index. Before inserting into the index, it
checks whether the id is still unique. It's not: the old tuple "2" is
still in the table, so an error is raised.

I try to solve this problem this way:
First, update the table t but DON'T update the index.
Next, find all the tuples updated by this command and insert them into
the unique index.
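The two steps above can be sketched as a toy model. This is plain Python,
not PostgreSQL internals; the table is a list and the unique index a set,
names invented for illustration. It contrasts the tuple-at-a-time check
(which fails on id=id+1) with the update-all-then-reindex approach:

```python
def naive_update(table, f):
    """Tuple-at-a-time: apply each update and recheck uniqueness immediately."""
    index = set(table)
    for i, old in enumerate(table):
        index.discard(old)
        new = f(old)
        if new in index:            # the not-yet-updated tuple is still there
            raise ValueError("duplicate key %r" % new)
        index.add(new)
        table[i] = new
    return table

def two_phase_update(table, f):
    """Proposed fix: update the heap first, insert index entries afterwards."""
    table[:] = [f(old) for old in table]   # phase 1: update all tuples
    index = set()
    for v in table:                        # phase 2: rebuild unique index entries
        if v in index:
            raise ValueError("duplicate key %r" % v)
        index.add(v)
    return table
```

With rows [1, 2] and f = id+1, naive_update raises on the very first row
(new value 2 collides with the untouched old row), while two_phase_update
returns [2, 3] cleanly.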

By doing so, the problem seems to be solved. My modifications focus
on the function "ExecutePlan". Here is my patch for PG8.1.4.
Tuplestore is used to record all the tuples being updated by this
command.

Is there any problem with it? Thanks.

Index: backend/executor/execMain.c
===================================================================
RCS file: /home/gdliu/cvsroot/postgresql/src/backend/executor/execMain.c,v
retrieving revision 1.1.1.1
diff -c -r1.1.1.1 execMain.c
*** backend/executor/execMain.c 5 Sep 2006 07:19:08 -0000 1.1.1.1
--- backend/executor/execMain.c 7 Sep 2006 08:28:34 -0000
***************
*** 1068,1073 ****
--- 1068,1092 ----
long current_tuple_count;
TupleTableSlot *result;

+
+ MemoryContext holdCtidContext = NULL;
+ Tuplestorestate *holdCtidStore = NULL;
+ MemoryContext oldcxt = NULL;
+ if(operation == CMD_UPDATE) {
+ holdCtidContext =
+ AllocSetContextCreate(CurrentMemoryContext,
+ "HoldUpdateCTIDContext",
+ ALLOCSET_DEFAULT_MINSIZE,
+ ALLOCSET_DEFAULT_INITSIZE,
+ ALLOCSET_DEFAULT_MAXSIZE);
+
+ oldcxt = MemoryContextSwitchTo(holdCtidContext);
+ holdCtidStore = tuplestore_begin_heap(false, false, work_mem);
+ MemoryContextSwitchTo(oldcxt);
+ }
+
/*
* initialize local variables
*/
***************
*** 1287,1293 ****
--- 1306,1319 ----
break;

case CMD_UPDATE:
ExecUpdate(slot, tupleid, estate);
+
+ oldcxt = MemoryContextSwitchTo(holdCtidContext);
+ slot->tts_tuple->t_data->t_ctid = slot->tts_tuple->t_self;
+ tuplestore_puttuple(holdCtidStore,slot->tts_tuple);
+ MemoryContextSwitchTo(oldcxt);
result = NULL;
break;

***************
*** 1308,1313 ****
--- 1334,1372 ----
break;
}

+
+ /* insert index entries */
+ if(operation == CMD_UPDATE &&
+ estate->es_result_relation_info->ri_NumIndices > 0) {
+ HeapTuple tuple = NULL;
+ bool should_free = false;
+
+ oldcxt = MemoryContextSwitchTo(holdCtidContext);
+
+ tuplestore_rescan(holdCtidStore);
+ for(;;) {
+ tuple = (HeapTuple)tuplestore_gettuple(holdCtidStore,true,&should_free);
+ if(!tuple)
+ break;
+ tuple->t_self = tuple->t_data->t_ctid;
+ ItemPointerSetInvalid(&(tuple->t_data->t_ctid));
+ if(!ItemPointerIsValid(&(tuple->t_self))) {
+ elog(ERROR, "Insert Index: ctid is invalid.");
+ }
+ ExecStoreTuple(tuple, slot, InvalidBuffer, false);
+ ExecInsertIndexTuples(slot, &(tuple->t_self), estate, false);
+ if(should_free)
+ pfree(tuple);
+ }
+ tuplestore_end(holdCtidStore);
+ MemoryContextSwitchTo(oldcxt);
+ MemoryContextDelete(holdCtidContext);
+ }
/*
* Process AFTER EACH STATEMENT triggers
*/


From: Jim Nasby <jimn(at)enterprisedb(dot)com>
To: Golden Liu <goldenliu(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unique index: update error
Date: 2006-09-18 03:35:12
Message-ID: C41306B6-3B93-4344-B556-9B029C99A910@enterprisedb.com
Lists: pgsql-hackers

On Sep 14, 2006, at 9:16 PM, Golden Liu wrote:
> Suppose there are two tuples in a table t:
> id
> ---
> 1
> 2
>
> and there is a unique index on id. Now we do an update on table t
> update t set id=id+1
>
> Since PG executes the update one tuple at a time, it updates tuple "1"
> to "2" and inserts it into the index. Before inserting into the index, it
> checks whether the id is still unique. It's not: the old tuple "2" is
> still in the table, so an error is raised.
>
> I try to solve this problem this way:
> First, update the table t but DON'T update the index.
> Next, find all the tuples updated by this command and insert them into
> the unique index.

Isn't that what a deferred constraint normally does?

I suspect that your change adds a non-trivial overhead, which means
we don't want it to be the normal case.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <jimn(at)enterprisedb(dot)com>
Cc: Golden Liu <goldenliu(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unique index: update error
Date: 2006-09-18 04:14:14
Message-ID: 15231.1158552854@sss.pgh.pa.us
Lists: pgsql-hackers

Jim Nasby <jimn(at)enterprisedb(dot)com> writes:
> On Sep 14, 2006, at 9:16 PM, Golden Liu wrote:
>> I try to solve this problem this way:
>> First, update the table t but DON'T update the index.
>> Next, find all the tuples updated by this command and insert them into
>> the unique index.

> I suspect that your change adds a non-trivial overhead, which means
> we don't want it to be the normal case.

There's a bigger problem:

begin;
update tab set col1 = ... where unique_key = ...;
update tab set col2 = ... where unique_key = ...;
commit;

If the first update doesn't insert index entries into unique_key's
index, then the second update won't find the tuples it needs to update
(unless we hack the planner to not trust the index as valid ... and
then it'd fall back on a seqscan, which is hardly acceptable anyway).

The scheme that I've thought about involves inserting index entries as
usual, but instead of having the aminsert code error out immediately
upon finding a duplicate, have it make an entry in a list of things
that need to be rechecked before commit. This wins as long as potential
conflicts are uncommon. Performance could suck if the list gets too
large --- but we have more or less the same hazard now for foreign-key
checks, and it mostly works well enough. (In fact, maybe the existing
deferred trigger event list is the thing to use for the deferred
conflict rechecks.)
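A rough model of that scheme (a toy Python sketch, not PostgreSQL code;
the UniqueIndex class and its aminsert/amdelete/commit methods are names
invented for illustration, loosely echoing the index access method API):

```python
class UniqueIndex:
    """Toy unique index that defers duplicate errors to commit time."""
    def __init__(self):
        self.entries = {}    # key -> count of index entries for that key
        self.recheck = []    # keys queued as potential duplicates

    def aminsert(self, key):
        # Insert as usual; on an apparent duplicate, queue a recheck
        # instead of raising immediately.
        if key in self.entries:
            self.recheck.append(key)
        self.entries[key] = self.entries.get(key, 0) + 1

    def amdelete(self, key):
        # An old tuple version's entry goes away.
        self.entries[key] -= 1
        if self.entries[key] == 0:
            del self.entries[key]

    def commit(self):
        # Recheck queued conflicts: only keys that are STILL duplicated
        # at commit time raise an error.
        for key in self.recheck:
            if self.entries.get(key, 0) > 1:
                raise ValueError("duplicate key %r" % key)
        self.recheck.clear()

# Modeling: update t set id = id+1 on rows {1, 2}
idx = UniqueIndex()
idx.aminsert(1); idx.aminsert(2)   # initial index state
idx.aminsert(2)                    # row 1 -> 2: conflict queued, not raised
idx.amdelete(1)
idx.aminsert(3)                    # row 2 -> 3
idx.amdelete(2)                    # old version of "2" is gone by commit
idx.commit()                       # recheck passes: no live duplicates remain
```

The update succeeds because the transient duplicate has resolved itself by
commit; a genuinely conflicting insert would still fail, just later.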

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Golden Liu <goldenliu(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unique index: update error
Date: 2006-09-18 12:30:28
Message-ID: 1158582628.2696.60.camel@holly
Lists: pgsql-hackers

On Fri, 2006-09-15 at 09:16 +0800, Golden Liu wrote:
> this problem

I'm sorry but I don't see any problem. Why would you want to issue that
kind of SQL statement?

Assuming you really do, why not just DELETE/re-INSERT ?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: "Golden Liu" <goldenliu(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unique index: update error
Date: 2006-09-19 01:33:15
Message-ID: 2d3034200609181833u448c1a76ib332ea09089d9993@mail.gmail.com
Lists: pgsql-hackers

On 9/18/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jim Nasby <jimn(at)enterprisedb(dot)com> writes:
> > On Sep 14, 2006, at 9:16 PM, Golden Liu wrote:
> >> I try to solve this problem this way:
> >> First, update the table t but DON'T update the index.
> >> Next, find all the tuples updated by this command and insert them into
> >> the unique index.
>
> > I suspect that your change adds a non-trivial overhead, which means
> > we don't want it to be the normal case.
>
> There's a bigger problem:
>
> begin;
> update tab set col1 = ... where unique_key = ...;
> update tab set col2 = ... where unique_key = ...;
> commit;
>
> If the first update doesn't insert index entries into unique_key's
> index, then the second update won't find the tuples it needs to update
> (unless we hack the planner to not trust the index as valid ... and
> then it'd fall back on a seqscan, which is hardly acceptable anyway).
>
The first update DOES insert index entries into unique_key's index. In
fact, index entries will be inserted after each command, not each
transaction. Is this right? Or should we insert index entries after each
transaction?

> The scheme that I've thought about involves inserting index entries as
> usual, but instead of having the aminsert code error out immediately
> upon finding a duplicate, have it make an entry in a list of things
> that need to be rechecked before commit. This wins as long as potential
> conflicts are uncommon. Performance could suck if the list gets too
> large --- but we have more or less the same hazard now for foreign-key
> checks, and it mostly works well enough. (In fact, maybe the existing
> deferred trigger event list is the thing to use for the deferred
> conflict rechecks.)
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Golden Liu" <goldenliu(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unique index: update error
Date: 2006-09-19 02:55:59
Message-ID: 27000.1158634559@sss.pgh.pa.us
Lists: pgsql-hackers

"Golden Liu" <goldenliu(at)gmail(dot)com> writes:
> ... The first update DOES insert index entries into unique_key's index.

Right, but weren't you proposing to make it not do so?

regards, tom lane