Re: WIP: Deferrable unique constraints

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)googlemail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Deferrable unique constraints
Date: 2009-07-08 09:17:24
Message-ID: 4A546424.9010404@googlemail.com
Lists: pgsql-hackers

Jeff Davis wrote:
> On Tue, 2009-07-07 at 19:38 +0100, Dean Rasheed wrote:
>> This approach works well if the number of potential conflicts is
>> small.
>
> [...]
>
>> Curing the scalability problem by spooling the queue to disk shouldn't
>> be too hard to do, but that doesn't address the problem that if a
>> significant proportion of rows from the table need to be checked, it
>> is far quicker to scan the whole index once than check row by row.
>
> Another approach that might be worth considering is to build a temporary
> index and try to merge them at constraint-checking time. That might work
> well for unique.
>

I'm not really sure what you mean by a "temporary index". Do you mean
one that you would just throw away at the end of the statement? That
seems a bit heavy-weight.

Also it seems too specific to unique constraints. I think it would be
better to cure the scalability issues for all constraints and triggers
in one place, in the after triggers queue code.

I had hoped that after doing deferrable unique constraints, I might
apply a similar approach to other constraints, e.g. a deferrable check
constraint. In that case, an index doesn't help, and there is no choice
but to check the rows one at a time.

Unique (and also FK) constraints are special, in that they have potentially
more efficient ways of checking them in bulk. ISTM that this is an orthogonal
concept to the issue of making the trigger queue scalable, except that
there ought to be an efficient way of discarding all the queued entries
for a particular constraint, if we decide to check it en masse (perhaps
a separate queue per constraint, or per trigger).

- Dean

> However, there are some potential issues. I didn't think this through
> yet, but here is a quick list just to get some thoughts down:
>
> 1. It would be tricky to merge while checking constraints if we are
> supporting more general constraints like in my proposal
> ( http://archives.postgresql.org/pgsql-hackers/2009-07/msg00302.php ).
>
> 2. Which indexes can be merged efficiently, and how much effort would it
> take to make this work?
>
> 3. A related issue: making indexes mergeable would be useful for bulk
> inserts as well.
>
> 4. At the end of the command, the index needs to work, meaning that
> queries would have to search two indexes. That may be difficult (but
> check the GIN fast insert code, which does something similar).
>
> 5. The temporary index still can't be enforcing constraints if they are
> deferred, so it won't solve all the issues here.
>
> Regards,
> Jeff Davis
>
