Re: Validating CHECK constraints with SPI

Lists: pgsql-hackers
From: Dan Robinson <dan(at)drob(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Validating CHECK constraints with SPI
Date: 2014-10-29 12:24:38
Message-ID: CAKE9wfY8ADFokEGbihWLb2Ur6+TjxMpcZgZVempgjDuQDQ5csA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all,

If I'm reading correctly in src/backend/commands/tablecmds.c, it looks like
PostgreSQL does a full table scan in validateCheckConstraint and in the
constraint validation portion of ATRewriteTable.

Since the table is locked to updates while the constraint is validating,
this means you have to jump through hoops if you want to add a CHECK
constraint to a large table in a production setting. This validation could
be considerably faster if we enabled it to use relevant indexes or other
constraints. Is there a reason not to make an SPI call here, instead?

This would make it possible to do something like:
postgres=# CREATE INDEX CONCURRENTLY foo_temp_idx ON my_table (id) WHERE
foo = 'bar';
postgres=# SELECT COUNT(*) FROM my_table WHERE foo = 'bar'; -- Make sure
this is 0.
postgres=# ALTER TABLE my_table ADD CONSTRAINT my_check CHECK (foo !=
'bar');
postgres=# DROP INDEX foo_temp_idx;

The third step here would be fast, because it would be able to use
foo_temp_idx under the hood. Additionally, it would be easy to get a sense
for how long this operation will lock your table by timing the query in the
second step. (I suppose the latter is true already if you do the same with
enable_indexscan off, but that requires knowing that PostgreSQL is going to
do the seq scan no matter what.)

Would y'all be open to a patch that made this change?

Best,
-Dan


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Dan Robinson <dan(at)drob(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Validating CHECK constraints with SPI
Date: 2014-10-29 14:17:40
Message-ID: 20141029141740.GF1791@alvin.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dan Robinson wrote:
> Hi all,
>
> If I'm reading correctly in src/backend/commands/tablecmds.c, it looks like
> PostgreSQL does a full table scan in validateCheckConstraint and in the
> constraint validation portion of ATRewriteTable.
>
> Since the table is locked to updates while the constraint is validating,
> this means you have to jump through hoops if you want to add a CHECK
> constraint to a large table in a production setting. This validation could
> be considerably faster if we enabled it to use relevant indexes or other
> constraints. Is there a reason not to make an SPI call here, instead?

I don't think SPI would help you here. But I think you would like to
add the constraint as NOT VALID and then do an ALTER TABLE .. VALIDATE
CONSTRAINT command afterwards. In 9.4, this doesn't require
AccessExclusive lock on the table.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dan Robinson <dan(at)drob(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Validating CHECK constraints with SPI
Date: 2014-10-29 14:24:26
Message-ID: 18229.1414592666@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dan Robinson <dan(at)drob(dot)us> writes:
> Since the table is locked to updates while the constraint is validating,
> this means you have to jump through hoops if you want to add a CHECK
> constraint to a large table in a production setting. This validation could
> be considerably faster if we enabled it to use relevant indexes or other
> constraints. Is there a reason not to make an SPI call here, instead?

This seems like a lot of work for a gain that would only occur sometimes,
ie if the CHECK happened to correspond to a usable index condition.
I realize your point is that a clever DBA might intentionally create
such an index, but I don't think that people would bother in practice.
It's not any simpler, nor faster, than using the existing approach with
ALTER TABLE ADD CONSTRAINT NOT VALID followed by ALTER TABLE VALIDATE
CONSTRAINT.

regards, tom lane


From: Dan Robinson <dan(at)drob(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Validating CHECK constraints with SPI
Date: 2014-10-30 02:20:10
Message-ID: CAKE9wfbejOdDcZdEb8wXzePAeuQ5Ce_=y6iEqSfgs7R7RAYHQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 29, 2014 at 7:17 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> Dan Robinson wrote:
> > Hi all,
> >
> > If I'm reading correctly in src/backend/commands/tablecmds.c, it looks
> like
> > PostgreSQL does a full table scan in validateCheckConstraint and in the
> > constraint validation portion of ATRewriteTable.
> >
> > Since the table is locked to updates while the constraint is validating,
> > this means you have to jump through hoops if you want to add a CHECK
> > constraint to a large table in a production setting. This validation
> could
> > be considerably faster if we enabled it to use relevant indexes or other
> > constraints. Is there a reason not to make an SPI call here, instead?
>
> I don't think SPI would help you here. But I think you would like to
> add the constraint as NOT VALID and then do an ALTER TABLE .. VALIDATE
> CONSTRAINT command afterwards. In 9.4, this doesn't require
> AccessExclusive lock on the table.

Interesting! I hadn't seen the patch that makes ALTER TABLE ... VALIDATE
CONSTRAINT require only ShareUpdateExclusive. Very cool.

Yes, that makes this change totally unnecessary.

-Dan


From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dan Robinson <dan(at)drob(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Validating CHECK constraints with SPI
Date: 2014-10-30 03:48:40
Message-ID: 20141030034840.GA407075@tornado.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 29, 2014 at 10:24:26AM -0400, Tom Lane wrote:
> Dan Robinson <dan(at)drob(dot)us> writes:
> > Since the table is locked to updates while the constraint is validating,
> > this means you have to jump through hoops if you want to add a CHECK
> > constraint to a large table in a production setting. This validation could
> > be considerably faster if we enabled it to use relevant indexes or other
> > constraints. Is there a reason not to make an SPI call here, instead?
>
> This seems like a lot of work for a gain that would only occur sometimes,
> ie if the CHECK happened to correspond to a usable index condition.
> I realize your point is that a clever DBA might intentionally create
> such an index, but I don't think that people would bother in practice.

Consider the case of adding a NOT NULL constraint. Most single-column btree
indexes can quickly determine whether the column contains nulls, so the DBA
may well get the benefit on the strength of an already-present index.

> It's not any simpler, nor faster, than using the existing approach with
> ALTER TABLE ADD CONSTRAINT NOT VALID followed by ALTER TABLE VALIDATE
> CONSTRAINT.

There will be no point in building a throwaway index for this, agreed.