plpgsql: check domain constraints

Lists: pgsql-hackerspgsql-patches
From: Neil Conway <neilc(at)samurai(dot)com>
To: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: plpgsql: check domain constraints
Date: 2006-01-09 04:30:40
Message-ID: 1136781040.8718.24.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Attached is a patch that makes the following changes:

(1) refactor execQual.c to expose a function for checking an instance of
a domain value against a list of domain constraints

(2) adjust pl/pgsql to fetch the constraints associated with the
function's return value. Because this is expensive, the constraints are
fetched once per session, when the function is compiled. I then modified
pl/pgsql to check any applicable constraints on the return value of a
function before returning it to the backend.

(3) add some regression tests for #2

The patch does NOT add checking of domain constraints for other PLs, or
for intermediate values in PL/PgSQL -- I plan to take a look at doing
one or both of those soon.

I also made a few semi-related cleanups. In pl_exec.c, it seems to me
that estate.eval_econtext MUST be non-NULL during the guts of both
plpgsql_exec_trigger() and plpgsql_exec_function(). Therefore checking
that estate.eval_econtext is non-NULL when cleaning up is unnecessary
(line 649 and 417 in current sources, respectively), so I've removed the
checks. Am I missing something?

Barring any objections I'll apply this patch tomorrow some time.

-Neil

Attachment Content-Type Size
plpgsql_domain_constr-2.patch text/x-patch 10.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: plpgsql: check domain constraints
Date: 2006-01-09 04:56:52
Message-ID: 25708.1136782612@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Neil Conway <neilc(at)samurai(dot)com> writes:
> (2) adjust pl/pgsql to fetch the constraints associated with the
> function's return value. Because this is expensive, the constraints are
> fetched once per session, when the function is compiled.

We have gone out of our way to make sure that domain constraint checking
responds promptly (ie, within one query) to updates of the domain
definition. Caching at the session level in plpgsql would be a
significant regression from that, and I don't think it's acceptable.
If you had a way of invalidating the cache when needed, it'd be great
... but not without that.

> I also made a few semi-related cleanups. In pl_exec.c, it seems to me
> that estate.eval_econtext MUST be non-NULL during the guts of both
> plpgsql_exec_trigger() and plpgsql_exec_function(). Therefore checking
> that estate.eval_econtext is non-NULL when cleaning up is unnecessary
> (line 649 and 417 in current sources, respectively), so I've removed the
> checks. Am I missing something?

The code doesn't currently assume that, and it doesn't seem to me that
saving one simple if-test is a reason to add the assumption ...

regards, tom lane


From: Neil Conway <neilc(at)samurai(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: plpgsql: check domain constraints
Date: 2006-01-09 08:16:58
Message-ID: 1136794618.8851.10.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Sun, 2006-01-08 at 23:56 -0500, Tom Lane wrote:
> We have gone out of our way to make sure that domain constraint checking
> responds promptly (ie, within one query) to updates of the domain
> definition. Caching at the session level in plpgsql would be a
> significant regression from that, and I don't think it's acceptable.
> If you had a way of invalidating the cache when needed, it'd be great
> ... but not without that.

GetDomainConstraints() looks fairly expensive, so it would be nice to do
some caching. What would the best way to implement this be? I had
thought that perhaps the typcache would work, but there seems to be no
method to flush stale typcache data. Perhaps we could add support for
typcache invalidation (via a new sinval message), and then add the
domain constraint information to the typcache. Or is there an easier
way?

> > I also made a few semi-related cleanups. In pl_exec.c, it seems to me
> > that estate.eval_econtext MUST be non-NULL during the guts of both
> > plpgsql_exec_trigger() and plpgsql_exec_function(). Therefore checking
> > that estate.eval_econtext is non-NULL when cleaning up is unnecessary
> > (line 649 and 417 in current sources, respectively), so I've removed the
> > checks. Am I missing something?
>
> The code doesn't currently assume that, and it doesn't seem to me that
> saving one simple if-test is a reason to add the assumption ...

It's not a matter of "saving an if-test", it's a matter of code clarity.
Code ought to be consistent about whether any given pointer variable
might be NULL. This makes it easier for a programmer to tell if new code
ought to check for NULL-ness before using the pointer. (In fact, when
modifying plpgsql_exec_function() for this patch, I had to spend a few
minutes checking for the situations in which estate.eval_econtext might
be NULL.)

Some languages (e.g. ML) actually distinguish in the type system between
"references that might be NULL" and those that cannot be. That's not
possible in C, but consistently treating NULL-ness makes it easier to
determine this by hand.

-Neil


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCHES] plpgsql: check domain constraints
Date: 2006-01-09 18:20:21
Message-ID: 6542.1136830821@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Neil Conway <neilc(at)samurai(dot)com> writes:
> GetDomainConstraints() looks fairly expensive, so it would be nice to do
> some caching. What would the best way to implement this be? I had
> thought that perhaps the typcache would work, but there seems to be no
> method to flush stale typcache data. Perhaps we could add support for
> typcache invalidation (via a new sinval message), and then add the
> domain constraint information to the typcache. Or is there an easier
> way?

Yeah, I had been thinking of exactly the same thing a few months ago
after noting that GetDomainConstraints() can be pretty dang slow ---
it seemed to be a major bottleneck for Kevin Grittner here:
http://archives.postgresql.org/pgsql-hackers/2005-09/msg01135.php
Unfortunately the rest of that conversation was unintentionally
off-list, but we identified heavy use of pg_constraint_contypid_index
as the source of his issue, and I said

: Hmm. The only commonly-used code path I can see that would touch
: pg_constraint_contypid_index is GetDomainConstraints(), which would be
: called (once) during startup of a command that involves a CoerceToDomain
: expression node. So if the heavily-hit table has any domain-type
: columns, it's possible that a steady stream of inserts or updates
: could have kept that index tied up.
:
: It might be worth introducing a system cache that could be used to
: extract the constraints for a domain without going to the catalog for
: every single command. There's been very little work done to date on
: optimizing operations on domains :-(

The lack of typcache invalidation is something that will eventually
bite us in other ways, so we need to add the facility anyway. We
don't really need a "new sinval message", as an inval on the pg_type
row will serve perfectly well --- what we need is something comparable
to CacheInvalidateRelcache() to cause such a message to be sent when
we haven't actually changed the pg_type row itself.

Do you want to work on this? I can if you don't.

BTW, in connection with the lookup_rowtype_tupdesc fiasco, it's pretty
obvious that any data structure returned by this function will need
to be either copied or reference-counted.

regards, tom lane