[Review] Include detailed information about a row failing a CHECK constraint into the error message

Lists: pgsql-hackers
From: José Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-09 23:48:09
Message-ID: CA+soXCMJZ7qu5CZ8LDHHjjv53CNrbeL-=x2zzXmyi2LbwwkT2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Hi,
> when I insert/update many rows at once using INSERT ... SELECT into a
> table which has plenty of CHECK constraints, the error message that
> Postgres returns has no indication of which row failed the constraint
> check. The attached patch tries to provide information in a similar way
> to how duplicate items in a UNIQUE constraint are handled.

> Originally, I tried to simply check the new row's t_ctid, but it was
> always (0,0) -- I guess that's expected, maybe it's still in memory at
> that time and maybe such nodes don't have a ctid assigned yet.

> Please let me know if this patch is suitable for inclusion. It's based
> on REL9_0_STABLE, because that's the version I'm running.

> I'd like to thank intgr on IRC for his feedback when I was wondering
> about the t_ctid.

> With kind regards,
> Jan

Hi Jan / all.

I'm looking for a simple patch to review and this one doesn't look too
complicate.

The patch seens to be useful, it adds a better feedback.

First, I couldn't apply it as in the email, even in REL9_0_STABLE: the
offset doesn't look right. Which commit are your repository in?

Anyway, I could copy / paste it at the correct place, using the
current master. I could compile it, put a postgres with it running and
it's working:

postgres=# create table test1(id serial primary key, value text);
NOTICE:  CREATE TABLE will create implicit sequence "test1_id_seq" for
serial column "test1.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"test1_pkey" for table "test1"
CREATE TABLE
postgres=# ALTER TABLE test1 ADD CONSTRAINT must_be_unique unique (value);
NOTICE:  ALTER TABLE / ADD UNIQUE will create implicit index
"must_be_unique" for table "test1"
ALTER TABLE
postgres=# insert into test1 values (default, 'Hello World');
INSERT 0 1
postgres=# insert into test1 values (default, 'Hello World');
ERROR:  duplicate key value violates unique constraint "must_be_unique"
DETAIL:  Key (value)=(Hello World) already exists.

The patch I've used:

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index fd7a9ed..57894cf 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1574,10 +1574,32 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
const char *failed;

if ((failed = ExecRelCheck(resultRelInfo, slot, estate)) != NULL)
+ {
+ StringInfoData buf;
+ int natts = rel->rd_att->natts;
+ int i;
+ initStringInfo(&buf);
+ for (i = 0; i < natts; ++i)
+ {
+ char *val;
+ Oid foutoid;
+ bool typisvarlena;
+ getTypeOutputInfo(rel->rd_att->attrs[i]->atttypid, &foutoid,
&typisvarlena);
+ if (slot->tts_isnull[i])
+ val = "NULL";
+ else
+ val = OidOutputFunctionCall(foutoid, slot->tts_values[i]);
+ if (i > 0)
+ appendStringInfoString(&buf, ", ");
+ appendStringInfoString(&buf, val);
+ }
ereport(ERROR,
(errcode(ERRCODE_CHECK_VIOLATION),
errmsg("new row for relation \"%s\" violates check constraint \"%s\"",
- RelationGetRelationName(rel), failed)));
+ RelationGetRelationName(rel), failed),
+ errdetail("New row with data (%s) violates check constraint \"%s\".",
+ buf.data, failed)));
+ }
}
}

--
José Arthur Benetasso Villanova


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: José Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 02:47:56
Message-ID: CA+TgmoYMWSG0LixqgyfFuE8tfJ8ToA=3+hzCXs5A5tHN4o2U=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/11/9 José Arthur Benetasso Villanova <jose(dot)arthur(at)gmail(dot)com>:
> postgres=# create table test1(id serial primary key, value text);
> NOTICE:  CREATE TABLE will create implicit sequence "test1_id_seq" for
> serial column "test1.id"
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
> "test1_pkey" for table "test1"
> CREATE TABLE
> postgres=# ALTER TABLE test1 ADD CONSTRAINT must_be_unique unique (value);
> NOTICE:  ALTER TABLE / ADD UNIQUE will create implicit index
> "must_be_unique" for table "test1"
> ALTER TABLE
> postgres=# insert into test1 values (default, 'Hello World');
> INSERT 0 1
> postgres=# insert into test1 values (default, 'Hello World');
> ERROR:  duplicate key value violates unique constraint "must_be_unique"
> DETAIL:  Key (value)=(Hello World) already exists.

It does this already, without this patch. This patch is about CHECK
constraints, not UNIQUE ones.

I believe we've previously rejected patches along these lines on the
grounds that the output could realistically be extremely long.
Imagine that you have a table with an integer primary key column and a
text column. The integer column has a check constraint on it. But
the text column might contain a kilobyte, or a megabyte, or even a
gigabyte worth of text, and we don't necessarily want to spit that all
out on an error. For unique constraints, we only emit the values that
are part of the constraint, which in most cases will be relatively
short (if they're more than 8kB, they won't fit into an index block);
but here the patch wants to dump the whole tuple, and that could be
really big.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jan Kundrát <kundratj(at)fzu(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 09:25:09
Message-ID: 4EBB9875.5000105@fzu.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/10/11 00:48, José Arthur Benetasso Villanova wrote:
> First, I couldn't apply it as in the email, even in REL9_0_STABLE: the
> offset doesn't look right. Which commit are your repository in?

Hi Jose, thanks for looking at the patch. It's based on
b07b2bdc570cfbb39564c8a70783dbce1edcb3d6, which was REL9_0_STABLE at the
time I made the change.

Cheers,
Jan


From: Jan Kundrát <jkt(at)gentoo(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 10:29:20
Message-ID: 4EBBA780.2010305@gentoo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi José and Robert, thanks for your time and a review. Comments below.

On 11/10/11 03:47, Robert Haas wrote:
> It does this already, without this patch. This patch is about CHECK
> constraints, not UNIQUE ones.

That's right. This is how to check what the patch changes:

jkt=> CREATE TABLE tbl (name TEXT PRIMARY KEY, a INTEGER CHECK (a>0));
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"tbl_pkey" for table "tbl"
CREATE TABLE
jkt=> INSERT INTO tbl (name, a) VALUES ('x', 10);
INSERT 0 1
jkt=> UPDATE tbl SET a = -a;
ERROR: new row for relation "tbl" violates check constraint "tbl_a_check"
DETAIL: New row with data (x, -10) violates check constraint "tbl_a_check".

The last line, the detailed error message, is added by the patch.

> I believe we've previously rejected patches along these lines on the
> grounds that the output could realistically be extremely long.
> Imagine that you have a table with an integer primary key column and a
> text column. The integer column has a check constraint on it. But
> the text column might contain a kilobyte, or a megabyte, or even a
> gigabyte worth of text, and we don't necessarily want to spit that all
> out on an error. For unique constraints, we only emit the values that
> are part of the constraint, which in most cases will be relatively
> short (if they're more than 8kB, they won't fit into an index block);
> but here the patch wants to dump the whole tuple, and that could be
> really big.

That's an interesting thought. I suppose the same thing is an issue with
unique keys, but they tend to not be created over huge columns, so it
isn't really a problem, right?

Would you object to a patch which outputs just the first 8kB of each
column? Having at least some form of context is very useful in my case.

(And as a side note, I'm not really familiar with Postgres' internals,
so it took me roughly six hours to arrive to a working patch from the
very start. I'd therefore welcome pointers about the best way to achieve
that with Postgres' string stream interface.)

With kind regards,
Jan

--
Trojita, a fast e-mail client -- http://trojita.flaska.net/


From: Jan Kundrát <jkt(at)flaska(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 10:40:58
Message-ID: 4EBBAA3A.10802@flaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi José and Robert, thanks for your time and a review. Comments below.

On 11/10/11 03:47, Robert Haas wrote:
> It does this already, without this patch. This patch is about CHECK
> constraints, not UNIQUE ones.

That's right. This is how to check what the patch changes:

jkt=> CREATE TABLE tbl (name TEXT PRIMARY KEY, a INTEGER CHECK (a>0));
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index
"tbl_pkey" for table "tbl"
CREATE TABLE
jkt=> INSERT INTO tbl (name, a) VALUES ('x', 10);
INSERT 0 1
jkt=> UPDATE tbl SET a = -a;
ERROR: new row for relation "tbl" violates check constraint "tbl_a_check"
DETAIL: New row with data (x, -10) violates check constraint "tbl_a_check".

The last line, the detailed error message, is added by the patch.

> I believe we've previously rejected patches along these lines on the
> grounds that the output could realistically be extremely long.
> Imagine that you have a table with an integer primary key column and a
> text column. The integer column has a check constraint on it. But
> the text column might contain a kilobyte, or a megabyte, or even a
> gigabyte worth of text, and we don't necessarily want to spit that all
> out on an error. For unique constraints, we only emit the values that
> are part of the constraint, which in most cases will be relatively
> short (if they're more than 8kB, they won't fit into an index block);
> but here the patch wants to dump the whole tuple, and that could be
> really big.

That's an interesting thought. I suppose the same thing is an issue with
unique keys, but they tend to not be created over huge columns, so it
isn't really a problem, right?

Would you object to a patch which outputs just the first 8kB of each
column? Having at least some form of context is very useful in my case.

(And as a side note, I'm not really familiar with Postgres' internals,
so it took me roughly six hours to arrive to a working patch from the
very start. I'd therefore welcome pointers about the best way to achieve
that with Postgres' string stream interface.)

With kind regards,
Jan

--
Trojita, a fast e-mail client -- http://trojita.flaska.net/


From: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
To: Jan Kundrát <jkt(at)flaska(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, jose(dot)arthur(at)gmail(dot)com
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 11:41:00
Message-ID: CAHHcreq_s7TunxZw4N=62XrYNSHgwE_Agi6qFqETcXR0+CTpTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/11/10 Jan Kundrát <jkt(at)flaska(dot)net>:
> On 11/10/11 03:47, Robert Haas wrote:
>> It does this already, without this patch.  This patch is about CHECK
>> constraints, not UNIQUE ones.
>
> That's right. This is how to check what the patch changes:
>
> jkt=> CREATE TABLE tbl (name TEXT PRIMARY KEY, a INTEGER CHECK (a>0));
> NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
> "tbl_pkey" for table "tbl"
> CREATE TABLE
> jkt=> INSERT INTO tbl (name, a) VALUES ('x', 10);
> INSERT 0 1
> jkt=> UPDATE tbl SET a = -a;
> ERROR:  new row for relation "tbl" violates check constraint "tbl_a_check"
> DETAIL:  New row with data (x, -10) violates check constraint "tbl_a_check".
>
> The last line, the detailed error message, is added by the patch.

The patch uses 'New row with data ....' but it was an UPDATE, if you
go further with this patch, IMO the message should be fixed too.

--
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://www.postgresql.org.br


From: Jan Kundrát <jkt(at)flaska(dot)net>
To: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, jose(dot)arthur(at)gmail(dot)com
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 11:46:36
Message-ID: 4EBBB99C.70306@flaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/10/11 12:41, Dickson S. Guedes wrote:
>> jkt=> UPDATE tbl SET a = -a;
>> ERROR: new row for relation "tbl" violates check constraint "tbl_a_check"
>> DETAIL: New row with data (x, -10) violates check constraint "tbl_a_check".
>>
>> The last line, the detailed error message, is added by the patch.
>
> The patch uses 'New row with data ....' but it was an UPDATE, if you
> go further with this patch, IMO the message should be fixed too.

I'm not sure whether the code can determine whether the check gets
triggered by an UPDATE or an INSERT (both commands lead to this code
path). Please note that the already-existing error message (the "ERROR:
" line in the output I enclosed) already uses the phrase "new row".

That said, I'll of course be more than happy to include whatever string
which fits better, and am open to any suggestions.

Cheers,
Jan

--
Trojita, a fast e-mail client -- http://trojita.flaska.net/


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jan Kundrát <jkt(at)flaska(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 12:04:52
Message-ID: CA+TgmoaQcsNcMyP5-HMjx1CTY1CALocFCZQ+aonV5C77Y3QZ8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 10, 2011 at 5:40 AM, Jan Kundrát <jkt(at)flaska(dot)net> wrote:
> That's an interesting thought. I suppose the same thing is an issue with
> unique keys, but they tend to not be created over huge columns, so it
> isn't really a problem, right?

Pretty much.

> Would you object to a patch which outputs just the first 8kB of each
> column? Having at least some form of context is very useful in my case.

Well, if we're going to try to emit some context here, I'd suggest
that we try to output only the columns implicated in the CHECK
constraint, rather than the whole tuple. I'm not sure whether
emitting only a certain amount of output (either total, or for each
column) can be made to work nicely, or whether the feature overall is
something we want. It seems like a trade-off between possibly useful
context and possibly annoying log clutter, and I guess I don't have a
strong opinion on which way to go with it.

Anyone else have an opinion?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jan Kundrát <jkt(at)flaska(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 13:40:14
Message-ID: 4EBBD43E.1070907@flaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/10/11 13:04, Robert Haas wrote:
> Well, if we're going to try to emit some context here, I'd suggest
> that we try to output only the columns implicated in the CHECK
> constraint, rather than the whole tuple. I'm not sure whether
> emitting only a certain amount of output (either total, or for each
> column) can be made to work nicely, or whether the feature overall is
> something we want. It seems like a trade-off between possibly useful
> context and possibly annoying log clutter, and I guess I don't have a
> strong opinion on which way to go with it.

OK, let me start with some background on why I actually want to have
such a feature. The project which we're working on [1] (and [2] for
some context about why the hell we bother) allows users to define layout
of their DB tables using standard CREATE TABLE ... stanzas, including
various triggers, check constraints etc etc. What our project does is
generating plenty of stored procedures which essentially built a
version-control infrastructure around the user-specified table layout.

Our workflow utilizes something similar to the concept of a working copy
in Subversion. It means that any modifications that users perform are
executed on an extra table (the history one) which does not enforce any
user-specified constraints. It's only at the time of a commit, where
data is moved by `UPDATE tabl SELECT ... FROM tbl_history where revision
= $pending_changeset` to its final destination and all the checks,
triggers and constraints are enforced.

The issue which we've hit is that when the user has specified a CHECK
constraint and tries to save many rows at once, we don't have any
information about what went wrong besides the name of the check which
failed. It's better than nothing, but given that Pg provides very
similar information for UNIQUE columns, it looked like a good feature to
implement.

What I want to find in the end is something which tells me "this row
causes the error". Unfortunately, as the new row of the table with the
constraint is not yet on disk, it doesn't really have its own ctid, and
therefore I cannot report that. (Which makes sense, obviously.) I also
realize that our use case is a bit esoteric and very far from the
mainstream Postgres applications, but I believe that simply having
detailed error messages is a good thing overall. Of course it's clearly
possible that we're doing it completely wrong, so if someone has a
suggestion or would like to chat about that, I'm all ears (feel free to
go off-list here).

Now I realize that there might be some concerns about error log
cluttering etc. On the other hand, I'd take it for granted that it's a
good idea to include at least *some* context in the error messages (and
I assume that's what the detail field is for). If it's acceptable for
UNIQUE constraints to show the index values (which are enough to
identify the troublesome row), it seems to me that extending this to
CHECKs is a natural further development and leads to better consistency.

As I've said earlier, I'm not at all familiar with Postgres' internals,
so before I go ahead and spend another night finding out how to look at
the table/check metadata and print just the columns which are referenced
by a CHECK, if that's even possible, I'd like to know whether such a
patch would be welcome and accepted or not :).

Again, a big thank you for your review -- it's much appreciated.

Cheers,
Jan

[1] https://projects.flaska.net/projects/deska
[2]
https://projects.flaska.net/attachments/download/74/2011-11-10-deska-18e4c5b.pdf

--
Trojita, a fast e-mail client -- http://trojita.flaska.net/


From: Kääriäinen Anssi <anssi(dot)kaariainen(at)thl(dot)fi>
To: Jan Kundrát <jkt(at)flaska(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 14:27:57
Message-ID: BC19EF15D84DC143A22D6A8F2590F0A78864133112@EXMAIL.stakes.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


"""
What I want to find in the end is something which tells me "this row
causes the error". Unfortunately, as the new row of the table with the
constraint is not yet on disk, it doesn't really have its own ctid, and
therefore I cannot report that. (Which makes sense, obviously.)
"""

Would an error with the row's PK value be useful? Something like "row
with primary key 'pk_val' fails check 'foo_check'". That would be limited
in size, yet give some context.

There are two problems I can see:
- The PK value doesn't necessarily identify the row in any useful
manner (SERIAL primary key in INSERT).
- The table might lack PK constraint (skip the detail in this case?)

- Anssi


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jan Kundrát <jkt(at)flaska(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 15:05:51
Message-ID: 7717.1320937551@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Nov 10, 2011 at 5:40 AM, Jan Kundrt <jkt(at)flaska(dot)net> wrote:
>> Would you object to a patch which outputs just the first 8kB of each
>> column? Having at least some form of context is very useful in my case.

> Well, if we're going to try to emit some context here, I'd suggest
> that we try to output only the columns implicated in the CHECK
> constraint, rather than the whole tuple.

I think that's likely to be impractical, or at least much more trouble
than the feature is worth. Also, if you might emit only a subset of
columns, then you have to label them, a la the FK error messages:
Key (x,y,z) = (this,that,theother)
That's going to make the line length problem worse not better.

I concur with just length-limiting the dumped values, and in fact would
prefer a limit much more draconian than 8K. Don't we limit the key
lengths to 1K or so in FK and unique-key messages? If the goal is to
identify the problematic line, I would think that a few dozen bytes per
column would be plenty.

> I'm not sure whether
> emitting only a certain amount of output (either total, or for each
> column) can be made to work nicely, or whether the feature overall is
> something we want. It seems like a trade-off between possibly useful
> context and possibly annoying log clutter, and I guess I don't have a
> strong opinion on which way to go with it.

I agree with Jan that this is probably useful; I'm pretty sure there
have been requests for it before. We just have to make sure that the
length of the message stays in bounds.

One tip for keeping the length down: there is no value in repeating
information from the primary error message, such as the name of the
constraint.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jan Kundrát <jkt(at)flaska(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 15:21:45
Message-ID: CA+TgmoYye7_LDjSpSF0ze+ZhOj8YNp58Qh1jM2EutDOKMeVfyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 10, 2011 at 10:05 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Thu, Nov 10, 2011 at 5:40 AM, Jan Kundrát <jkt(at)flaska(dot)net> wrote:
>>> Would you object to a patch which outputs just the first 8kB of each
>>> column? Having at least some form of context is very useful in my case.
>
>> Well, if we're going to try to emit some context here, I'd suggest
>> that we try to output only the columns implicated in the CHECK
>> constraint, rather than the whole tuple.
>
> I think that's likely to be impractical, or at least much more trouble
> than the feature is worth.  Also, if you might emit only a subset of
> columns, then you have to label them, a la the FK error messages:
>        Key (x,y,z) = (this,that,theother)
> That's going to make the line length problem worse not better.

Depends. A lot of CHECK constraints may only reference one column:
CHECK (a > 0). The whole record is likely to be a lot longer than
(a)=(-32768), and frankly tuples without column names aren't that
readable anyway.

I'd argue that to some degree, CHECK constraints, like UNIQUE
constraints, probably tend to be placed primarily on relatively short
columns. Now, UNIQUE constraints have a hard limitation, because a
too-large value won't fit into an index block. And certainly you
could do CHECK (document_is_valid_json(mumbleblump)). But many things
that contain large amounts of text will just be free text fields, they
won't be part of any constraint, and including them will just make
things unreadable.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jan Kundrát <jkt(at)flaska(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 15:37:41
Message-ID: 8367.1320939461@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Nov 10, 2011 at 10:05 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Well, if we're going to try to emit some context here, I'd suggest
>>> that we try to output only the columns implicated in the CHECK
>>> constraint, rather than the whole tuple.

>> I think that's likely to be impractical, or at least much more trouble
>> than the feature is worth. Also, if you might emit only a subset of
>> columns, then you have to label them, a la the FK error messages:
>> Key (x,y,z) = (this,that,theother)
>> That's going to make the line length problem worse not better.

> Depends. A lot of CHECK constraints may only reference one column:
> CHECK (a > 0). The whole record is likely to be a lot longer than
> (a)=(-32768), and frankly tuples without column names aren't that
> readable anyway.

Well, the other concern here is: how much context does it take to
identify the problematic row? It's entirely likely that showing only
the value of "a" isn't enough to solve the user's problem anyhow.
So I think the argument for showing a subset of columns is quite weak.

regards, tom lane


From: Jan Kundrát <jkt(at)flaska(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-10 15:56:32
Message-ID: 4EBBF430.3090904@flaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/10/11 16:05, Tom Lane wrote:
> I agree with Jan that this is probably useful; I'm pretty sure there
> have been requests for it before. We just have to make sure that the
> length of the message stays in bounds.
>
> One tip for keeping the length down: there is no value in repeating
> information from the primary error message, such as the name of the
> constraint.

Thanks to your comments and suggestions, I appreciate the time of the
reviewers.

Attached is a second version of this patch which keeps the size of the
output at 64 characters per column (which is an arbitrary value defined
as a const int, which I hope matches your style). Longer values have
their last three characters replaced by "...", so there's no way to
distinguish them from a legitimate string that ends with just that.
There's also no escaping of special-string values, similar to how the
BuildIndexValueDescription operates.

Cheers,
Jan

--
Trojita, a fast e-mail client -- http://trojita.flaska.net/

Attachment Content-Type Size
context_in_check_constraints-v2.patch text/plain 1.4 KB

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jan Kundrát <jkt(at)gentoo(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-13 08:50:39
Message-ID: CAFj8pRDpd60=JhMZPHjOM=r-rARfuwngL+_n0ExRU1L+WPrEdw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello
>
> (And as a side note, I'm not really familiar with Postgres' internals,
> so it took me roughly six hours to arrive to a working patch from the
> very start. I'd therefore welcome pointers about the best way to achieve
> that with Postgres' string stream interface.)
>

http://www.pgsql.cz/index.php/C_a_PostgreSQL_-_intern%C3%AD_mechanismy

Regards

Pavel

> With kind regards,
> Jan
>
> --
> Trojita, a fast e-mail client -- http://trojita.flaska.net/
>
>


From: Royce Ausburn <royce(dot)ml(at)inomial(dot)com>
To: Jan Kundrát <jkt(at)flaska(dot)net>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: [Review] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-16 22:13:33
Message-ID: A48A1365-4788-45F9-9854-EE8CD476445E@inomial.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


The patch applies cleanly to the current git master and is in context diff format.

The patch fails the regression tests because it is outputting new DETAIL line which four of tests aren't expecting. The tests will need to be updated.

Functionality:
The patch works as advertised. An insert or update that fails a check condition indeed logs the row that has failed:

test=# create table test (
test(# a integer,
test(# b integer CHECK (b > 2),
test(# c text,
test(# d integer
test(#
test(#
test(# );
CREATE TABLE
test=#
test=# insert into test select 1, 2, 'Test', 4;
ERROR: new row for relation "test" violates check constraint "test_b_check"
DETAIL: Failing row: (1, 2, Test, 4).

One comment I have on the output is that strings are not in quotes. It's a little jarring, but might not be that big a deal. A contrived case that is pretty confusing:

test=# insert into test select 1, 2, '3, 4', 4;
ERROR: new row for relation "test" violates check constraint "test_b_check"
DETAIL: Failing row: (1, 2, 3, 4, 4).

A select inserting 4 columns seemingly results in a 5 column row ;)

Another super minor thing, postgres doesn't seem to put periods at the end of log messages, yet this new detail line does.

Code

I'm no C or postgres expert, but the code looks okay to me.

Attached is a small script I used to test/demo the functionality.

--Royce

On 11/11/2011, at 2:56 AM, Jan Kundrát wrote:

> On 11/10/11 16:05, Tom Lane wrote:
>> I agree with Jan that this is probably useful; I'm pretty sure there
>> have been requests for it before. We just have to make sure that the
>> length of the message stays in bounds.
>>
>> One tip for keeping the length down: there is no value in repeating
>> information from the primary error message, such as the name of the
>> constraint.
>
> Thanks to your comments and suggestions, I appreciate the time of the
> reviewers.
>
> Attached is a second version of this patch which keeps the size of the
> output at 64 characters per column (which is an arbitrary value defined
> as a const int, which I hope matches your style). Longer values have
> their last three characters replaced by "...", so there's no way to
> distinguish them from a legitimate string that ends with just that.
> There's also no escaping of special-string values, similar to how the
> BuildIndexValueDescription operates.
>
> Cheers,
> Jan
>
> --
> Trojita, a fast e-mail client -- http://trojita.flaska.net/
> <context_in_check_constraints-v2.patch>


From: Jan Kundrát <jkt(at)flaska(dot)net>
To: Royce Ausburn <royce(dot)ml(at)inomial(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Review] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-21 13:59:59
Message-ID: 4ECA595F.6060707@flaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/16/11 23:13, Royce Ausburn wrote:
> The patch fails the regression tests because it is outputting new DETAIL
> line which four of tests aren't expecting. The tests will need to be
> updated.

Hi Royce, thanks for your time which you've put into this review.

What is the suggested way to go form here? Shall I update the unit tests?

> One comment I have on the output is that strings are not in quotes.
> It's a little jarring, but might not be that big a deal. A contrived
> case that is pretty confusing:
>
> test=# insert into test select 1, 2, '3, 4', 4;
> ERROR: new row for relation "test" violates check constraint "test_b_check"
> DETAIL: Failing row: (1, 2, 3, 4, 4).
>
> A select inserting 4 columns seemingly results in a 5 column row ;)

Yes, I agree that the unescaped format of strings leads to ambiguous
results here. The code was copy-pasted from the checks which handle the
UNIQUE constraints, so if there's an obvious improvement, it should
probably be applied in there as well.

> Another super minor thing, postgres doesn't seem to put periods at the
> end of log messages, yet this new detail line does.

Again, I'm not familiar with the correct procedure. Shall I send a
revised patch for this one?

With kind regards,
Jan

--
Trojita, a fast e-mail client -- http://trojita.flaska.net/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Kundrát <jkt(at)flaska(dot)net>
Cc: Royce Ausburn <royce(dot)ml(at)inomial(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Review] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-21 14:52:56
Message-ID: 9066.1321887176@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

=?UTF-8?B?SmFuIEt1bmRyw6F0?= <jkt(at)flaska(dot)net> writes:
> On 11/16/11 23:13, Royce Ausburn wrote:
>> Another super minor thing, postgres doesn't seem to put periods at the
>> end of log messages, yet this new detail line does.

> Again, I'm not familiar with the correct procedure. Shall I send a
> revised patch for this one?

Please read the message style guide (we do have one)
http://developer.postgresql.org/pgdocs/postgres/error-style-guide.html

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jan Kundrát <jkt(at)flaska(dot)net>
Cc: Royce Ausburn <royce(dot)ml(at)inomial(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Review] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-23 22:05:07
Message-ID: CA+TgmoZUOYgxH1UXEcfKd=LFx4a7jomHN==Dn_ahz-ykEDqKdA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 21, 2011 at 8:59 AM, Jan Kundrát <jkt(at)flaska(dot)net> wrote:
> What is the suggested way to go form here? Shall I update the unit tests?

Yes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Royce Ausburn <royce(dot)ml(at)inomial(dot)com>
Cc: Jan Kundrát <jkt(at)flaska(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Review] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-29 17:51:55
Message-ID: 23481.1322589115@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Royce Ausburn <royce(dot)ml(at)inomial(dot)com> writes:
> One comment I have on the output is that strings are not in quotes. It's a little jarring, but might not be that big a deal. A contrived case that is pretty confusing:

> test=# insert into test select 1, 2, '3, 4', 4;
> ERROR: new row for relation "test" violates check constraint "test_b_check"
> DETAIL: Failing row: (1, 2, 3, 4, 4).

> A select inserting 4 columns seemingly results in a 5 column row ;)

FWIW, I don't think it's the province of this patch to resolve this
issue, since as Jan noted, unique-index violation reports do it the same
way. I think for the moment we should have it emit the same unquoted
format that BuildIndexValueDescription does, with the exception of
truncating long field values. (BuildIndexValueDescription doesn't
currently worry about that since long values are much less likely in
index entries; though perhaps we should make it do so.)

If we decide that it's important to quote or escape values reported in
these messages, that should be the subject of a separate patch that also
changes BuildIndexValueDescription. But personally I doubt it's
necessary.

In short, I'm inclined to go ahead and apply this patch, after a bit of
cleanup to make it match our house style better and not break multibyte
characters.

regards, tom lane


From: Jan Kundrát <jkt(at)flaska(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Review] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-29 18:39:40
Message-ID: 4ED526EC.6060701@flaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/29/11 18:51, Tom Lane wrote:
> In short, I'm inclined to go ahead and apply this patch, after a bit of
> cleanup to make it match our house style better and not break multibyte
> characters.

Thanks a lot (and sorry for being lazy and not having updated the test
cases yet).

With kind regards,
Jan

--
Trojita, a fast e-mail client -- http://trojita.flaska.net/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Kundrát <jkt(at)flaska(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-29 20:09:59
Message-ID: 5747.1322597399@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

=?UTF-8?B?SmFuIEt1bmRyw6F0?= <jkt(at)flaska(dot)net> writes:
> Attached is a second version of this patch which keeps the size of the
> output at 64 characters per column (which is an arbitrary value defined
> as a const int, which I hope matches your style). Longer values have
> their last three characters replaced by "...", so there's no way to
> distinguish them from a legitimate string that ends with just that.
> There's also no escaping of special-string values, similar to how the
> BuildIndexValueDescription operates.

Applied with some revisions; notably, that I pulled the code out into a
separate subroutine so that it could be used for more than one thing.

I was wondering in particular whether it wouldn't be appropriate to
include the same errdetail in ExecConstraints's other check, the one
for null in not-null columns. Arguably a not-null check is just a
slightly optimized form of a CHECK constraint, and anyway if you think
you need row identification info for a CHECK failure I don't see why
you'd not want it for NOT NULL failure. Comments?

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <jkt(at)flaska(dot)net>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: [patch] Include detailed information about a row failing a CHECK constraint into the error message
Date: 2011-11-29 20:24:40
Message-ID: 4ED4EB2802000025000435F2@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I was wondering in particular whether it wouldn't be appropriate
> to include the same errdetail in ExecConstraints's other check,
> the one for null in not-null columns. Arguably a not-null check
> is just a slightly optimized form of a CHECK constraint, and
> anyway if you think you need row identification info for a CHECK
> failure I don't see why you'd not want it for NOT NULL failure.
> Comments?

Makes sense to me.

-Kevin