8.0.0beta5 FailedAssertion (Crash) when casting composite types

Lists: pgsql-hackerspgsql-patches
From: kris(dot)shannon(at)gmail(dot)com
To: Postgresql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: 8.0.0beta5 FailedAssertion (Crash) when casting composite types
Date: 2004-12-03 21:31:21
Message-ID: bf38a9f04120313313955d538@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

template2=# SELECT version();
version
----------------------------------------------------------------------------------------------
PostgreSQL 8.0.0beta5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
3.4.2 (Debian 3.4.2-3)
(1 row)
template2=# CREATE TABLE base (i integer);
CREATE TABLE
template2=# CREATE TABLE derived () INHERITS (base);
CREATE TABLE
template2=# INSERT INTO derived (i) VALUES (0);
INSERT 0 1
template2=# SELECT derived::base FROM derived;

TRAP: FailedAssertion("!(typeId == (
(olddata)->t_choice.t_datum.datum_typeid ))", File: "tuptoaster.c",
Line: 830)

Ouch!

The actual case that I discovered this on had extra columns in the
derived table and I was expecting some sort of error message (and I
guess I got one - just a little more extreme than I was expecting)

--
Kris Shannon <kris(dot)shannon(at)gmail(dot)com>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kris Shannon <kris(dot)shannon(at)gmail(dot)com>
Cc: Postgresql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.0.0beta5 FailedAssertion (Crash) when casting composite types
Date: 2004-12-05 22:42:10
Message-ID: 19420.1102286530@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

kris(dot)shannon(at)gmail(dot)com writes:
> template2=# CREATE TABLE base (i integer);
> template2=# CREATE TABLE derived () INHERITS (base);
> template2=# INSERT INTO derived (i) VALUES (0);
> template2=# SELECT derived::base FROM derived;
> TRAP: FailedAssertion

The cause of this failure is that parse_coerce.c thinks that a child
table's rowtype is binary-compatible with its parent's rowtype:

if (typeInheritsFrom(inputTypeId, targetTypeId))
{
/*
* Input class type is a subclass of target, so nothing to do ---
* except relabel the type. This is binary compatibility for
* complex types.
*/
return (Node *) makeRelabelType((Expr *) node,
targetTypeId, -1,
cformat);
}

As of 8.0 this isn't really true anymore, because a tuple being used as
a rowtype Datum is supposed to carry its rowtype OID in a field of the
tuple header. A proper coercion would require replacing the type OID in
the header.

In a larger sense, however, this code has been wrong since day one.
A child table must have all the same columns its parent does, but it
need not have them in the same column positions. This can happen after
ALTER TABLE ADD COLUMN on the parent, and it can happen even without
ALTER TABLE by use of multiple inheritance --- columns inherited from a
non-first parent are very unlikely to be in the same positions as they
are in that parent. If the columns aren't in the same positions then
the rowtypes certainly aren't binary-compatible. So at least some
flavors of the bug go back a long time, probably to Berkeley days.
For instance, this dumps core in all versions back to at least 7.1
(though you need to adjust the syntax before 7.4):

regression=# create table p1(ff1 int);
CREATE TABLE
regression=# create table p2(f1 text);
CREATE TABLE
regression=# create function p2text(p2) returns text as 'select $1.f1' language sql;
CREATE FUNCTION
regression=# create table c1(f3 int) inherits(p1,p2);
CREATE TABLE
regression=# insert into c1 values(123456789,'hi', 42);
INSERT 589036 1
regression=# select p2text(c1.*) from c1;
server closed the connection unexpectedly

(BTW, "select p2text(p2.*) from p2" works, at least in the last couple
versions, because the planner understands that it must convert child
rowtypes to match the parent. But that path isn't taken in Kris' example.)

The Really Clean And Correct fix to this, IMHO, would be to invent a new
expression node type that represents coercing a rowtype expression to a
different rowtype. Execution of this node would disassemble and
reassemble the tuple Datum, using code not too much different from
execJunk.c, to produce the right column order and the right rowtype OID
label. However that seems much too large a change for post-RC.
(You could also argue that it requires an initdb, though I'd take the
position that it doesn't because there are no working databases that
would be affected.)

A less clean but much more localized fix would be to cause ExecEvalExpr
to do that work when it finds a RelabelType node whose output type is a
composite type. (We could arrange to lookup the pg_type entry only once
per query, during ExecInitExpr, so the performance hit on normal uses of
RelabelType wouldn't be too bad.) A rough estimate is that this would
require about 100 lines of new code in execQual.c, much of which could
be adapted from other places.

Another possibility is to #ifdef out the code in parse_coerce.c that
thinks it can coerce one rowtype to another this way, causing attempts
to do rowtype coercions to fail with a "can't convert type" kind of error.
I've tried this and it does not seem to break any of the regression
tests, but I'm afraid that it will break cases that people are depending
on in practice. Still, given that we are in RC, maybe this is the only
acceptable answer for 8.0. If I were called on to close the hole in 7.4
and before, I'd say this is the only reasonable quick-fix for those
branches, for sure.

Choice #4 is to do nothing, figuring that a bug that has lasted this
long can wait another release cycle to be fixed. But in today's
security atmosphere I doubt that will be considered acceptable.

I definitely intend to do the "clean" fix in 8.1 after we branch,
but I am unsure what to do in 8.0, or in the back branches. Comments?

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kris Shannon <kris(dot)shannon(at)gmail(dot)com>
Cc: Postgresql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.0.0beta5 FailedAssertion (Crash) when casting composite types
Date: 2004-12-07 22:17:12
Message-ID: 27892.1102457832@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I wrote:
> The cause of this failure is that parse_coerce.c thinks that a child
> table's rowtype is binary-compatible with its parent's rowtype:
> ...

> The Really Clean And Correct fix to this, IMHO, would be to invent a new
> expression node type that represents coercing a rowtype expression to a
> different rowtype. Execution of this node would disassemble and
> reassemble the tuple Datum, using code not too much different from
> execJunk.c, to produce the right column order and the right rowtype OID
> label. However that seems much too large a change for post-RC.
> (You could also argue that it requires an initdb, though I'd take the
> position that it doesn't because there are no working databases that
> would be affected.)

> A less clean but much more localized fix would be to cause ExecEvalExpr
> to do that work when it finds a RelabelType node whose output type is a
> composite type. (We could arrange to lookup the pg_type entry only once
> per query, during ExecInitExpr, so the performance hit on normal uses of
> RelabelType wouldn't be too bad.) A rough estimate is that this would
> require about 100 lines of new code in execQual.c, much of which could
> be adapted from other places.

I developed a patch based on the second approach, but eventually
realized that it doesn't quite close the gap. The problem is that there
are quite a few places that "know" that RelabelType is a run-time no-op
and feel free to ignore it. We could possibly hack each one to check
whether the result type is composite or not, but that seems unlikely to
provide a trustworthy fix. I now think that we don't really have any
choice but to go with the "clean" solution and represent rowtype
coercion as a new node type.

As I said above, I intend to treat this as a non-initdb-forcing change,
unless anyone vehemently objects.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-patches(at)postgreSQL(dot)org
Subject: Re: [HACKERS] 8.0.0beta5 FailedAssertion (Crash) when casting composite types
Date: 2004-12-11 23:28:15
Message-ID: 19838.1102807695@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I have applied the attached patch to HEAD in order to fix the problems
discussed in this thread:
http://archives.postgresql.org/pgsql-hackers/2004-12/msg00187.php

regards, tom lane

Attachment Content-Type Size
unknown_filename text/plain 25.4 KB

From: kris(dot)shannon(at)gmail(dot)com
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Postgresql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] 8.0.0beta5 FailedAssertion (Crash) when casting composite types
Date: 2004-12-13 22:25:52
Message-ID: bf38a9f04121314251b56923a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Sat, 11 Dec 2004 18:28:15 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I have applied the attached patch to HEAD in order to fix the problems
> discussed in this thread:
> http://archives.postgresql.org/pgsql-hackers/2004-12/msg00187.php
>
> regards, tom lane
>

Well, I was going to have a stab at it, but after looking at the patch I can see
my knowledge of the internals still leaves a bit to be desired (make that a LOT)

Thanks for this anyway. You just simplified my coding immensely

--
Kris Shannon <kris(dot)shannon(at)gmail(dot)com>