Re: [PATCH] Support for foreign keys with arrays

Lists: pgsql-hackers
From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <simon(at)2ndQuadrant(dot)com>,<misa(dot)simic(at)gmail(dot)com>
Cc: <gabriele(dot)bartolini(at)2ndquadrant(dot)it>, <marco(dot)nenciarini(at)2ndquadrant(dot)it>,<pgsql(at)j-davis(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Support for foreign keys with arrays
Date: 2012-06-17 14:58:17
Message-ID: 4FDDAA39020000250004858F@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs wrote:
Misa Simic wrote:

>> IMO, both approaches make sense...
>
> Agreed.

Can someone provide a practical example of a "foreign key with array"
use case? The only situations I'm able to think of right now are the
same cases where you would now use a table with primary keys of two
tables to provide a many-to-many linkage. Does this proposed feature
handle other cases or handle this type of case better?

The "referencing value is contained by the referenced value" has many
obvious uses. For example, in our courts data we have a statute
table which effectively has a statute cite and effective date range
for the primary key, and we have a charge table with a statute cite
and an offense date used to match it to a statute row.

-Kevin


From: Misa Simic <misa(dot)simic(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: simon(at)2ndquadrant(dot)com, gabriele(dot)bartolini(at)2ndquadrant(dot)it, marco(dot)nenciarini(at)2ndquadrant(dot)it, pgsql(at)j-davis(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Support for foreign keys with arrays
Date: 2012-06-17 15:25:16
Message-ID: CAH3i69kQ3vi0EiP32Eid-7V3r2HTW4OtekNjzsDRPE4MBR2v7w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/6/17 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>

>
>
> Can someone provide a practical example of a "foreign key with array"
> use case? The only situations I'm able to think of right now are the
> same cases where you would now use a table with primary keys of two
> tables to provide a many-to-many linkage. Does this proposed feature
> handle other cases or handle this type of case better?
>

I can't imagine either other usablity... Just many-to-one linkage... or to
have many-to-many link with less rows in middle table... What is better - I
think should be measured...


From: Gianni Ciolli <gianni(dot)ciolli(at)2ndquadrant(dot)it>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: simon(at)2ndQuadrant(dot)com, misa(dot)simic(at)gmail(dot)com, gabriele(dot)bartolini(at)2ndquadrant(dot)it, marco(dot)nenciarini(at)2ndquadrant(dot)it, pgsql(at)j-davis(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Support for foreign keys with arrays
Date: 2012-06-17 16:02:53
Message-ID: 20120617160253.GA26190@leggeri.gi.lan
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jun 17, 2012 at 09:58:17AM -0500, Kevin Grittner wrote:
> Simon Riggs wrote:
> Misa Simic wrote:
>
> >> IMO, both approaches make sense...
> >
> > Agreed.
>
> Can someone provide a practical example of a "foreign key with array"
> use case? The only situations I'm able to think of right now are the
> same cases where you would now use a table with primary keys of two
> tables to provide a many-to-many linkage. Does this proposed feature
> handle other cases or handle this type of case better?

The way I think about "array foreign keys" is that they represent the
"aggregated" form of a classical foreign key.

In the aggregated form, each row in the referencing side represents a
group of rows in the non-aggregated form.

One advantage is that constraints on each group of rows as a whole are
now possible, because they become constraints on a single row in the
aggregated form.

Example. If you have a table of points, then you can have a table of
polygons where each polygon contains an array of points. The
non-aggregated model would instead require an additional point_polygon
table which references both the point and the polygon table, because
the point <-> polygon relationship is many-to-many. In the aggregated
model, you can easily specify a CHECK constraint that requires each
polygon to have at least three points, while the corresponding
condition cannot be specified in the non-aggregated model.

Cheers,
Dr. Gianni Ciolli - 2ndQuadrant Italia
PostgreSQL Training, Services and Support
gianni(dot)ciolli(at)2ndquadrant(dot)it | www.2ndquadrant.it


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Misa Simic" <misa(dot)simic(at)gmail(dot)com>
Cc: <simon(at)2ndquadrant(dot)com>,<gabriele(dot)bartolini(at)2ndquadrant(dot)it>, <marco(dot)nenciarini(at)2ndquadrant(dot)it>, <pgsql(at)j-davis(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Support for foreign keys with arrays
Date: 2012-06-18 15:27:43
Message-ID: 4FDF029F02000025000485F1@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Misa Simic <misa(dot)simic(at)gmail(dot)com> wrote:
> 2012/6/17 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>

>> Can someone provide a practical example of a "foreign key with
>> array" use case? The only situations I'm able to think of right
>> now are the same cases where you would now use a table with
>> primary keys of two tables to provide a many-to-many linkage.
>> Does this proposed feature handle other cases or handle this type
>> of case better?
>
> I can't imagine either other usablity... Just many-to-one
> linkage... or to have many-to-many link with less rows in middle
> table...

The many-to-one case seems like it is better handled in the other
direction -- with the referenced table holding the set of valid keys
and the referencing table holding the single key. (I believe the
general case of this is what Jeff called an "inclusion constraint"
-- a feature he wants to add at some point.) I can't think of a use
case where that would not be better for this type of relationship
than putting the array on the referencing side.

The many-to-many relationship does seem like a potentially useful
feature, at least in some cases. For example, a message board where
a limited number of tags can be attached to each message -- the
message could contain an array of tag IDs. Clearly this could be
done as a table holding message_id and tag_id, but it seems
potentially more convenient from a programming perspective to have
an array of tag_id values in the message table. Logically, you are
associating each of these with the primary key of the row it is in.
Are there other obvious use-cases? If nobody can put one forward,
this seems like a good "reality test" for proposed behaviors in any
corner cases where "correct" behavior isn't obvious -- what would
you want to do to the data if it were in the separate table with
just the primary keys to link the two tables?

> What is better - I think should be measured...

It would make an interesting test to compare performance of a
suitably sized data set and reasonable workload for both techniques.
Of course, that's a non-trivial amount of work. It seems like
people may be able to justify this just on ease-of-use, versus
claiming that it's an optimization.

-Kevin


From: Misa Simic <misa(dot)simic(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: simon(at)2ndquadrant(dot)com, gabriele(dot)bartolini(at)2ndquadrant(dot)it, marco(dot)nenciarini(at)2ndquadrant(dot)it, pgsql(at)j-davis(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Support for foreign keys with arrays
Date: 2012-06-19 00:19:28
Message-ID: CAH3i69mHu1dFWkvfTnHOrw8GQOWiC_T+vBCeDX9bzP6B=LFd6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/6/18 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>

> The many-to-one case seems like it is better handled in the other
> direction -- with the referenced table holding the set of valid keys
> and the referencing table holding the single key. (I believe the
> general case of this is what Jeff called an "inclusion constraint"
> -- a feature he wants to add at some point.) I can't think of a use
> case where that would not be better for this type of relationship
> than putting the array on the referencing side.
>
>
Hi Kevin,

Well, from my point of view "many-to-one" or "one-to-many" is more related
from which point we are looking to the thing... But let me better explain
what I thought...

Example 1)

If we have one table with master data TableA(ID, other properties...) and
TableB(tableAIDFK, propA, propB, propC) -PK of TableB is irrelavant in this
point... and let say a lot of TableA tuples could have the same TableB
properties... So we can have how many common TableA tuples, that many
tuples in TableB with the same values in PropA, PropB, and PropC with FK
the same type as in Table A, or to have 1 tuple in TableB with Array type
as FK field.... So it (1 tuple in TableB) can point many tuples in
TableC... And in the same time simple element can exist in TableA, but
could or doesn't have to exist in TableB...

What test would show is there any gain in this approach - I don't know...
but think it should - especially if propA,PropB, and C should be updated
for all of them...

Example 2)
>From other side, what Jeff propose, and what is also usefull, but different
thing is... to have the main data in TableA, but key field is an range
datatype... what later each element what belong to the range, could have
related tuple in TableB (Also, as the same range datatype - but smaller...
contained by Master one... or simple datatype subtype of the range) - which
is other way around... Opposite from exmaple 1 - but differnet from
functional point of view... Depending what is the Master... Also, for
example 1 - data in FK do not need to be in range.. so basicaly ID [1, 2 ,4
,7] could have 1 tuple with its properties, and [3,5,6] in second tuple
with different properties...

I am not sure Example 2) Jeff called "Inclusion Constraint" - Jeff can
explain it better :)

Based on

Simon Riggs wrote:
>> Do we need something like Exclusion FKs? i.e. the FK partner of
>> Exclusion Constraints?

>Yes, "Inclusion Constraints". I've known we need something like that
>since I did Exclusion Constraints, but I haven't gotten further than
>that.

>Regards,
> Jeff Davis

I have understood it as:

TableA(ID, properties...)
TableB(ID, properties...)

Now if we define FK on TableB to TableA... It means that row inserted in
TableB, must have already row with the same ID value in TableA...

But what would be usefull, to define Exclude FK to table A, to we prevent
insert new row in Table B with ID value what already exist in TableA...

btw, if anyone is happy to point me in right direction, and there is common
feeling it is usefull feature, I am happy to code it... Actually that is
something what I will code anyway for "in-house" solution - but would be
good to do it under Postgres standards...

Kind Regards,

Misa