Re: contrib/rtree_gist into core system?

Lists: pgsql-hackers
From: "John Hansen" <john(at)geeknet(dot)com(dot)au>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
Cc: "Greg Stark" <gsstark(at)mit(dot)edu>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: contrib/rtree_gist into core system?
Date: 2005-06-27 05:16:52
Message-ID: 5066E5A966339E42AA04BA10BA706AE50A9379@rodrick.geeknet.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane Wrote:

> ... but rtree has always
> been marginal, and it's very hard to see where it can win over gist.

Simplicity!

Implementing rtree operators and support functions is FAR simpler than
implementing the GiST equivalents.

For example, suppose all you want to implement is the ~ operator for a
custom type, then technically all you need is 4 functions (well, 5
including the stub operators)

bool contains(type,type);
type intersect(type,type);
type union(type,type);
void size(type,*float);

And the 6 other operators simply defined as:
bool false(type) { return false; }

For GiST you still need 7 support functions + the operator function,
some of which aren't exactly simple to implement, the picksplit for
instance.

So I'd not recommend getting rid of rtree just yet. At least not until
someone has written an extensive howto on the subject of GiST
implementation.

... John


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "John Hansen" <john(at)geeknet(dot)com(dot)au>
Cc: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>, "Greg Stark" <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: contrib/rtree_gist into core system?
Date: 2005-06-27 05:51:00
Message-ID: 27955.1119851460@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"John Hansen" <john(at)geeknet(dot)com(dot)au> writes:
> Tom Lane Wrote:
>> ... but rtree has always
>> been marginal, and it's very hard to see where it can win over gist.

> Simplicity!
> Implementing rtree operators and support functions is FAR simpler than
> implementing the GiST equivalents.

Mmm ... not really. It does seem that we could offer some sort of
generic set of gist support routines that understand rtree-like
semantics (all the picksplit routines seem suspiciously similar,
for instance). But seeing that every known instance of rtree support
has been broken since day one, partly for lack of any documentation
about what was needed, it's a bit hard to claim that implementing rtree
is transparently trivial.

> So I'd not recommend getting rid of rtree just yet. At least not until
> someone has written an extensive howto on the subject of GiST
> implementation.

There's no HOWTO for rtree either. Again, my point is not that one
couldn't be written; it's that we would probably be better off spending
the effort on a HOWTO for gist.

regards, tom lane


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "John Hansen" <john(at)geeknet(dot)com(dot)au>, "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>, "Greg Stark" <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: contrib/rtree_gist into core system?
Date: 2005-06-27 06:47:40
Message-ID: 87slz41ek3.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> "John Hansen" <john(at)geeknet(dot)com(dot)au> writes:
>
> > Simplicity!
> > Implementing rtree operators and support functions is FAR simpler than
> > implementing the GiST equivalents.
>
> Mmm ... not really. It does seem that we could offer some sort of
> generic set of gist support routines that understand rtree-like
> semantics (all the picksplit routines seem suspiciously similar,
> for instance).

I think the picksplit part of the API is the strangest part. Normally when you
design data types you think in terms of your data type and the operations on
it. You shouldn't suddenly have to immerse yourself in some nitty gritty ADTs
for index pages.

I believe all the picksplit functions are based on (apparently via copy/paste)
a single algorithm that depends on a single operator: a kind of "distance"
function. Usually it's the same function underlying the penalty gist api
function. That's a natural operator to implement for data types and one that
doesn't involve learning about any extraneous implementation details of gist
indexing.

If gist index operator classes could be activated based entirely on data type
operators like "distance" and "union" and "penalty" instead of this strange
"picksplit" function then it would make implementing them *much* easier.

Moreover it would solve the multicolumn index issue. There are a probably
other options but the simplest would be to simply take the inner product of
the results of the various distance functions.

--
greg


From: Andrew - Supernews <andrew+nonews(at)supernews(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: contrib/rtree_gist into core system?
Date: 2005-06-27 07:48:49
Message-ID: slrndbvbr1.192v.andrew+nonews@trinity.supernews.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2005-06-27, Greg Stark <gsstark(at)mit(dot)edu> wrote:
> I believe all the picksplit functions are based on (apparently via
> copy/paste) a single algorithm that depends on a single operator: a kind
> of "distance" function. Usually it's the same function underlying the
> penalty gist api function.

That's not quite true. There are at least two quite different picksplit
algorithms in those of the contrib/* modules that I've studied, and in
general I do not think it is possible to provide a single generic
picksplit that will work efficiently for _all_ data types. (And it is of
course important not to constrain the types of data that are allowed...)

It might be reasonable to implement a "default" picksplit based on a
user-supplied metric function (_not_ the same metric as "penalty"). But
I think there always needs to be scope for the user to provide their own
split function.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, John Hansen <john(at)geeknet(dot)com(dot)au>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: contrib/rtree_gist into core system?
Date: 2005-06-27 08:47:49
Message-ID: 42BFBD35.90703@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I believe all the picksplit functions are based on (apparently via copy/paste)
> a single algorithm that depends on a single operator: a kind of "distance"
> function. Usually it's the same function underlying the penalty gist api

You are wrong, at least now in contrib it used three basic picksplit algoritm
1 simple sorting for ordered domain( btree_gist, ltree )
2 several variations of Guttmans algorithm (tsearch2, intarray, seg, cube)
3 linear picksplit for rtree_gist
(http://www.sai.msu.su/~megera/postgres/gist/papers/nsplitLN.ps.gz).

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: John Hansen <john(at)geeknet(dot)com(dot)au>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: contrib/rtree_gist into core system?
Date: 2005-06-27 08:49:20
Message-ID: 42BFBD90.1010706@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

FYI, compress and decompress methods may be trivial.

>
> For GiST you still need 7 support functions + the operator function,
> some of which aren't exactly simple to implement, the picksplit for
> instance.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/