Re: knngist patch support

Lists: pgsql-hackers
From: "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: knngist patch support
Date: 2010-02-11 00:49:59
Message-ID: 738927441002101649j72418be4y3a35e1e351f04b0b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I noticed this morning that the k nearest neighbor gist patch
https://commitfest.postgresql.org/action/patch_view?id=230 was still being
considered for inclusion in 9. Sadly, this feature appears to have been
dropped from 9.

It seems to me that the functionality this brings is one of the most common
use cases for mobile applications (or even web in some cases) that are
location enabled. How many times do we find ourselves "Finding the 10
nearest restaurants" in our smart phones? Well, this patch solves exactly
that in the most efficient manner.

Granted, one can currently solve this problem with PostgreSQL/PostGIS as it
stands, however the performance improvements that one can gain for those
types of (extremely common) queries leveraging the Gist enhancements are,
well, exciting.

300x time improvements? Sounds great to me.

I would love if you guys could consider applying this patch for this
upcoming release.

Best Regards,

- Ragi Burhum


From: tomas(at)tuxteam(dot)de
To: "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: knngist patch support
Date: 2010-02-11 05:31:20
Message-ID: 20100211053120.GA25763@tomas
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Feb 10, 2010 at 04:49:59PM -0800, Ragi Y. Burhum wrote:
> Hello,
>
> I noticed this morning that the k nearest neighbor gist patch
> https://commitfest.postgresql.org/action/patch_view?id=230 was still being
> considered for inclusion in 9. Sadly, this feature appears to have been
> dropped from 9.

This has been discussed recently on this list. Seems the patch would
need more review to be considered stable. So it's the hard choice of
letting the schedule for 9.0 slip or not letting this patch in.

But some prerequisites will go in, that's the good news.

(BTW: I tried to find this discussion in the Web archives, but had no
luck. It's in my mailbox, though --

e.g.

message-ID 603c8f071002070527j1dada7cdseb42e7cbc71bf71a(at)mail(dot)gmail(dot)com

part of the long thread "Damage control mode", starting on Jan 8, 2010;
this one mail is from Feb 7 -- but that might be me)

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFLc5YoBcgs9XrR2kYRAoW3AJ94tYWPenLOjH4B4GHD9DCYSSWYOQCeOcoM
RYDhINv+k9YeD23xFHyj9yw=
=K1E0
-----END PGP SIGNATURE-----


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 06:16:09
Message-ID: Pine.LNX.4.64.1002110835430.16860@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

This is very disgraceful from my point of view and reflects real problem
in scheduling of CF. The patch was submitted Nov 23 2009, discussed and
reworked Nov 25. Long holidays in December-January, probably are reason why
there were no any movement on reviewing the patch. People with
inspiration spent time to discuss rbtree, while it was clear, that rbtree is
a minor issue. Now we have no review and great feature is missing. I understand
that some healthy resistance is useful to let developers more accurate and
discipline, but, hey, not dropping great feature ! I'd understand if developer
is missing, or just not willing to contact, but I and Teodor are here and
we readily answer any questions.

I failed to find any documents about commitfest to understand if we already
discussed all possible scenario of feature drop. If we say 'A', when started
to formalize our development process instead of old way discuss&vote
in -hackers, then we should say 'B' - formalize procedure and possible
collision of interests.

Oleg
On Thu, 11 Feb 2010, tomas(at)tuxteam(dot)de wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Wed, Feb 10, 2010 at 04:49:59PM -0800, Ragi Y. Burhum wrote:
>> Hello,
>>
>> I noticed this morning that the k nearest neighbor gist patch
>> https://commitfest.postgresql.org/action/patch_view?id=230 was still being
>> considered for inclusion in 9. Sadly, this feature appears to have been
>> dropped from 9.
>
> This has been discussed recently on this list. Seems the patch would
> need more review to be considered stable. So it's the hard choice of
> letting the schedule for 9.0 slip or not letting this patch in.
>
> But some prerequisites will go in, that's the good news.
>
> (BTW: I tried to find this discussion in the Web archives, but had no
> luck. It's in my mailbox, though --
>
> e.g.
>
> message-ID 603c8f071002070527j1dada7cdseb42e7cbc71bf71a(at)mail(dot)gmail(dot)com
>
> part of the long thread "Damage control mode", starting on Jan 8, 2010;
> this one mail is from Feb 7 -- but that might be me)
>
> Regards
> - -- tomЪЪs
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
>
> iD8DBQFLc5YoBcgs9XrR2kYRAoW3AJ94tYWPenLOjH4B4GHD9DCYSSWYOQCeOcoM
> RYDhINv+k9YeD23xFHyj9yw=
> =K1E0
> -----END PGP SIGNATURE-----
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 06:49:56
Message-ID: 3698.1265870996@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
> This is very disgraceful from my point of view and reflects real problem
> in scheduling of CF. The patch was submitted Nov 23 2009, discussed and
> reworked Nov 25. Long holidays in December-January, probably are reason why
> there were no any movement on reviewing the patch.

There was a scheduling problem all right, which was that this patch *did
not make* the deadline for the November CF. The fact that it got any
review at all in November was more than expected under the CF process.
And I remind you that we stated more than once that we didn't want major
feature patches to show up only at the last CF. If it had come from
anyone other than you and Teodor, there would have not been even a
moment's consideration of letting it into 9.0.

My own feeling about it is that I much preferred the original proposal
of a contrib module with little or no change to core code. I don't want
to be changing core code for this at this late hour. If it were only
touching GIST I'd be willing to rely on your and Teodor's expertise in
that module, but it's not. It whacks around the planner, it makes
questionable changes in the operator class structure, and the last
version I saw hadn't any documentation whatever. It's not committable
on documentation grounds alone, even if everybody was satisfied about
the code.

How do you feel about going back to the original contrib module for now
and resubmitting the builtin version for 9.1?

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 07:28:25
Message-ID: 603c8f071002102328p187fb559v26eb4f5c99f79aa4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/2/11 Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>:
> This is very disgraceful from my point of view and reflects real problem
> in scheduling of CF. The patch was submitted Nov 23 2009, discussed and
> reworked Nov 25. Long holidays in December-January, probably are reason why
> there were no any movement on reviewing the patch. People with

So... I think the reason why there was no movement between November
25th and January 15th is because no CommitFest started between
November 25th and January 15th. Had you submitted the patch on
November 14th, you would have gotten a lot more feedback in November;
I agree that we don't have a lot of formal documentation about the
CommitFest process, but I would think that much would be pretty clear,
but maybe not. The reason there was no movement after January 15th is
because (1) I couldn't get anyone to volunteer to review it, except
Mark Cave-Ayland who didn't actually do so (or anyway didn't post
anything publicly), and (2) we were still working on rbtree.

Personally, I am a little irritated about the whole way this situation
has unfolded. I devoted a substantial amount of time over my
Christmas vacation to patch review, and many of those patches went on
to be committed. Some of the patches I reviewed were yours. I did
not get paid one dime for any of that work. I expressed candidly,
from the very beginning, that getting such a large patch done by the
end of this CommitFest would likely be difficult, especially given
that it had two precursor patches. In exchange for giving you my
honest opinions about your patches two weeks before the scheduled
start of the CommitFest, over my Christmas vacation, and for free, I
got a long stream of complaints from you and others about how the
process is unfair, and as nearly zero help making the prerequisite
patches committable as it is possible for anyone to achieve. It
regularly took 4-6 days for a new version of the patch to appear, and
as often as not questions in my reviews were ignored for days, if not
weeks. It took a LOT of iterations before my performance concerns
were addressed; and I believe that process could have been done MUCH
more quickly.

Now, it is possible that as you are sitting there reading this email,
you are thinking to yourself "well, your feedback didn't actually make
that patch any better, so this whole thing is just pure
obstructionism." I don't believe that's the case, but obviously I'm
biased and everyone is entitled to their own opinion. What I can tell
you for sure is that all of my reviewing was done with the best of
motivations and in a sincere attempt to do the right thing.

You may be right that January 15th was a bad time to start a
CommitFest, although it's very unclear to me why that might be. At
least in the US, the holidays are over long before January 15th, but
we had a very small crop of reviewers this time around, and a number
of them failed to review the patches they picked up, or did only a
very cursory review. It might be mentioned that if you have concerns
about getting your own patches reviewed, you might want to think about
reviewing some patches by other people. Of the 60 patches currently
in the 2010-01 CommitFest, I'm listed as a reviewer on 12 of them.
Needless to say, if someone else had volunteered to do some or all of
the review work on some of those patches, I would have had more time
to work on other patches.

...Robert


From: "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 07:45:37
Message-ID: 738927441002102345v28baa7f2t2d83c4d996527851@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I have to say that as a 3rd party observer it is quite obvious to
understand why the PostgreSQL software is so good - people are very
passionate about the work they are doing. However, in this instance,
as a by-stander, it seems that there is a lot of energy being spent on
pointing fingers. At the end, the only people that loose are users
like me who would love to have a feature like this since it would
literally make one of the most common types of spatial queries, for
lack of better wording, ridiculously fast. I sincerely apologize if I
triggered any kind of trouble by asking a questions about this
feature.

- Ragi

On Wed, Feb 10, 2010 at 11:28 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> 2010/2/11 Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>:
>> This is very disgraceful from my point of view and reflects real problem
>> in scheduling of CF. The patch was submitted Nov 23 2009, discussed and
>> reworked Nov 25. Long holidays in December-January, probably are reason why
>> there were no any movement on reviewing the patch. People with
>
> So...  I think the reason why there was no movement between November
> 25th and January 15th is because no CommitFest started between
> November 25th and January 15th.  Had you submitted the patch on
> November 14th, you would have gotten a lot more feedback in November;
> I agree that we don't have a lot of formal documentation about the
> CommitFest process, but I would think that much would be pretty clear,
> but maybe not.  The reason there was no movement after January 15th is
> because (1) I couldn't get anyone to volunteer to review it, except
> Mark Cave-Ayland who didn't actually do so (or anyway didn't post
> anything publicly), and (2) we were still working on rbtree.
>
> Personally, I am a little irritated about the whole way this situation
> has unfolded.  I devoted a substantial amount of time over my
> Christmas vacation to patch review, and many of those patches went on
> to be committed.  Some of the patches I reviewed were yours.  I did
> not get paid one dime for any of that work.  I expressed candidly,
> from the very beginning, that getting such a large patch done by the
> end of this CommitFest would likely be difficult, especially given
> that it had two precursor patches.  In exchange for giving you my
> honest opinions about your patches two weeks before the scheduled
> start of the CommitFest, over my Christmas vacation, and for free, I
> got a long stream of complaints from you and others about how the
> process is unfair, and as nearly zero help making the prerequisite
> patches committable as it is possible for anyone to achieve.  It
> regularly took 4-6 days for a new version of the patch to appear, and
> as often as not questions in my reviews were ignored for days, if not
> weeks.  It took a LOT of iterations before my performance concerns
> were addressed; and I believe that process could have been done MUCH
> more quickly.
>
> Now, it is possible that as you are sitting there reading this email,
> you are thinking to yourself "well, your feedback didn't actually make
> that patch any better, so this whole thing is just pure
> obstructionism."  I don't believe that's the case, but obviously I'm
> biased and everyone is entitled to their own opinion.  What I can tell
> you for sure is that all of my reviewing was done with the best of
> motivations and in a sincere attempt to do the right thing.
>
> You may be right that January 15th was a bad time to start a
> CommitFest, although it's very unclear to me why that might be.  At
> least in the US, the holidays are over long before January 15th, but
> we had a very small crop of reviewers this time around, and a number
> of them failed to review the patches they picked up, or did only a
> very cursory review.  It might be mentioned that if you have concerns
> about getting your own patches reviewed, you might want to think about
> reviewing some patches by other people.  Of the 60 patches currently
> in the 2010-01 CommitFest, I'm listed as a reviewer on 12 of them.
> Needless to say, if someone else had volunteered to do some or all of
> the review work on some of those patches, I would have had more time
> to work on other patches.
>
> ...Robert
>


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 08:00:20
Message-ID: Pine.LNX.4.64.1002111015290.16860@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Feb 2010, Tom Lane wrote:

> Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
>> This is very disgraceful from my point of view and reflects real problem
>> in scheduling of CF. The patch was submitted Nov 23 2009, discussed and
>> reworked Nov 25. Long holidays in December-January, probably are reason why
>> there were no any movement on reviewing the patch.
>
> There was a scheduling problem all right, which was that this patch *did
> not make* the deadline for the November CF. The fact that it got any
> review at all in November was more than expected under the CF process.
> And I remind you that we stated more than once that we didn't want major
> feature patches to show up only at the last CF. If it had come from
> anyone other than you and Teodor, there would have not been even a
> moment's consideration of letting it into 9.0.

there were several long threads, which I have no possibility to follow,
so we relied on the wisdom of people, who can discuss. So, it's true we
didn't track all nuances of our development schedule. We just developed.
Looked on commitfest page we didn't find any summary and it's hard to
understand what is the final word.

In the old-good time we also discussed
a lot, we release faster and we always had tolerance of time, since many
things were not formalized, we were developers and reviewed each other.
Now, the whole process of development redesigned to be more enterprize,
but we still have problem with resources - developers, reviewers. And I don't
see, how all changes try to solve this problem. We have problem with long
release cycle, it's getting worse and worse, in spite of CF. The main problem
is not in scheduling - we have little delta here, the problem is in human
resources and unclear regulations make it worse.

>
> My own feeling about it is that I much preferred the original proposal
> of a contrib module with little or no change to core code. I don't want
> to be changing core code for this at this late hour. If it were only
> touching GIST I'd be willing to rely on your and Teodor's expertise in
> that module, but it's not. It whacks around the planner, it makes
> questionable changes in the operator class structure, and the last

aha, we originally submit contrib module, which didn't touch anything you
mentioned, we improve stuff to follow discussion and now we are out of luck %(

> version I saw hadn't any documentation whatever. It's not committable
> on documentation grounds alone, even if everybody was satisfied about
> the code.

well, there is enough documentation to review patch.
In my understanding this was always enough to submit code.
User's documentation is depend on discussion and review and can be added later
before releasing beta.

>
> How do you feel about going back to the original contrib module for now
> and resubmitting the builtin version for 9.1?

Hmm, one good thing is that rbtree seems ok for submisson. We need to discuss
this, if it's good for PostGIS community. I'd not complain about this decision
if it touch my interests only, I could live with closed-source patch.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 08:38:01
Message-ID: Pine.LNX.4.64.1002111101240.16860@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Feb 2010, Robert Haas wrote:

> 2010/2/11 Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>:
>> This is very disgraceful from my point of view and reflects real problem
>> in scheduling of CF. The patch was submitted Nov 23 2009, discussed and
>> reworked Nov 25. Long holidays in December-January, probably are reason why
>> there were no any movement on reviewing the patch. People with
>
> So... I think the reason why there was no movement between November
> 25th and January 15th is because no CommitFest started between
> November 25th and January 15th. Had you submitted the patch on
> November 14th, you would have gotten a lot more feedback in November;
> I agree that we don't have a lot of formal documentation about the
> CommitFest process, but I would think that much would be pretty clear,
> but maybe not. The reason there was no movement after January 15th is
> because (1) I couldn't get anyone to volunteer to review it, except
> Mark Cave-Ayland who didn't actually do so (or anyway didn't post
> anything publicly), and (2) we were still working on rbtree.
>
> Personally, I am a little irritated about the whole way this situation
> has unfolded. I devoted a substantial amount of time over my

Robert, please accept my public apology, if you feel I offense you. There are
nothing against you. Your contribution is very important and I really don't
understand why on the Earth you're not paid ! I remember discussion
to paid you from our foundation. That's shame.
Does nybody ever got support for development from our foundation ?

> Christmas vacation to patch review, and many of those patches went on
> to be committed. Some of the patches I reviewed were yours. I did
> not get paid one dime for any of that work. I expressed candidly,
> from the very beginning, that getting such a large patch done by the
> end of this CommitFest would likely be difficult, especially given
> that it had two precursor patches. In exchange for giving you my
> honest opinions about your patches two weeks before the scheduled
> start of the CommitFest, over my Christmas vacation, and for free, I
> got a long stream of complaints from you and others about how the
> process is unfair, and as nearly zero help making the prerequisite
> patches committable as it is possible for anyone to achieve. It
> regularly took 4-6 days for a new version of the patch to appear, and
> as often as not questions in my reviews were ignored for days, if not
> weeks. It took a LOT of iterations before my performance concerns
> were addressed; and I believe that process could have been done MUCH
> more quickly.

Robert, it's very hard to marshal all developers, who are not-paid people
with their regular duties and problems and their own interests in postgres.
You just discovered we have long-long
holidays in Russia, when people try to spend somewhere. I always beaten with
Christmas in December, when I tried to communicate with business people un US.
Earlier, we lived with this and our releases were faster. I'd not say, CF is
a step back, but our system should have tolerance in time if we're
open-source community, or go enterprize way - we are all paid, we follow
business plan, ... etc. Something is really wrong, that's what I can say.

>
> Now, it is possible that as you are sitting there reading this email,
> you are thinking to yourself "well, your feedback didn't actually make
> that patch any better, so this whole thing is just pure
> obstructionism." I don't believe that's the case, but obviously I'm
> biased and everyone is entitled to their own opinion. What I can tell
> you for sure is that all of my reviewing was done with the best of
> motivations and in a sincere attempt to do the right thing.
>
> You may be right that January 15th was a bad time to start a
> CommitFest, although it's very unclear to me why that might be. At
> least in the US, the holidays are over long before January 15th, but
> we had a very small crop of reviewers this time around, and a number
> of them failed to review the patches they picked up, or did only a
> very cursory review. It might be mentioned that if you have concerns
> about getting your own patches reviewed, you might want to think about
> reviewing some patches by other people. Of the 60 patches currently
> in the 2010-01 CommitFest, I'm listed as a reviewer on 12 of them.
> Needless to say, if someone else had volunteered to do some or all of
> the review work on some of those patches, I would have had more time
> to work on other patches.

Robert, human resources are the main problem and, first of all,
our system should work for developers ! If we will not understand each other
and follow only some unclear rules, we'll lost current developers and will
not attract new. We, probably, in our particulary case, will follow our
original suggestion -just contrib module, but I concern about future. Now I
have to think not just about algorithms and implementation, but about
reviewer and current regulation.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 13:01:25
Message-ID: 603c8f071002110501m4b7e692asa8cc9849160552b5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 11, 2010 at 3:38 AM, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
> Robert, please accept my public apology, if you feel I offense you. There
> are
> nothing against you. Your contribution is very important and I really don't
> understand why on the Earth you're not paid ! I remember discussion to paid
> you from our foundation.  That's shame. Does nybody ever got support for
> development from our foundation ?

No, I don't feel like you offended me. It's more that, from my point
of view, it seems like all the things you're complaining about are
things that you more or less have control over, or at least could have
foreseen. I have only been involved in this project for a year and a
half, so the CommitFest process is the only process that I know or
understand. On the whole, I've found it to be a pretty good process.
I get my patches in; I help other people get their patches in (and
hopefully improve them along the way). It's particularly appealing
when you're a non-committer, as it gives you a formal structure to
make sure your work gets looked at.

It seems that you're sort of frustrated with the system and the need
to go through a process before committing a patch; and that you feel
that the rules are unclear. I don't think it's a bad thing to go
through a process before committing a patch, especially a large patch
like knngist, but of course that's just my opinion. I agree that the
fact that the rules are unclear is a problem, though I'm not sure what
to do about it. I am not sure they are so unclear as you are making
them out to be, but again, I'm biased by being a relative newcomer, as
well as someone who has been in the middle of many of the process
discussions.

> Robert, human resources are the main problem and, first of all,
> our system should work for developers ! If we will not understand each other
> and follow only some unclear rules, we'll lost current developers and will
> not attract new. We, probably, in our particulary case, will follow our
> original suggestion -just contrib module, but I concern about future. Now I
> have to think not just about algorithms and implementation, but about
> reviewer and current regulation.

IMHO, our system has to work for both developers and users, and it has
to work for both committers and non-committers.

...Robert


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 13:18:23
Message-ID: 603c8f071002110518l7254179av209cd3fb9889d03d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 11, 2010 at 3:00 AM, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
>> version I saw hadn't any documentation whatever.  It's not committable
>> on documentation grounds alone, even if everybody was satisfied about
>> the code.
>
> well, there is enough documentation to review patch.

Where is there any documentation at all? There are no changes to doc/
at all; no README; and not even a lengthy comment block anywhere that
I saw. Nor did the email in which the patch was submitted clearly lay
out the design of the feature.

> In my understanding
> this was always enough to submit code. User's documentation is depend on
> discussion and review and can be added later
> before releasing beta.

Several people have said this lately, but it doesn't match what I've
seen of our practice over the last year and a half; Tom regularly
boots patches that lack documentation (or necessary regression test
updates). Sure, people often submit small patches without
documentation thinking to fill it in later, but anything major pretty
much has to have it, AFAICS. From my own point of view, I would never
commit anything that lacked documentation, for fear of being asked to
write it myself if the patch author didn't. Of course it's a bit
different for committers, who can presumably be counted on to clean up
their own mess, but I still think it's fair to expect at least some
effort to be put into the docs before commit.

...Robert


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 14:19:14
Message-ID: 87k4ujam8t.fsf@hi-media-techno.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> It seems that you're sort of frustrated with the system and the need
> to go through a process before committing a patch;

I've been handling arround here for years (since 2005 or before) and I
think there always was a process. The only change is it's getting more
and more formal. But still not any clearer.

It's good to try to keep the major releases one year apart, but until
now we had some flexibility towards developpers. They could have their
own agenda then appear with a big patch and it was getting considered.

We never asked contributors to arrange for being able to find a
sponsor, do the closed source version, prepare for publishing, and then
send a patch in a timely maneer so that to ease the integration and
release.

Before there was a Commit Fest process, we took some weeks then months
at the end of the cycle to consider what had been accumulated.

The new process is there for giving more feedback to developpers, and is
being considered now as a way to get better control about the release
agenda. I'm not sure it's a good tool for that. I'm not sure insisting
that much on the release schedule is a good idea.

Once more making compromises is easy. What's hard and challenging is
making *good* compromises.

> IMHO, our system has to work for both developers and users, and it has
> to work for both committers and non-committers.

That's an easy goal to share. The question is how you get there without
losing existing developpers and possibly attracting new developpers on
the way.
--
dim


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 14:51:46
Message-ID: 407d949e1002110651t1e4b06a1tfd625812b83ad37c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Feb 11, 2010 at 1:18 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> In my understanding
>> this was always enough to submit code. User's documentation is depend on
>> discussion and review and can be added later
>> before releasing beta.
>
> Several people have said this lately, but it doesn't match what I've
> seen of our practice over the last year and a half;

Perhaps the confusion is that we often say not to worry about the
quality of the English in the documentation. That's because it's easy
for a reviewer to fix up the English but not so easy to figure out
what you intend the behaviour to be.

--
greg


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 15:18:46
Message-ID: Pine.LNX.4.64.1002111805210.16860@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Feb 2010, Robert Haas wrote:

> On Thu, Feb 11, 2010 at 3:00 AM, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
>>> version I saw hadn't any documentation whatever.  It's not committable
>>> on documentation grounds alone, even if everybody was satisfied about
>>> the code.
>>
>> well, there is enough documentation to review patch.
>
> Where is there any documentation at all? There are no changes to doc/
> at all; no README; and not even a lengthy comment block anywhere that
> I saw. Nor did the email in which the patch was submitted clearly lay
> out the design of the feature.

Well, initial knngist announce
http://archives.postgresql.org/pgsql-hackers/2009-11/msg01547.php
isn't enough to review ? We made test data available to reproduce
results, see http://www.sai.msu.su/~megera/wiki/2009-11-25
We are here and open to any reviewer's question.

>
>> In my understanding
>> this was always enough to submit code. User's documentation is depend on
>> discussion and review and can be added later
>> before releasing beta.
>
> Several people have said this lately, but it doesn't match what I've
> seen of our practice over the last year and a half; Tom regularly
> boots patches that lack documentation (or necessary regression test
> updates). Sure, people often submit small patches without
> documentation thinking to fill it in later, but anything major pretty
> much has to have it, AFAICS. From my own point of view, I would never
> commit anything that lacked documentation, for fear of being asked to
> write it myself if the patch author didn't. Of course it's a bit
> different for committers, who can presumably be counted on to clean up
> their own mess, but I still think it's fair to expect at least some
> effort to be put into the docs before commit.

I think nobody will spend his time to write sgml code for user's
documentation for fear his patch will be rejected/moved/getting rewritten,
so his time will be just wasted.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 15:23:10
Message-ID: Pine.LNX.4.64.1002111819180.16860@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Feb 2010, Greg Stark wrote:

> On Thu, Feb 11, 2010 at 1:18 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>>> In my understanding
>>> this was always enough to submit code. User's documentation is depend on
>>> discussion and review and can be added later
>>> before releasing beta.
>>
>> Several people have said this lately, but it doesn't match what I've
>> seen of our practice over the last year and a half;
>
> Perhaps the confusion is that we often say not to worry about the
> quality of the English in the documentation. That's because it's easy
> for a reviewer to fix up the English but not so easy to figure out
> what you intend the behaviour to be.

English + SGML stuff. We usually provide information in plain text, posted
in -hackers and published in my wiki. I don't remember, that there were
no information about patches.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 16:37:56
Message-ID: Pine.LNX.4.64.1002111924440.16860@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 11 Feb 2010, Oleg Bartunov wrote:

> On Thu, 11 Feb 2010, Tom Lane wrote:
>>
>> My own feeling about it is that I much preferred the original proposal
>> of a contrib module with little or no change to core code. I don't want
>> to be changing core code for this at this late hour. If it were only
>> touching GIST I'd be willing to rely on your and Teodor's expertise in
>> that module, but it's not. It whacks around the planner, it makes
>> questionable changes in the operator class structure, and the last

We splitted patch to make review easy, probably by several reviewers,
since we touched several subsystems.
http://archives.postgresql.org/message-id/4B4CCB9F.8080708@sigaev.ru

Patch for planner is 5600 bytes long, not so big.

>
> aha, we originally submit contrib module, which didn't touch anything you
> mentioned, we improve stuff to follow discussion and now we are out of luck
> %(
>
>> version I saw hadn't any documentation whatever. It's not committable
>> on documentation grounds alone, even if everybody was satisfied about
>> the code.
>
> well, there is enough documentation to review patch. In my understanding this
> was always enough to submit code. User's documentation is depend on
> discussion and review and can be added later
> before releasing beta.
>
>>
>> How do you feel about going back to the original contrib module for now
>> and resubmitting the builtin version for 9.1?
>
> Hmm, one good thing is that rbtree seems ok for submisson. We need to discuss
> this, if it's good for PostGIS community. I'd not complain about this
> decision
> if it touch my interests only, I could live with closed-source patch.

Contrib module is a big step backward and will produce compatibility problem,
since we'll have to use awkward operation ><, which works different
with/without index. Also, it'd be very difficult to add support to other
contrib modules (btree_gist, pg_trgm).

Links:
Heikki complaint - http://archives.postgresql.org/message-id/4B0B8C30.2080400@enterprisedb.com
Simon - http://archives.postgresql.org/message-id/1259115190.27757.11194.camel@ebony

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: tomas(at)tuxteam(dot)de
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-11 21:39:25
Message-ID: 20100211213925.GA17972@tomas
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thu, Feb 11, 2010 at 03:19:14PM +0100, Dimitri Fontaine wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > It seems that you're sort of frustrated with the system and the need
> > to go through a process before committing a patch;
>
> I've been handling arround here for years (since 2005 or before) and I
> think there always was a process. The only change is it's getting more
> and more formal. But still not any clearer.

I didn't want to give the impression that I see the process as too
heavy. Granted, knngist is a cool feature, and I'm a big fan of Oleg and
Teodor (since they pulled the hstore trick :)

Still I think Robert is doing a terrific job. The weight of the big
patches has increased considerably in the last years, and without the
heroic efforts of some people (adn Robert is definitely among them!),
PostgreSQL's release process wouldn't have been as smooth.

> It's good to try to keep the major releases one year apart, but until
> now we had some flexibility towards developpers. They could have their
> own agenda then appear with a big patch and it was getting considered.
>
> We never asked contributors to arrange for being able to find a
> sponsor, do the closed source version, prepare for publishing, and then
> send a patch in a timely maneer so that to ease the integration and
> release.
>
> Before there was a Commit Fest process, we took some weeks then months
> at the end of the cycle to consider what had been accumulated.

Yes. But remember HOT. Things seem smoother nowadays (disclaimer: I'm
just a bystander).

> The new process is there for giving more feedback to developpers, and is
> being considered now as a way to get better control about the release
> agenda. I'm not sure it's a good tool for that. I'm not sure insisting
> that much on the release schedule is a good idea.

What's your proposal then? Allow more slippage in the release schedule?
How much? When?

> Once more making compromises is easy. What's hard and challenging is
> making *good* compromises.

Agreed

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFLdHkNBcgs9XrR2kYRAhSsAJ4n899St+aPYkRA7XBX78vF/xZh9wCfZ3T0
h226KRarnFrEEIcOY+zBG9E=
=X9fa
-----END PGP SIGNATURE-----


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-12 18:14:46
Message-ID: 603c8f071002121014o19bb598bxfb0101e78ef2eb42@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/2/11 Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>:
> On Thu, 11 Feb 2010, Robert Haas wrote:
>
>> On Thu, Feb 11, 2010 at 3:00 AM, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
>>>>
>>>> version I saw hadn't any documentation whatever.  It's not committable
>>>> on documentation grounds alone, even if everybody was satisfied about
>>>> the code.
>>>
>>> well, there is enough documentation to review patch.
>>
>> Where is there any documentation at all?  There are no changes to doc/
>> at all; no README; and not even a lengthy comment block anywhere that
>> I saw.  Nor did the email in which the patch was submitted clearly lay
>> out the design of the feature.
>
> Well, initial knngist announce
> http://archives.postgresql.org/pgsql-hackers/2009-11/msg01547.php
> isn't enough to review ? We made test data available to reproduce results,
> see http://www.sai.msu.su/~megera/wiki/2009-11-25
> We are here and open to any reviewer's question.

Well, just for example, that doesn't document the changes you made to
the opclass infrastructure, and the reasons for those decisions.
Actually, I've been working on an email on that topic but haven't
gotten around to finishing it. There's some description of the
overall goal of the feature but not much about how you got there. Is
it enough to review? Perhaps, but it would certainly be nice to have
some more discussion of the overall design. IMHO, anyway.

...Robert


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-12 20:44:58
Message-ID: Pine.LNX.4.64.1002122319100.16860@sn.sai.msu.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 12 Feb 2010, Robert Haas wrote:

> 2010/2/11 Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>:
> > On Thu, 11 Feb 2010, Robert Haas wrote:
> >
> >> On Thu, Feb 11, 2010 at 3:00 AM, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
> >>>>
> >>>> version I saw hadn't any documentation whatever. =A0It's not committab=
> le
> >>>> on documentation grounds alone, even if everybody was satisfied about
> >>>> the code.
> >>>
> >>> well, there is enough documentation to review patch.
> >>
> >> Where is there any documentation at all? =A0There are no changes to doc/
> >> at all; no README; and not even a lengthy comment block anywhere that
> >> I saw. =A0Nor did the email in which the patch was submitted clearly lay
> >> out the design of the feature.
> >
> > Well, initial knngist announce
> > http://archives.postgresql.org/pgsql-hackers/2009-11/msg01547.php
> > isn't enough to review ? We made test data available to reproduce results=
> ,
> > see http://www.sai.msu.su/~megera/wiki/2009-11-25
> > We are here and open to any reviewer's question.
>
> Well, just for example, that doesn't document the changes you made to
> the opclass infrastructure, and the reasons for those decisions.
> Actually, I've been working on an email on that topic but haven't
> gotten around to finishing it. There's some description of the
> overall goal of the feature but not much about how you got there. Is
> it enough to review? Perhaps, but it would certainly be nice to have
> some more discussion of the overall design. IMHO, anyway.

This is not fair,Robert. Everything was discussed in -hackers.I assume reviewer
should follow discussion at least, he is a member of our community. Mailing
list archive was/is/will our the best knowledge base. For example, regarding
changes in the opclass infrastructure you complain, you can see your reply
to Teodor's message http://archives.postgresql.org/message-id/603c8f070912292255u30e1983bi22ed5778bd2ce890@mail.gmail.com
which contains description of amcanorderbyop flag.

Frankly, I think we see here limit of our resources. Let me explain this.
We splitted patch by several parts - 2 parts are about contrib modules
(rather trivial), 1 - is a gist changes, 1 - planner changes and 1 part -
some proc changes. The most serious are gist and planner patches. We develop
GiST for many years and know almost everything there and could say that we're
responsible for GiST. I don't know if anybody from -hackers could review our
patch for planner better than Tom, but he is busy and will be busy.
So, any serious feature, which touch planner doomed to be rejected because of
lack of reviewer.

We tried to find compromise for 9.0 (Tom suggests contrib module), but all
variants are ugly and bring incompatibility in future. If there are no hackers
willing/capable to review our patches, then, please, help us how to save
neighbourhood search without incompatibility in future.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-12 21:14:51
Message-ID: 603c8f071002121314h2d95b2d1yfea0291e20a0e341@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 12, 2010 at 3:44 PM, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
> This is not fair,Robert. Everything was discussed in -hackers.I assume
> reviewer
> should follow discussion at least, he is a member of our community. Mailing
> list archive was/is/will our the best knowledge base.

Dude, there's no fair or unfair here; I'm just telling you what I
think. There are not six people who follow this mailing list more
closely than I do, and when I started looking at this patch it took me
two hours to figure out why you made those changes to the opclass
machinery. It's not documented anywhere either in the patch or in the
email in which the patch was submitted. That made it hard for me. If
that makes me stupid, then I'm stupid, but then probably there are a
lot of stupid people around here.

> For example, regarding
> changes in the opclass infrastructure you complain, you can see your reply
> to Teodor's message
> http://archives.postgresql.org/message-id/603c8f070912292255u30e1983bi22ed5778bd2ce890@mail.gmail.com
> which contains description of amcanorderbyop flag.

OK, so in one email message that is not the email in which you
submitted the patch there is one sentence of explanation that I failed
to remember six weeks later when I looked at the patch.

> Frankly, I think we see here limit of our resources. Let me explain this.
> We splitted patch by several parts - 2 parts are about contrib modules
> (rather trivial), 1 - is a gist changes, 1 - planner changes and 1 part -
> some proc changes. The most serious are gist and planner patches. We develop
> GiST for many years and know almost everything there and could say that
> we're
> responsible for GiST. I don't know if anybody from -hackers could review our
> patch for planner better than Tom, but he is busy and will be busy.
> So, any serious feature, which touch planner doomed to be rejected because
> of
> lack of reviewer.

I do think resource limitations play a role. For at least as long as
I have been involved in the community, we have relied very heavily on
Tom Lane to review nearly every patch and commit more than half of
them. As our group of developers grows, that is unsustainable, which
I believe to be part of the reason that -core just created four new
committers; as well as part of the reason for the CommitFest process,
which tries to enlist non-committer reviewers. But none of us are Tom
Lane. The part of your patch that took me two hours to figure out
probably would have taken Tom twenty minutes (maybe less). But even
if we assume that I am as smart as Tom Lane and that I will spend as
much time working on PostgreSQL as Tom Lane, both of which may well be
false, I won't know the code base as well as he does now until ~2020.

The only way we can get past that is, first, by splitting up the work
across as many people as possible, and second, by making it as easy as
possible for the reviewer to understand what the code is about. And
at least if the reviewer is me, *documentation helps*.

I'm going to respond to the part of this email that's about moving
this patch forward with a separate email.

...Robert


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-12 21:29:26
Message-ID: 603c8f071002121329k7a10159dnf5250ba29ed58923@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 12, 2010 at 3:44 PM, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> wrote:
> We tried to find compromise for 9.0 (Tom suggests contrib module), but all
> variants are ugly and bring incompatibility in future. If there are no
> hackers
> willing/capable to review our patches, then, please,  help us  how to save
> neighbourhood search without incompatibility in future.

Here's what I've gathered from my read through this patch. Let me
know if it's right:

In CVS HEAD, if an AM is marked amcanorder, that means that the index
defines some kind of intrinsic order over the tuples so that, from a
given starting point, it makes sense to talk about scanning either
forward or backward. GIST indices don't have an intrinsic notion of
ordering, but the structure of the index does lend itself to finding
index tuples that are "nearby" to some specified point. So this patch
wants to introduce the notion of an AM that is marked amcanorderbyop
to accelerate queries that order by <indexed column> <op> <constant>.

To do that, we need some way of recognizing which operators are
candidates for this optimization. This patch implements that by
putting the relevant operators into the operator class. That requires
relaxing the rule that operator class operators must return bool; so
instead when the AM is marked amcanorderbyop we allow the operator
class operators to return any type with a default btree operator
class.

Does that sound right?

Tom remarked in another email that he wasn't too happy with the
opclass changes. They seem kind of grotty to me, too, but I don't
immediately have a better idea. My fear is that there may be places
in the code that rely on opclass operators only ever returning bool,
and that changing that may break things. It also feels like allowing
non-bool-returning opclass members only for this one specific case is
kind of a hack: is this an instance of some more general problem that
we ought to be solving in some more general way? Not sure.

In any case, it seems to me that figuring out how we're going to solve
the problem of marking the operators (or otherwise identifying the
expression trees) that are index-optimizable is the key issue for this
patch. If we can agree on that, the rest should be a SMOP.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 00:30:43
Message-ID: 14480.1266021043@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Tom remarked in another email that he wasn't too happy with the
> opclass changes. They seem kind of grotty to me, too, but I don't
> immediately have a better idea. My fear is that there may be places
> in the code that rely on opclass operators only ever returning bool,
> and that changing that may break things. It also feels like allowing
> non-bool-returning opclass members only for this one specific case is
> kind of a hack: is this an instance of some more general problem that
> we ought to be solving in some more general way? Not sure.

Yes, that's exactly what I didn't like about it: the proposed changes
create confusion between opclass members that represent
index-optimizable WHERE conditions and those that represent
index-optimizable ORDER BY conditions. You can get away with that to
some extent as long as you assume that the latter type of operator
never yields boolean and so can never appear at the top level of
WHERE. But that assumption sucks. There are plenty of cases where
people ORDER BY boolean values, so who's to argue that we will never
want an operator returning boolean in the second category? And as
soon as you put it in, the planner is going to think that it's also
a potential index-qualification operator, which is something the AM
might or might not be prepared to support.

I think this is really unacceptable and there needs to be some cleaner
way of distinguishing the two types of operators. Possibly a couple of
boolean columns added to pg_amop (you'd need two because there are three
possible states, in case an operator really can serve both purposes in a
particular opclass). Or maybe we should do something else. But
ignoring the issue won't do.

Maybe a more general idea would be to invent "categories" of opclass
members, where the only existing category is "index search qualifier",
and these new knngist thingies are another, and maybe plus and minus for
window function ranges are a third. But I'm not sure what you do if one
operator can be in more than one category.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 02:10:19
Message-ID: 603c8f071002121810v73d676cflf2d85d96e6cc198a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 12, 2010 at 7:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Maybe a more general idea would be to invent "categories" of opclass
> members, where the only existing category is "index search qualifier",
> and these new knngist thingies are another, and maybe plus and minus for
> window function ranges are a third.  But I'm not sure what you do if one
> operator can be in more than one category.

Well, if you were willing to change pg_amop so that the key was
(amopfamily, amoplefttype, amoprighttype, amopcategory) rather than
just (amopfamily, amoplefttype, amoprighttype), the issue of what to
do if an operator can be in more than one category becomes moot. You
just specify the operator more than once if need be.

If you don't want the amopcategory to be part of the key, then you
just need to define it as a type that can handle multiple values yet
has a fast membership test. A character string of some type would be
flexible - you could use any single character as a category identifier
- but given that we don't expect many categories and we do want good
performance, it seems like an int4 used as a bitmap field would be
more appropriate.

I think the first approach is better, partly because it seems to lend
itself to a cleaner syntax for CREATE OPERATOR CLASS. Something like:

CREATE OPERATOR CLASS blah blah AS
OPERATOR 3 &&,
OPERATOR ORDER 15 <->;

...Robert


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 02:27:09
Message-ID: 603c8f071002121827n232412acod8319337f647cea5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 12, 2010 at 9:10 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Feb 12, 2010 at 7:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Maybe a more general idea would be to invent "categories" of opclass
>> members, where the only existing category is "index search qualifier",
>> and these new knngist thingies are another, and maybe plus and minus for
>> window function ranges are a third.  But I'm not sure what you do if one
>> operator can be in more than one category.
>
> Well, if you were willing to change pg_amop so that the key was
> (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than
> just (amopfamily, amoplefttype, amoprighttype), the issue of what to
> do if an operator can be in more than one category becomes moot.  You
> just specify the operator more than once if need be.

Except I'm full of it, because amopstrategy is in there too. Hmm.
And that's unfortunate because the syscache machinery is limited to
four columns as lookup keys.

This is a bit ugly, but one idea that occurs to me is to change
amopstrategy from int16 to int32. Internally, we'll treat the low 16
bits as the strategy number and the high 16 bits as the strategy
category, with strategy category 0 being "index search qualifier".

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 02:45:18
Message-ID: 21192.1266029118@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Well, if you were willing to change pg_amop so that the key was
>> (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than
>> just (amopfamily, amoplefttype, amoprighttype), the issue of what to
>> do if an operator can be in more than one category becomes moot. You
>> just specify the operator more than once if need be.

Yeah, that occurred to me too after sending my earlier email.

> Except I'm full of it, because amopstrategy is in there too. Hmm.
> And that's unfortunate because the syscache machinery is limited to
> four columns as lookup keys.

Ugh. Still, we could certainly change the 4-key limit to 5, though it
might be a tad tedious to go round and edit all the SearchSysCache and
related calls. Maybe while we were at it we should change them to
SearchSysCache1, SearchSysCache2, etc to not have the limit hardwired
textually in quite so many places...

> This is a bit ugly, but one idea that occurs to me is to change
> amopstrategy from int16 to int32. Internally, we'll treat the low 16
> bits as the strategy number and the high 16 bits as the strategy
> category, with strategy category 0 being "index search qualifier".

Hm, yeah that would work, but I agree it's ugly.

While thinking about different possible solutions here: one of the
things that was worrying me is that for cases where the same operator
can serve in more than one role, it might have to have either the same
opstrategy or different ones in different roles, depending on how the AM
has assigned strategy numbers. The method with an extra index column
side-steps that nicely since there are two unrelated pg_amop entries.
If there's only one entry then you lose if you need different
strategies. Robert's use-the-high-bits method works too, since there
would still be two separate entries, but some other possible
representations are eliminated by that worry.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 02:57:33
Message-ID: 603c8f071002121857w2fdd2b4cqd841be0a2dd19de5@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 12, 2010 at 9:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> Well, if you were willing to change pg_amop so that the key was
>>> (amopfamily, amoplefttype, amoprighttype, amopcategory) rather than
>>> just (amopfamily, amoplefttype, amoprighttype), the issue of what to
>>> do if an operator can be in more than one category becomes moot.  You
>>> just specify the operator more than once if need be.
>
> Yeah, that occurred to me too after sending my earlier email.
>
>> Except I'm full of it, because amopstrategy is in there too.  Hmm.
>> And that's unfortunate because the syscache machinery is limited to
>> four columns as lookup keys.
>
> Ugh.  Still, we could certainly change the 4-key limit to 5, though it
> might be a tad tedious to go round and edit all the SearchSysCache and
> related calls.  Maybe while we were at it we should change them to
> SearchSysCache1, SearchSysCache2, etc to not have the limit hardwired
> textually in quite so many places...

Maybe. It sounds sort of awful though; and there's probably a
distributed performance penalty involved

>> This is a bit ugly, but one idea that occurs to me is to change
>> amopstrategy from int16 to int32.  Internally, we'll treat the low 16
>> bits as the strategy number and the high 16 bits as the strategy
>> category, with strategy category 0 being "index search qualifier".
>
> Hm, yeah that would work, but I agree it's ugly.

On further review there's a serious problem with this idea:
pg_amop_opr_fam_index.

> While thinking about different possible solutions here: one of the
> things that was worrying me is that for cases where the same operator
> can serve in more than one role, it might have to have either the same
> opstrategy or different ones in different roles, depending on how the AM
> has assigned strategy numbers.  The method with an extra index column
> side-steps that nicely since there are two unrelated pg_amop entries.
> If there's only one entry then you lose if you need different
> strategies.  Robert's use-the-high-bits method works too, since there
> would still be two separate entries, but some other possible
> representations are eliminated by that worry.

OK, here's another idea. Let's just add a new column to pg_amop
called amoporderstrategy. If an operator can only be used for one
purpose or the other, we'll set the other value to -1.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 03:10:30
Message-ID: 21526.1266030630@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> OK, here's another idea. Let's just add a new column to pg_amop
> called amoporderstrategy. If an operator can only be used for one
> purpose or the other, we'll set the other value to -1.

... problem for unique index, no?

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 03:18:38
Message-ID: 603c8f071002121918k68e05676y7b69a12d6d1fe4fd@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 12, 2010 at 10:10 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> OK, here's another idea.  Let's just add a new column to pg_amop
>> called amoporderstrategy.  If an operator can only be used for one
>> purpose or the other, we'll set the other value to -1.
>
> ... problem for unique index, no?

Dang. What a pain in the tail. I guess we could make that column
nullable, but that's got it's own fair share of problems.

Is the only reasonable way to solve this problem a new catalog?
That's not tremendously scalable, but it's starting to feel like the
only way of solving this problem that doesn't involve massive surgery.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 03:38:18
Message-ID: 22821.1266032298@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Feb 12, 2010 at 9:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> This is a bit ugly, but one idea that occurs to me is to change
>>> amopstrategy from int16 to int32. Internally, we'll treat the low 16
>>> bits as the strategy number and the high 16 bits as the strategy
>>> category, with strategy category 0 being "index search qualifier".
>>
>> Hm, yeah that would work, but I agree it's ugly.

> On further review there's a serious problem with this idea:
> pg_amop_opr_fam_index.

I think that's soluble though. The reason that index exists is to
enforce the rule that an operator can stand in only one relationship
to an opfamily. In this design the natural rule would be "one
relationship per role", ie, the unique key would become
(operator, category, opfamily).

However, that does make it even uglier to have category shoehorned in as
part of a different field. Back to wanting 5-key syscaches ...

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, Teodor Sigaev <teodor(at)sigaev(dot)ru>, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 03:39:17
Message-ID: 603c8f071002121939y7315124cp2d2586f39ac15888@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 12, 2010 at 10:38 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Fri, Feb 12, 2010 at 9:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>>> This is a bit ugly, but one idea that occurs to me is to change
>>>> amopstrategy from int16 to int32.  Internally, we'll treat the low 16
>>>> bits as the strategy number and the high 16 bits as the strategy
>>>> category, with strategy category 0 being "index search qualifier".
>>>
>>> Hm, yeah that would work, but I agree it's ugly.
>
>> On further review there's a serious problem with this idea:
>> pg_amop_opr_fam_index.
>
> I think that's soluble though.  The reason that index exists is to
> enforce the rule that an operator can stand in only one relationship
> to an opfamily.  In this design the natural rule would be "one
> relationship per role", ie, the unique key would become
> (operator, category, opfamily).
>
> However, that does make it even uglier to have category shoehorned in as
> part of a different field.  Back to wanting 5-key syscaches ...

Sigh.

...Robert


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 18:13:19
Message-ID: 4B76EBBF.1010201@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> However, that does make it even uglier to have category shoehorned in as
>> part of a different field. Back to wanting 5-key syscaches ...
> Sigh.

I see your point. May be it's better to introduce new system table? pg_amorderop
to store ordering operations for index.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 18:28:48
Message-ID: 5079.1266085728@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> I see your point. May be it's better to introduce new system table? pg_amorderop
> to store ordering operations for index.

We could, but that approach doesn't scale to wanting more categories
in the future --- you're essentially decreeing that every new category
of opclass-associated operator will require a new system catalog,
along with all the infrastructure needed for that. That guarantees
that the temptation to take shortcuts will remain high.

If we didn't already have the plus/minus-for-WINDOW-RANGE example
staring us in the face, I might think that an extensible solution
wasn't needed here ... but we do so I think we really need to allow
for multiple categories in some form.

Now on the flip side, adding new catalogs would allow flexibility to
add columns that aren't there in pg_amop, which could come in handy
if some future category requires auxiliary data that's not needed for
the existing category of index search operators. But since the two
examples we have at hand don't appear to need any extra data, this
argument isn't real strong.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 18:31:44
Message-ID: 603c8f071002131031y91d2297qc29d28de5c8f5dfc@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 13, 2010 at 1:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> I see your point. May be it's better to introduce new system table? pg_amorderop
>> to store ordering operations for index.
>
> We could, but that approach doesn't scale to wanting more categories
> in the future --- you're essentially decreeing that every new category
> of opclass-associated operator will require a new system catalog,
> along with all the infrastructure needed for that.  That guarantees
> that the temptation to take shortcuts will remain high.

Yeah. PFA a patch to allow 5-key syscaches.

...Robert

Attachment Content-Type Size
syscache5-v1.patch application/octet-stream 274.9 KB

From: Joshua Tolley <eggyknap(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: knngist patch support
Date: 2010-02-13 19:03:16
Message-ID: 4b76f787.9613f30a.6a24.ffff9209@mx.google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 13, 2010 at 01:31:44PM -0500, Robert Haas wrote:
> On Sat, Feb 13, 2010 at 1:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> >> I see your point. May be it's better to introduce new system table? pg_amorderop
> >> to store ordering operations for index.
> >
> > We could, but that approach doesn't scale to wanting more categories
> > in the future --- you're essentially decreeing that every new category
> > of opclass-associated operator will require a new system catalog,
> > along with all the infrastructure needed for that.  That guarantees
> > that the temptation to take shortcuts will remain high.
>
> Yeah. PFA a patch to allow 5-key syscaches.

(Realizing I'm a lurker in this conversation, and hoping not to ask irritating
questions) Do we need to rename SearchSysCache et al. to SearchSysCache1,
etc.? It seems to me that requires changes to all kinds of software without
any real need. The four lines of PL/LOLCODE that inspired this thought aren't
themselves a great burden, but when combined with everyone else using
SearchSysCache already...

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Joshua Tolley <eggyknap(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: knngist patch support
Date: 2010-02-13 19:12:22
Message-ID: 603c8f071002131112x73555aa4s53f725240444c69d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 13, 2010 at 2:03 PM, Joshua Tolley <eggyknap(at)gmail(dot)com> wrote:
> On Sat, Feb 13, 2010 at 01:31:44PM -0500, Robert Haas wrote:
>> On Sat, Feb 13, 2010 at 1:28 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> >> I see your point. May be it's better to introduce new system table? pg_amorderop
>> >> to store ordering operations for index.
>> >
>> > We could, but that approach doesn't scale to wanting more categories
>> > in the future --- you're essentially decreeing that every new category
>> > of opclass-associated operator will require a new system catalog,
>> > along with all the infrastructure needed for that.  That guarantees
>> > that the temptation to take shortcuts will remain high.
>>
>> Yeah.  PFA a patch to allow 5-key syscaches.
>
> (Realizing I'm a lurker in this conversation, and hoping not to ask irritating
> questions) Do we need to rename SearchSysCache et al. to SearchSysCache1,
> etc.? It seems to me that requires changes to all kinds of software without
> any real need. The four lines of PL/LOLCODE that inspired this thought aren't
> themselves a great burden, but when combined with everyone else using
> SearchSysCache already...

If we want to allow 5-key syscaches, we have to add an extra parameter
to SearchSysCache and friends. So everyone caller of SearchSysCache
is going to break. (Well, unless we instead leave SearchSysCache
alone and add SearchSysCacheExtended or similar; but that's not really
project style.)

The point of the macros is that if someone some day decides they want
to allow SIX-key syscaches, we don't have to break all those people
again; we can just update the macro definitions.

...Robert


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 19:18:56
Message-ID: e08cc0401002131118u7acf856blc76dc252efa9b8fc@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/2/14 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> I see your point. May be it's better to introduce new system table? pg_amorderop
>> to store ordering operations for index.
>
> We could, but that approach doesn't scale to wanting more categories
> in the future --- you're essentially decreeing that every new category
> of opclass-associated operator will require a new system catalog,
> along with all the infrastructure needed for that.  That guarantees
> that the temptation to take shortcuts will remain high.
>
> If we didn't already have the plus/minus-for-WINDOW-RANGE example
> staring us in the face, I might think that an extensible solution
> wasn't needed here ... but we do so I think we really need to allow
> for multiple categories in some form.

I'm not so familiar with knn gist topic, but I think there are not
enough fundamentals yet. Because we started from defining operators if
they can be "ordered or indexed", general operator semantics is over
them, not adding columns to pg_amop. It will be something like
datatype-independent human word: "add and subtract", "take distance",
"ordered".
And we don't have time to invent such new world.

Regards.

--
Hitoshi Harada


From: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Joshua Tolley <eggyknap(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: knngist patch support
Date: 2010-02-13 19:21:01
Message-ID: e08cc0401002131121v5d76f608q84c856293eec86a6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/2/14 Robert Haas <robertmhaas(at)gmail(dot)com>:
> If we want to allow 5-key syscaches, we have to add an extra parameter
> to SearchSysCache and friends.  So everyone caller of SearchSysCache
> is going to break.  (Well, unless we instead leave SearchSysCache
> alone and add SearchSysCacheExtended or similar; but that's not really
> project style.)

DirectFunctionCall1/2/3...?

--
Hitoshi Harada


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 19:38:34
Message-ID: 6048.1266089914@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
> And we don't have time to invent such new world.

Huh? This is all discussion for 9.1 (or even later). There's
plenty of time.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Joshua Tolley <eggyknap(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: knngist patch support
Date: 2010-02-13 20:02:36
Message-ID: 6380.1266091356@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sat, Feb 13, 2010 at 2:03 PM, Joshua Tolley <eggyknap(at)gmail(dot)com> wrote:
>> (Realizing I'm a lurker in this conversation, and hoping not to ask irritating
>> questions) Do we need to rename SearchSysCache et al. to SearchSysCache1,
>> etc.? It seems to me that requires changes to all kinds of software without
>> any real need. The four lines of PL/LOLCODE that inspired this thought aren't
>> themselves a great burden, but when combined with everyone else using
>> SearchSysCache already...

> If we want to allow 5-key syscaches, we have to add an extra parameter
> to SearchSysCache and friends. So everyone caller of SearchSysCache
> is going to break. (Well, unless we instead leave SearchSysCache
> alone and add SearchSysCacheExtended or similar; but that's not really
> project style.)

It's fair to stop and think about how this would affect external
packages and what they'd have to do to support building against both new
and old-style backends.

Reflecting on it, it seems to me that the separate SearchSysCacheN()
macros are obviously cleaner and closer to preferred project style than
the existing code with all those explicit zeroes. So I think there's
a case for migrating to that style even if we didn't have a concern
about the max number of lookup keys.

What would probably be the recommended solution for backwards-compatible
source code is to convert the actual calls to new style, and then
provide a block of macro definitions along the lines of

#if CATALOG_VERSION_NO < something
#define SearchSysCache1(...) SearchSysCache(..., 0, 0, 0)
#define SearchSysCache2(...) SearchSysCache(..., 0, 0)
etc

So that seems okay so far.

What's not clear from this is whether we should reserve SearchSysCache
as an equivalent to SearchSysCache4() and therefore name the underlying
function something different. That would allow being lazy about
converting individual calls over.

I'm inclined to think not, and here's the reason: with the old call
style it was never apparent from the callee side how many arguments the
caller intended to be valid, and so for example we couldn't have an
Assert that checked it was right. I'm not sure if we should go so
far as to add an explicit nkeys argument to SearchSysCache, but it's
worth thinking about at least.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Joshua Tolley <eggyknap(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: knngist patch support
Date: 2010-02-13 20:48:38
Message-ID: 603c8f071002131248m2ecbf1d8xe27a89c37228c873@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 13, 2010 at 3:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> What would probably be the recommended solution for backwards-compatible
> source code is to convert the actual calls to new style, and then
> provide a block of macro definitions along the lines of
>
> #if CATALOG_VERSION_NO < something
> #define SearchSysCache1(...) SearchSysCache(..., 0, 0, 0)
> #define SearchSysCache2(...) SearchSysCache(..., 0, 0)
> etc
>
> So that seems okay so far.

Where would we put this block of code?

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Joshua Tolley <eggyknap(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: knngist patch support
Date: 2010-02-13 20:50:57
Message-ID: 7966.1266094257@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sat, Feb 13, 2010 at 3:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> What would probably be the recommended solution for backwards-compatible
>> source code is to convert the actual calls to new style, and then
>> provide a block of macro definitions along the lines of
>>
>> #if CATALOG_VERSION_NO < something
>> #define SearchSysCache1(...) SearchSysCache(..., 0, 0, 0)
>> #define SearchSysCache2(...) SearchSysCache(..., 0, 0)
>> etc
>>
>> So that seems okay so far.

> Where would we put this block of code?

*We* wouldn't. This is a solution that might be used by pg_foundry
projects for instance. Some people want to distribute loadable modules
for which the same source code can be compiled against multiple PG
releases. They'd stick those macro definitions, conditionalized as
above, into their own source file.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 20:52:45
Message-ID: 603c8f071002131252l2d2f7224m2a25097679f61fe2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 13, 2010 at 2:38 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com> writes:
>> And we don't have time to invent such new world.
>
> Huh?  This is all discussion for 9.1 (or even later).  There's
> plenty of time.

Just to be clear, I was intending this patch, at least, to be applied
now. I actually think there's a good argument that we should do at
least this much for 9.0, namely that now is probably the time when
there are the fewest outstanding patches that will be broken by this.
If we try to apply this for the first 9.1 CommitFest, then (1) it'll
have to be completely redone and (2) it'll force massive rebasing.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 20:58:58
Message-ID: 8088.1266094738@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Just to be clear, I was intending this patch, at least, to be applied
> now. I actually think there's a good argument that we should do at
> least this much for 9.0, namely that now is probably the time when
> there are the fewest outstanding patches that will be broken by this.
> If we try to apply this for the first 9.1 CommitFest, then (1) it'll
> have to be completely redone and (2) it'll force massive rebasing.

What I think we should do is not change SearchSysCache for 9.0,
but provide the SearchSysCacheN macros as syntactic sugar, and
go around and change (at least most of) the call sites to use the
macros. This is clearly cleaner source code, and with that method
we'll still be ABI-compatible in 9.0 for anybody who doesn't fix their
source right away.

regards, tom lane


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Joshua Tolley <eggyknap(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: knngist patch support
Date: 2010-02-13 20:59:27
Message-ID: 4B7712AF.3050901@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Reflecting on it, it seems to me that the separate SearchSysCacheN()
> macros are obviously cleaner and closer to preferred project style than
> the existing code with all those explicit zeroes. So I think there's
> a case for migrating to that style even if we didn't have a concern
> about the max number of lookup keys.

http://www.sigaev.ru/misc/syscache-0.1.gz

Patch enlarges number of arguments of SearchSysCache and friends to 5 and
introduces SearchSysCacheN (and friends) macroses. All
SearchSysCache/SearchSysCacheCopy/SearchSysCacheExists/SearchSysCacheList/GetSysCacheOid
calls are converted into calling macroses.

For compatibility, we could rename SearchSysCache into SearchSysCacheSomething
and provide SearchSysCache function for old code.
--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-13 21:20:25
Message-ID: m28waw4yue.fsf@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
>> I see your point. May be it's better to introduce new system table? pg_amorderop
>> to store ordering operations for index.
>
> We could, but that approach doesn't scale to wanting more categories
> in the future --- you're essentially decreeing that every new category
> of opclass-associated operator will require a new system catalog,
> along with all the infrastructure needed for that. That guarantees
> that the temptation to take shortcuts will remain high.

On the other hand here's how the fine manual define an operator class:

Operator classes are so called because one thing they specify is the
set of WHERE-clause operators that can be used with an index (i.e.,
can be converted into an index-scan qualification). An operator class
can also specify some support procedures that are needed by the
internal operations of the index method, but do not directly
correspond to any WHERE-clause operator that can be used with the
index.

> If we didn't already have the plus/minus-for-WINDOW-RANGE example
> staring us in the face, I might think that an extensible solution
> wasn't needed here ... but we do so I think we really need to allow
> for multiple categories in some form.

Agreed.

And we're talking about the basic operators + and -, and about a
distance or metric operator. Those remind me of groups and etc.

http://en.wikipedia.org/wiki/Group_(mathematics)
http://en.wikipedia.org/wiki/Abelian_group
http://en.wikipedia.org/wiki/Ring_(mathematics)
http://en.wikipedia.org/wiki/Metric_space

A group is defined by a data type, an operator (+), and an identity
element. If the group is abelian, the given operation is associative and
commutative, and each element of the data type has an inverse.

Then there's the metric space which is a data type with a distance
function. This function must be non-negative, commutative, etc.

So I guess what we need here is a Operator Group to define our plus and
minus operators, and the fact that it's a group says (by convention,
like the total ordering of a BTree) that the + is commutative and the -
its opposite. Or we have an "option" called abelian for specifying the
commutativity?

Then I'm for tricking the maths to say that our notion of a metric space
will apply only to our notion of an Operator Group, unless someone
really insists on separating the concepts. So an Operator Group defines
2 strategies that you have to attach to your + and - operators, and a
support function which is the distance, and optional.

Now, we want to have Groups able to work on more than one datatype, for
example talking about the distance of a point to a circle or a box. So
we need an Operator Group Family, don't we?

How much does all that help in this case?
--
dim

PS: I'm sad to have this discussion after having read it's too late for
9.0. The KNN-Gist stuff with extended ORDER BY indexing was too good to
be true.


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-14 00:12:44
Message-ID: 1266106364.29919.5916.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, 2010-02-13 at 13:28 -0500, Tom Lane wrote:
> If we didn't already have the plus/minus-for-WINDOW-RANGE example
> staring us in the face, I might think that an extensible solution
> wasn't needed here ... but we do so I think we really need to allow
> for multiple categories in some form.

Is this also an opportunity to provide the infrastructure to assign
meaning to operators?

Related discussion:
http://archives.postgresql.org/pgsql-hackers/2009-10/msg01443.php

Regards,
Jeff Davis


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-14 03:26:55
Message-ID: 603c8f071002131926u925486h187add843ae08696@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 13, 2010 at 3:58 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Just to be clear, I was intending this patch, at least, to be applied
>> now.  I actually think there's a good argument that we should do at
>> least this much for 9.0, namely that now is probably the time when
>> there are the fewest outstanding patches that will be broken by this.
>> If we try to apply this for the first 9.1 CommitFest, then (1) it'll
>> have to be completely redone and (2) it'll force massive rebasing.
>
> What I think we should do is not change SearchSysCache for 9.0,
> but provide the SearchSysCacheN macros as syntactic sugar, and
> go around and change (at least most of) the call sites to use the
> macros.  This is clearly cleaner source code, and with that method
> we'll still be ABI-compatible in 9.0 for anybody who doesn't fix their
> source right away.

Let me just think out loud for a second about the roadmap for knngist
at this point.

1. Add support for 5-key syscaches using either my patch or Teodor's
or some third version that may crop up. This could possibly be split
into one version that converts everything to using macros (for 9.0)
and a second version that actually increases the maximum number of
keys from 4 to 5 (for 9.1).

2. Modify pg_amop by adding a new column amopcategory, probably either
int2 or maybe even just char. Add this key to both unique indices on
pg_amop. Modify CREATE OPERATOR CLASS (and maybe ALTER OPERATOR
FAMILY?) to support some new syntax to add "order-by-op" operators to
opclasses/families (I'm thinking OPERATOR ORDER <strategy-number>
<operator> vs. the usual OPRATOR <strategy-number> <operator>). Also
add an amcanorderbyop flag to pg_am. Teach the planner to generate an
index scan path when amcanorderbyop is set on the AM and a suitable
order-by-op clause is present.

3. Modify GIST to use the infrastructure introduced by #2 for the new
<-> operator.

I'm not prepared to endorse doing #3 in core for 9.0, but I wonder if
it would be feasible to think about doing #1 and #2 and putting
something into contrib for #3. The last round of knngist patches had
a "planner" patch which basically did #2 (with the problems previously
discussed, to which we now seem to have a fairly clean and
straightforward solution) and then the "itself" and "proc" patches did
#3. The planner patch is actually quite small, so maybe it's not out
of the question to think that it might go in, with some suitable
fixing up along the lines discussed here and upthread.

On the other hand, maybe I'm getting too ambitious.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-14 03:40:23
Message-ID: 13277.1266118823@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> ...
> 2. Modify pg_amop by adding a new column amopcategory, probably either
> int2 or maybe even just char.
> ...
> I'm not prepared to endorse doing #3 in core for 9.0, but I wonder if
> it would be feasible to think about doing #1 and #2 and putting
> something into contrib for #3.

No, we are not touching the system catalogs for this in 9.0, much less
fooling with any planner logic. If it had been submitted a few months
earlier that could have happened, but we are barely 48 hours away from
alpha freeze. The only part of this that I think can even be considered
at this point is to do a backwards-compatible source code upgrade that
will decouple the various call sites from knowledge of exactly how many
key columns the syscaches allow. And even that is not for the benefit
of this feature; it's mainly to minimize breakage of other patches
that will be developed between now and whenever knngist does land.
(IOW, I agree with your point that making the call syntax change now
is the least painful time to do that.)

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-14 04:05:36
Message-ID: 603c8f071002132005u4b259123m4a5e54603e0aebc3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 13, 2010 at 10:40 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ...
>> 2. Modify pg_amop by adding a new column amopcategory, probably either
>> int2 or maybe even just char.
>> ...
>> I'm not prepared to endorse doing #3 in core for 9.0, but I wonder if
>> it would be feasible to think about doing #1 and #2 and putting
>> something into contrib for #3.
>
> No, we are not touching the system catalogs for this in 9.0, much less
> fooling with any planner logic.  If it had been submitted a few months
> earlier that could have happened, but we are barely 48 hours away from
> alpha freeze.  The only part of this that I think can even be considered
> at this point is to do a backwards-compatible source code upgrade that
> will decouple the various call sites from knowledge of exactly how many
> key columns the syscaches allow.  And even that is not for the benefit
> of this feature; it's mainly to minimize breakage of other patches
> that will be developed between now and whenever knngist does land.
> (IOW, I agree with your point that making the call syntax change now
> is the least painful time to do that.)

OK. In that case, any objections to my applying the attached patch,
which I believe implements this as you suggested?

...Robert

Attachment Content-Type Size
syscache-macros-v1.patch application/octet-stream 269.5 KB

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-14 16:01:28
Message-ID: 4B781E58.4060300@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dimitri Fontaine wrote:
> Then there's the metric space which is a data type with a distance
> function. This function must be non-negative, commutative, etc.
>
> So I guess what we need here is a Operator Group to define our plus and
> minus operators, and the fact that it's a group says (by convention,
> like the total ordering of a BTree) that the + is commutative and the -
> its opposite. Or we have an "option" called abelian for specifying the
> commutativity?
>
Would the group analogy work with partially ordered domains, e.g. with a
location on a sphere datatype together with an identifier for the sphere
- so poi on earth can be compared to another, but not a poi on earth
with a poi on the moon. ?

regards,
Yeb Havinga


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-14 17:24:59
Message-ID: 22481.1266168299@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> OK. In that case, any objections to my applying the attached patch,
> which I believe implements this as you suggested?

Um, did you test this version? It looks like the macros are still
defined according to the idea that SearchSysCache takes five arguments.

Also, I'd suggest adding explicit comments to syscache.h suggesting
that SearchSysCache etc are meant to be called via the macros
rather than directly.

I didn't check all the individual calls, but it looks generally
sane except for those points.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, tomas(at)tuxteam(dot)de, "Ragi Y(dot) Burhum" <rburhum(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: knngist patch support
Date: 2010-02-14 18:08:47
Message-ID: 603c8f071002141008x220c10d8uc81da02ed39ec911@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Feb 14, 2010 at 12:24 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> OK.  In that case, any objections to my applying the attached patch,
>> which I believe implements this as you suggested?
>
> Um, did you test this version?  It looks like the macros are still
> defined according to the idea that SearchSysCache takes five arguments.

You are correct. I realized that this morning while I was shaving.
Sorry about that.

> Also, I'd suggest adding explicit comments to syscache.h suggesting
> that SearchSysCache etc are meant to be called via the macros
> rather than directly.

Good idea.

> I didn't check all the individual calls, but it looks generally
> sane except for those points.

Will fix and commit.

...Robert