From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: GiST kNN search queue (Re: KNN-GiST with recheck) |
Date: | 2014-12-20 20:50:09 |
Message-ID: | CAM3SWZQOBog8FiWA5WOYyATzm5y0ac8bj077F0vGqg__1HOMYQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Dec 17, 2014 at 6:07 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
> How about adding a src/backend/lib/README for that, per attached?
I took a quick look at this. Some observations:
* I like the idea of adding a .../lib README. However, the README
fails to note that ilist.c implements an *integrated* list, unlike the
much more prevalent cell-based List structure. It should note that,
since that's the whole point of ilist.c.
* pairingheap_remove() is technically dead code. It still makes sense
that you'd have it in this patch, but I think there's an argument for
not including it at all on the theory that if you need to use it you
should use a different data structure. After all, the actual
(non-amortized) complexity of that operation is O(n) [1], and if
remove operations are infrequent as we might expect, that might be the
more important consideration. As long as you are including
pairingheap_remove(), though, why is the local variable "prev_ptr" a
pointer to a pointer to a pairingheap_node, rather than just a pointer
to a pairingheap_node?
* Similarly, the function-like macro pairingheap_reset() doesn't seem
to pull its weight. Why does it exist alongside pairingheap_free()?
I'm not seeing a need to re-use a heap like that.
* "Assert(!pairingheap_is_empty(heap))" appears in a few places.
You're basically asserting that a pointer isn't null, often
immediately before dereferencing the pointer. This seems to be of
questionable value.
* I think that the indentation of code could use some tweaking.
* More comments, please. In particular, comment the struct fields in
pairingheap_node. There are various blocks of code that could use at
least an additional terse comment, too.
* You talked about tuplesort.c integration. In order for that to
happen, I think the comparator logic should know less about min-heaps.
This should formally be a max-heap, with the ability to provide
customizations only encapsulated in the comparator (like inverting the
comparison logic to get a min-heap, or like custom NULLs first/last
behavior). So IMV this comment should be more generic/anticipatory:
+ /*
+ * For a max-heap, the comparator must return <0 iff a < b, 0 iff a == b,
+ * and >0 iff a > b. For a min-heap, the conditions are reversed.
+ */
+ typedef int (*pairingheap_comparator) (const pairingheap_node *a,
const pairingheap_node *b, void *arg);
I think the functions should be called pairing_max_heap* for this
reason, too. Although that isn't consistent with binaryheap.c, so I
guess this whole argument is a non-starter.
* We should just move rbtree.c to .../lib. We're not using CVS anymore
-- the history will be magically preserved.
Anyway, to get to the heart of the matter: in general, I think the
argument for the patch is sound. It's not a stellar improvement, but
it's worthwhile. That's all I have for now...
[1] https://www.cise.ufl.edu/~sahni/dsaac/enrich/c13/pairing.htm
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2014-12-20 21:14:25 | Re: INSERT ... ON CONFLICT {UPDATE | IGNORE} |
Previous Message | Jim Nasby | 2014-12-20 20:13:37 | Re: hash_create API changes (was Re: speedup tidbitmap patch: hash BlockNumber) |