From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [PATCH] binary heap implementation |
Date: | 2012-11-20 23:01:10 |
Message-ID: | 17109.1353452470@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Here's a revised patch that takes the approach of just changing void *
> to Datum; it includes a bunch of other cleanups that I felt were
> worthwhile.
The comment in binaryheap.h says
* For a max-heap, the comparator must return -1 iff a < b, 0 iff a == b,
* and +1 iff a > b. For a min-heap, the conditions are reversed.
Is it actually required that the output be exactly these three values,
or is the specification really <0, 0, >0 ? If the latter, it should
say that, rather than overdefine the implementation of comparison.
Also, wouldn't it be reasonable to const-ify the comparator function, ie
typedef int (*binaryheap_comparator) (const binaryheap_node *a, const binaryheap_node *b);
* nodes the first of a list of "space" nodes
s/list/array/, no? Also, it's standard to mark this sort of hack with
/* VARIABLE LENGTH ARRAY */, or even better use FLEXIBLE_ARRAY_MEMBER
(which would require fixing the space calculation in
binaryheap_allocate, not that that would be a bad thing anyway).
left_offset and friends are defined as taking size_t and returning int,
which is broken on machines where size_t is wider than int, or even on
machines where they're the same width since int is signed. Personally
I'd replace pretty much every single usage of size_t in this module with
int, as I am unconvinced either that we need 8-byte indexes or that the
logic is correct if it tries to form index -1 (as would happen for
example with binaryheap_build applied to an empty heap).
The Assert() in binaryheap_add_unordered fills me with no confidence.
If we're not going to support expansion of the array (which might be a
better idea anyway) then this needs to be an actual run-time check, not
something that won't be there in production.
What I think *would* be worth asserting for is whether the heap is
correctly ordered or not --- that is, I think you should add a flag that
is cleared by binaryheap_add_unordered and set by binaryheap_build, and
then the functions that presume correct ordering should assert it's set.
An Assert in binaryheap_replace_first that the heap is not empty seems
appropriate as well.
This in binaryheap_replace_first is simply bizarre:
if (key)
heap->nodes[0].key = key;
if (val)
heap->nodes[0].value = val;
Why are these assignments conditional? If they're sane at all, they
should not be testing the Datums directly in any case, but the result of
DatumGetSomething.
static int32
! heap_compare_slots(binaryheap_node *a, binaryheap_node *b)
{
+ MergeAppendState *node = (MergeAppendState *) a->value;
This should use DatumGetPointer(a->value), and the function result
should be int not int32 to comport with the typedef for it.
While I'm thinking about it, why are the fields of a binaryheap_node
called "key" and "value"? That implies a semantics not actually used
here. Maybe "value1" and "value2" instead?
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2012-11-20 23:39:31 | Re: StrategyGetBuffer questions |
Previous Message | Jeff Janes | 2012-11-20 22:50:46 | Re: StrategyGetBuffer questions |