Re: WITHIN GROUP patch

From: Vik Fearing <vik(dot)fearing(at)dalibo(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Subject: Re: WITHIN GROUP patch
Date: 2013-11-21 17:27:40
Message-ID: 528E428C.3040502@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/21/2013 11:04 AM, Atri Sharma wrote:
> Please find attached the latest patch for WITHIN GROUP. This patch is
> after fixing the merge conflicts.

I have spent quite some time on this and the previous versions. Here is
my review, following the questions on the wiki.

This patch is in context diff format and applies cleanly to today's
master. It contains several regression tests and for the most part,
good documentation. I would like to see at least one example of using
each of the two types of function (hypothetical and inverted
distribution) in section 4.2.8.

This patch implements what it says it does. We don't already have a way
to get these results without this patch that I know of, and I think we
do want it. I certainly want it. I do not have a copy of the SQL
standard, but I have full faith in the Andrew Gierth's claim that this
is part of it. Even if not, implementation details were brought up
during design and agreed upon by this list[1]. I don't see how anything
here could be dangerous. The custom ordered set functions I made
correctly passed a round-trip through dump/restore.

The code compiles without warning. All of the clean tests I did worked
as expected, and all of the dirty tests failed elegantly.

I did not find any corner cases, and I looked in as many corners as I
could think of. I didn't manage to trigger any assertion failures and I
didn't crash it.

I found no noticeable issues with performance, either directly or as
side effects.

I am not the most competent with code review so I'll be relying on
further review by another reviewer or the final committer. The patch
fixed the project comments/messages style issues I raised in my previous
review. I found the code comments lacking in some places (like
inversedistribution.c:mode_final for example) but I can't say if the
really is too terse, or if it's just me. On the other hand, I thought
the explanation blocks in the code comments were adequately descriptive.

There is some room for improvement in future versions. The query select
mode() within group (order by x) over (partition by y) from ... should
be valid but isn't. This was explained by Andrew on IRC as being
non-trivial: "specifically, we implemented WITHIN GROUP by repurposing
the infrastructure already present for agg(distinct ...) and agg(x order
by y) which are also not yet supported for
aggregate-as-window-function". I assume then that evolution on one side
will benefit the other.

All in all, I believe this is ready for committer.

[1]
http://www.postgresql.org/message-id/2b8b55b8ba82f83ef4e6070b95fb0cd0%40news-out.riddles.org.uk

--
Vik

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message J Smith 2013-11-21 18:01:13 Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1
Previous Message Andres Freund 2013-11-21 17:23:53 Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1