Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Nicholas White <n(dot)j(dot)white(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
Date: 2013-06-20 14:03:30
Message-ID: CA+Tgmob-9azQHekkpe0GH+_uqMTtnyBOZPM-2CMuxXMaujttTQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 18, 2013 at 6:27 PM, Nicholas White <n(dot)j(dot)white(at)gmail(dot)com> wrote:
> Thanks for the reviews. I've attached a revised patch that has the lexer
> refactoring Alvaro mentions (arbitarily using a goto rather than a bool
> flag) and uses Jeff's idea of just storing the index of the last non-null
> value (if there is one) in the window function's context (and not the Datum
> itself).
>
> However, Robert's right that SELECT ... ORDER BY respect NULLS LAST will now
> fail. An earlier iteration of the patch had RESPECT and IGNORE as reserved,
> but that would have broken tables with columns called "respect" (etc.),
> which the current version allows. Is this backwards incompatibility
> acceptable?

I think it's better to add new partially reserved keywords than to
have this kind of random breakage. When you make something a
partially-reserved keyword, then things break, but in a fairly
well-defined way. Lexer hacks can break things in ways that are much
subtler, which we may not even realize for a long time, and which in
that case would mean that the word "respect" needs to be quoted in
some contexts but not others. That's going to cause a lot of
headaches, because pg_dump etc. know about quoting reserved keywords,
but they don't know anything about quoting unreserved keywords in
contexts where they might happen to be followed by the wrong next
word.

> If not, maybe I should try doing a two-token lookahead in the
> token-aggregating code between the lexer and the parser (i.e. make a
> RESPECT_NULLS token out of a sequence of RESPECT NULLS OVER tokens,
> remembering to replace the OVER token)? Or what about adding an %expect
> statement to the Bison grammar, confirming that the shift / reduce conflicts
> caused by using the RESPECT, IGNORE & NULL_P tokens the in out_clause rule
> are OK?

These lines of inquiry don't seem promising to me. It's going to be
complicated and unmaintainable and may just move the failure scenarios
to cases that are too obscure for us to reason about.

I think the question is whether this feature is really worth adding
new reserved keywords for. I have a hard time saying we shouldn't
support something that's part of the SQL standard, but personally,
it's not something I've seen come up prior to this thread.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-06-20 14:35:14 Re: MVCC catalog access
Previous Message Markus Wanner 2013-06-20 13:57:03 Re: dynamic background workers