Aggregates containing outer references don't work per spec

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Aggregates containing outer references don't work per spec
Date: 2003-06-04 21:33:46
Message-ID: 27411.1054762426@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Some of the Red Hat guys have been trying to work through the NIST SQL
compliance tests. So far they've found several things we already knew
about, and one we didn't:

-- TEST:0434 GROUP BY with HAVING EXISTS-correlated set function!
SELECT PNUM, SUM(HOURS) FROM WORKS
GROUP BY PNUM
HAVING EXISTS (SELECT PNAME FROM PROJ
WHERE PROJ.PNUM = WORKS.PNUM AND
SUM(WORKS.HOURS) > PROJ.BUDGET / 200);

This query is legal according to the test, but Postgres fails with
ERROR: Aggregates not allowed in WHERE clause

The SUM() should be allowed in the sub-SELECT because, according to the
spec, it is actually an aggregate of the outer query --- and so the
whole expression "SUM(WORKS.HOURS)" is effectively an outer reference
for the sub-SELECT.

Now I finally understand why the spec has all that strange verbiage
about outer references in set-function arguments. This is the case
they're talking about. (I don't much like their restriction to a single
outer reference ... seems like it would be appropriate to allow multiple
references as long as they're all from the same outer query level.)

Fixing this looks a tad nasty. The planner can convert simple column
outer references into Params for a subquery, but there is no
infrastructure to handle making larger expressions into Params. Also,
I don't want the planner repeating the work that the parser is going to
have to do to validate correctness of the query --- the parser will need
to understand that the aggregate is an outer reference as a whole, and
the planner shouldn't have to rediscover that for itself. In any case,
it seems that an outer-reference aggregate is a rather different animal
from an aggregate of the current query, and ought to be so labeled in
the parse tree.

I'm thinking of adding an "agglevelsup" field in Aggref nodes that has
semantics similar to "varlevelsup" in Var nodes --- if it's zero then
the aggregate is a regular aggregate of the current level, if it's more
than zero then the aggregate belongs to an outer query that many levels
up. The parser would need to set this field based on what the
aggregate's argument contains. It'd also have to check that the
argument does not contain variables of more than one query level.

Comments?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2003-06-04 21:46:01 Re: TODO list
Previous Message Tom Lane 2003-06-04 21:16:56 Re: Problem trying to implement version 3.0 of the PostgreSQL protocol