Re: PostgreSQL - 'SKYLINE OF' clause added!

From: David Fuhry <dfuhry(at)cs(dot)kent(dot)edu>
To: Gavin Sherry <swm(at)alcove(dot)com(dot)au>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Chris Browne <cbbrowne(at)acm(dot)org>
Subject: Re: PostgreSQL - 'SKYLINE OF' clause added!
Date: 2007-03-07 16:28:22
Message-ID: 45EEE826.5010407@cs.kent.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The query Ranbeer gave - as with any skyline query - can be solved with
just pure SQL:

select * from books b where not exists(
select * from books b2 where
b2.rating >= b.rating and b2.price <= b.price and
(b2.rating > b.rating or b2.price < b.price)
);
book_name | rating | price
-------------------+--------+-------
Prodigal Daughter | 3 | 250
The Notebook | 4 | 300
Fountain Head | 5 | 350
(3 rows)

The idea of the BNL (block nested loop) skyline algorithm is to avoid
the nested loop by storing "dominating" records as the query proceeds -
in the above example, records which are relatively high in rating and
low in price - and comparing each candidate record to those first.

BNL is the most reasonable skyline algorithm in the absence of a
multidimensional (usually R-Tree) index on the columns. For answering
skyline queries where such an index exists over all query columns, the
most broadly used generalized algorithm is BBS [1].

Thanks,

Dave Fuhry

[1] Papadias, D., Tao, Y., Fu, G., and Seeger, B. 2005. Progressive
skyline computation in database systems. ACM Trans. Database Syst. 30, 1
(Mar. 2005), 41-82. DOI= http://doi.acm.org/10.1145/1061318.1061320

Gavin Sherry wrote:
> On Tue, 6 Mar 2007, Alvaro Herrera wrote:
>
>> Also, keep in mind that there were plenty of changes in the executor.
>> This stuff is not likely to be very easy to implement efficiently using
>> our extant executor machinery; note that Ranbeer mentioned
>> implementation of "block nested loop" and other algorithms. Not sure
>> how easy would be to fold that stuff into the optimizer for multi-input
>> aggregates, instead of hardwiring it to the SKYLINE OF syntax.
>>
>
> Yes, there's been a lot of working on calculating skyline efficiently,
> with different sorting techniques and so on. This is the most interesting
> part of the idea. You could calculate the query Ranbeer gave using pure
> SQL and, perhaps, use of some covariance aggregates or something already.
> Of course, it gets harder when you want to calculate across many
> dimensions.
>
> Personally, I'd love to see some of these newer data analysis
> capabilities added to PostgreSQL -- or at least put out there as
> interesting patches.
>
> Thanks,
>
> Gavin
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zeugswetter Andreas ADI SD 2007-03-07 16:45:51 Re: Auto creation of Partitions
Previous Message Zoltan Boszormenyi 2007-03-07 16:22:13 Test report on GENERATED/IDENTITY