Re: date - range

From: Mischa <mischa(dot)Sandberg(at)telus(dot)net>
To: "H(dot)J(dot) Sanders" <hjs(at)rmax(dot)nl>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: date - range
Date: 2005-04-02 05:59:44
Message-ID: 1112421584.424e34d0d2d63@webmail.telus.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Quoting "H.J. Sanders" <hjs(at)rmax(dot)nl>:

>
> Anybody a solution for the next problem:
> people can subscribe to a service for 1 or more days (upto a max. of 365).
> So in the database is stored: first_date and last_date
> To select which people are subscribed for a certain date (e.g. today) we use
> a select like
>
> select ....... where first_date <= today and last_date >= today
>
> Whatever index we create system always does a sequential scan (which I can
> understand). Has someone a smarter solution?

Yep, standard SQL problem. The answer is sort of a hand-rolled GIST index.

To save typing, I'm going to pretend all your dates are stored as integers.
In reality, you'll probably be writing views with lots of EXTRACT(EPOCH...)'s in
them, to achieve the same result.

Suppose you have table People(id, first_date, last_date, ...)
Each such range "fits" in some larger fixed range of 1,2,4, ... days
that starts and ends on a fixed (epoch) date multiple of 1,2,4,...
For example, if your range were days (1040..1080), then that fits in the
64-wide range (1024...1088]. You calculate the start and width of the range that
just fits, and store that in People, too. Now, you index on (start,width).

Now, when you want to query for a given "today", you have to try for
all possible widths in People. Fortunately, that's darn few!
The ranges up to a decade (!) will still mean only 16 different widths.
A max range of one year (<512 days) means only 9 widths.
You can do this with a tiny static table.

Then: the query:

SELECT People.* FROM People
JOIN Widths
ON People.start = today - today % Widths.width
AND People.width = Widths.width

Though this may look gross, it makes an index work where no normal BTree index
would. I've used it for some really nasty data conversions of 100M-row tables.

Your first name wouldn't be "Harlan", would it? :-)
-- "Dreams come true, not free."

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mischa 2005-04-02 06:25:19 Re: date - range
Previous Message Michael Fuhr 2005-04-01 23:24:01 Re: date - range