Re: Select when table have missing data

Lists: pgsql-sql
From: "Lars Gregersen" <lars(dot)gregersen(at)it(dot)dk>
To: pgsql-sql(at)postgresql(dot)org
Subject: Select when table have missing data
Date: 2007-03-02 09:17:33
Message-ID: 1172827053.569467.79430@p10g2000cwp.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

I have a table that contains historical exchange rates:
date_time | timestamp
xrate | real

There is a maximum of one entry per day, but data are missing on
weekends and holidays. For these missing dates I must use the value
from the previous day (e.g. since data for a Sunday is missing I must
use the value from the Friday just before the weekend).

I have two questions:

1) Since historical exchange rates are not supposed to change I
thought about creating a new table with data for all the missing dates
(calculated using some procedural language). However, I would be much
happier if there was a way to do this using SQL in a SELECT statement.
Do you have any hints for this?

2)
I have a number of other tables where data may be missing for
different reasons. These data may be on a daily or an hourly basis.
When a user selects a range of data from e.g. the 1st of January to
the 1st of February I would like to be able to return a full set of
data where all missing entries are returned as NULL. Is there a smart
way to do this using SQL?

Any hints or references you may have on the subject of handling
missing data in time series data are very welcome.

If there is a smarter way to set up tables for handling this type of
data then please enlighten me.

Thanks

Lars


From: Rodrigo De León <rdeleonp(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Re: Select when table have missing data
Date: 2007-03-09 01:30:26
Message-ID: a55915760703081730x44817d50k1a1c8b501be0ab7d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

On 2 Mar 2007 01:17:33 -0800, Lars Gregersen <lars(dot)gregersen(at)it(dot)dk> wrote:
> I have a table that contains historical exchange rates:
> date_time | timestamp
> xrate | real
>
> There is a maximum of one entry per day, but data are missing on
> weekends and holidays. For these missing dates I must use the value
> from the previous day (e.g. since data for a Sunday is missing I must
> use the value from the Friday just before the weekend).
>
> I have two questions:
>
> 1) Since historical exchange rates are not supposed to change I
> thought about creating a new table with data for all the missing dates
> (calculated using some procedural language). However, I would be much
> happier if there was a way to do this using SQL in a SELECT statement.
> Do you have any hints for this?
>
> 2)
> I have a number of other tables where data may be missing for
> different reasons. These data may be on a daily or an hourly basis.
> When a user selects a range of data from e.g. the 1st of January to
> the 1st of February I would like to be able to return a full set of
> data where all missing entries are returned as NULL. Is there a smart
> way to do this using SQL?
>
> Any hints or references you may have on the subject of handling
> missing data in time series data are very welcome.
>
> If there is a smarter way to set up tables for handling this type of
> data then please enlighten me.
>
> Thanks
>
> Lars

generate_series() is your friend:

------------------------------

create table t(
date_time timestamp
, xrate real
);

insert into t values ('2007-3-7',0.23);
insert into t values ('2007-3-8',0.1);
insert into t values ('2007-3-9',0.2);
-- no '2007-3-10'
-- no '2007-3-11'
insert into t values ('2007-3-12',0.3);
insert into t values ('2007-3-13',0.4);
-- no '2007-3-14'
insert into t values ('2007-3-15',0.99);
-- no '2007-3-16'

select d.*, (
select xrate
from t
where date_time = (
select max(date_time)
from t
where date_time <= d.ddate
)
) as xrate
from (
select ('2007-3-7'::date+s.x)::timestamp as ddate
from generate_series(0,9) s(x)
) d

------------------------------

Extrapolate for case 2.

Regards.


From: "Kashmira Patel \(kupatel\)" <kupatel(at)cisco(dot)com>
To: <pgsql-sql(at)postgresql(dot)org>
Subject: Creating views
Date: 2007-03-09 02:03:45
Message-ID: 18AE59788A3FC640A367E5652E664D8003CA26C6@xmb-sjc-237.amer.cisco.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Hi all,
Is it possible to create views that take parameters? Basically, I have to create some very complex historical reports, and as of now I am using temporary tables to store intermediate query results. I calculate values based on the content of these temporary tables and use them in the final result. I would like to eliminate the need for temporary tables and use views instead, as the data copy between the temp tables is taking quite some time.

Thanks,
Kashmira


From: Richard Huxton <dev(at)archonet(dot)com>
To: "Kashmira Patel (kupatel)" <kupatel(at)cisco(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Creating views
Date: 2007-03-09 08:34:00
Message-ID: 45F11BF8.4080809@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Kashmira Patel (kupatel) wrote:
> Hi all, Is it possible to create views that take parameters?
> Basically, I have to create some very complex historical reports, and
> as of now I am using temporary tables to store intermediate query
> results. I calculate values based on the content of these temporary
> tables and use them in the final result. I would like to eliminate
> the need for temporary tables and use views instead, as the data copy
> between the temp tables is taking quite some time.

You can write a function (in SQL even) that would basically be a
parameterised view.

However, views don't "store" anything, so whether you see an improvement
in speed will depend on how long it takes to do your calculations.

--
Richard Huxton
Archonet Ltd