Re: Find min year and min value

Lists: pgsql-general
From: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
To: pgsql-general(at)postgresql(dot)org
Subject: Find min year and min value
Date: 2007-10-02 13:26:17
Message-ID: F4D18345-B3E4-407C-983D-0DB79F981012@grid.unep.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Hi there,

I am trying to find in a table with different variables, countries
and years the

lowest year

and within that year the

lowest value

The following SELECT works, but I wonder if it is "elegant". Can you
recommend any other solution?

SELECT
MIN(value) AS minv
FROM
public_one_table.data
WHERE
year =
(
SELECT
MIN(year) AS min_year
FROM
public_one_table.data
WHERE
id_variable = 1
) AND
id_variable = 1

Thanks for any help!

Stef


From: Richard Huxton <dev(at)archonet(dot)com>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 13:40:11
Message-ID: 47024A3B.60909@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Stefan Schwarzer wrote:
> Hi there,
>
> I am trying to find in a table with different variables, countries and
> years the
>
> lowest year
>
> and within that year the
>
> lowest value
>
>
> The following SELECT works, but I wonder if it is "elegant". Can you
> recommend any other solution?

SELECT value AS minv FROM public_on_table.data
WHERE id_variable = 1
ORDER BY year, value LIMIT 1

--
Richard Huxton
Archonet Ltd


From: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 13:48:42
Message-ID: 30A13818-F770-426D-9341-ACF828569617@grid.unep.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

>> Hi there,
>> I am trying to find in a table with different variables, countries
>> and years the
>> lowest year
>> and within that year the
>> lowest value
>> The following SELECT works, but I wonder if it is "elegant". Can
>> you recommend any other solution?
>
> SELECT value AS minv FROM public_on_table.data
> WHERE id_variable = 1
> ORDER BY year, value LIMIT 1

But that brings only the min value, not the min year. I need to know
both of them, something like

(min year = ) 1972, (min value = ) 20

Stef


From: Richard Huxton <dev(at)archonet(dot)com>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 13:56:39
Message-ID: 47024E17.30804@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Stefan Schwarzer wrote:
>>> Hi there,
>>> I am trying to find in a table with different variables, countries
>>> and years the
>>> lowest year
>>> and within that year the
>>> lowest value
>>> The following SELECT works, but I wonder if it is "elegant". Can you
>>> recommend any other solution?
>>
>> SELECT value AS minv FROM public_on_table.data
>> WHERE id_variable = 1
>> ORDER BY year, value LIMIT 1
>
> But that brings only the min value, not the min year. I need to know
> both of them, something like
>
> (min year = ) 1972, (min value = ) 20

SELECT year, value FROM ...

--
Richard Huxton
Archonet Ltd


From: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 14:29:02
Message-ID: BA560C64-608A-44B7-BD8D-0581E825F00D@grid.unep.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> SELECT year, value FROM ...

I feel ashamed.... such a simple solution... gush.... Thanks for that!

Unfortunately it doesn't stop there...

If I want to find the "common smallest year" for two given variables
(say, I have years 1970, 1971,.... 2005 for variable 1 (GDP) and
1980, 1981,... 2003) for variable 2 (Fish Catch) ). It should come up
with 1980 for a given country, if there is a value for that year in
both variables. Otherwise 1981, etc...

How would I do that? I really have no clue...

(my table looks something like this:

id_variable | year | value | id_country
---------------------------------------
1 | 2001 | 123 | 1
1 | 2002 | 125 | 1
1 | 2003 | 128 | 1
1 | 2004 | 132 | 1
1 | 2005 | 135 | 1

1 | 2001 | 412 | 2
1 | 2002 | 429 | 2
1 | 2003 | 456 | 2
1 | 2004 | 465 | 2
1 | 2005 | 477 | 2

....

2 | 1980 | 83 | 1
2 | 1981 | 89 | 1
....

)

Thanks for any hints,

Stef


From: Richard Huxton <dev(at)archonet(dot)com>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 14:53:23
Message-ID: 47025B63.5080300@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Stefan Schwarzer wrote:
>> SELECT year, value FROM ...
>
> I feel ashamed.... such a simple solution... gush.... Thanks for that!

Can be easy to over-complicate things when you've been thinking about
them too long.

> Unfortunately it doesn't stop there...
>
> If I want to find the "common smallest year" for two given variables
> (say, I have years 1970, 1971,.... 2005 for variable 1 (GDP) and 1980,
> 1981,... 2003) for variable 2 (Fish Catch) ). It should come up with
> 1980 for a given country, if there is a value for that year in both
> variables. Otherwise 1981, etc...

In that case you will need two subqueries, but it's just a matter of
converting your description to SQL.

SELECT
yr1,
gdp.val1 AS gdp_val,
fish_catch.val2 AS fish_catch_val
FROM
(SELECT year AS yr1, value AS val1 FROM data WHERE id_variable = 1
) AS gdp,
(SELECT year AS yr2, value AS val2 FROM data WHERE id_variable = 2
) AS fish_catch
WHERE
gdp.yr1 = fish_catch.yr2
ORDER BY
gdp.yr1
LIMIT 1;

Here I've aliases (renamed) the columns and the sub-queries, but I'd
probably just alias the sub-queries in real-life.

You could write it as a JOIN if you prefer that style, or use the MIN()
aggregate (although I'd guess that the ORDER BY/LIMIT might prove faster).

So, I'd perhaps use:

SELECT gdp.year, gdp.val AS gdp_val, fish_catch.val AS fish_catch_val
FROM
(SELECT year,value FROM data WHERE id_variable=1) AS gdp
JOIN
(SELECT year, value FROM data WHERE id_variable=2) AS fish_catch
USING (year)
ORDER BY gdp.year
LIMIT 1;

--
Richard Huxton
Archonet Ltd


From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: Richard Huxton <dev(at)archonet(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 14:53:34
Message-ID: 1CA50CAD-CAF3-4F45-A86C-7A945AC7F6E3@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


On Oct 2, 2007, at 9:29 , Stefan Schwarzer wrote:

> How would I do that? I really have no clue...

The key is to build it up in steps.

select id_country, year, var_1, val_1, var_2, val_2
-- Second step:
-- value for year for each country of var_1
from (select id_country, year, id_variable as var_1, "value" as val_1
from my_table) as val_1
-- value for year for each country for var_2
natural join (select id_country, year, id_variable as var_2, "value"
as val_2
from my_table) as val_2
-- First step
-- for each country, find the minimum common year (which the join
will do) for the two
-- variables you're interested in (var_1 and var_2).
natural join (select id_country, var_1, var_2, min(year) as year
from (select id_country, year, id_variable as var_1
from my_table) as var_1
natural join (select id_country, year, id_variable
as var_2
from my_table) as var_2
group by id_country, var_1, var_2) as min_common_year
where id_country = :id_country
and var_1 = :var_1
and var_2 = :var_2;

Check your explain analyze output: if the planner doesn't push up
the :var_1, :var_2, and :id_country_id values up into subqueries, you
might want to add them as where clauses.

As an aside, I assume you've rewritten the table column names: if you
haven't, as it's an SQL keyword, "value" is a particularly poor
choice of column name. I'd probably rename "year" as well.

Hope this helps.

Michael Glaesemann
grzm seespotcode net


From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 16:10:35
Message-ID: 47026D7B.5030609@pinpointresearch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Stefan Schwarzer wrote:
>> SELECT year, value FROM ...
>
> I feel ashamed.... such a simple solution... gush.... Thanks for that!
>
> Unfortunately it doesn't stop there...
>
> If I want to find the "common smallest year" for two given variables
> (say, I have years 1970, 1971,.... 2005 for variable 1 (GDP) and 1980,
> 1981,... 2003) for variable 2 (Fish Catch) ). It should come up with
> 1980 for a given country, if there is a value for that year in both
> variables. Otherwise 1981, etc...
>
> How would I do that? I really have no clue...
>
> (my table looks something like this:
>
> id_variable | year | value | id_country
> ---------------------------------------
> 1 | 2001 | 123 | 1
> 1 | 2002 | 125 | 1
> ....
>
> 2 | 1980 | 83 | 1
> 2 | 1981 | 89 | 1
> ....
>
> )
>
> Thanks for any hints,

As others have noted, the query *can* be written. But it appears to me
that you are struggling against your table layout. Before struggling
with ever more complicated queries, I'd consider restructuring your
table(s). There are many possibilities depending on the current nature
of your data, how you expect it to change and the queries you expect to
run against it. For example:
country_id
data_year
gdp
fish_catch

Then your query may be as simple as, say:
select min(year) from your_table
where country_id = xxx
and gdp is not null
and fish_catch is not null;

or

select year, gdp, fish_catch from your_table
where country_id = xxx
and gdp is not null
and fish_catch is not null
order by year desc, gdp desc, fish_catch desc
limit 1;

Alternately, you could have a gdp table and a fish_catch table which
would be easily joined to give the same result.

Cheers,
Steve


From: hubert depesz lubaczewski <depesz(at)depesz(dot)com>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: Richard Huxton <dev(at)archonet(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 17:37:43
Message-ID: 20071002173743.GA7266@depesz.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Tue, Oct 02, 2007 at 04:29:02PM +0200, Stefan Schwarzer wrote:
> If I want to find the "common smallest year" for two given variables
> (say, I have years 1970, 1971,.... 2005 for variable 1 (GDP) and
> 1980, 1981,... 2003) for variable 2 (Fish Catch) ). It should come up
> with 1980 for a given country, if there is a value for that year in
> both variables. Otherwise 1981, etc...
> How would I do that? I really have no clue...
> (my table looks something like this:
> id_variable | year | value | id_country
> ---------------------------------------
> 1 | 2001 | 123 | 1

select min(year) from (select year from table where id_variable in (1,2)
group by year having count(distinct id_variable) = 2) x;

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA. here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)


From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
Cc: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>, pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-02 18:10:49
Message-ID: 49C9F9A7-F165-4702-96E0-FB35EAC79F03@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


On Oct 2, 2007, at 11:10 , Steve Crawford wrote:

> As others have noted, the query *can* be written. But it appears to me
> that you are struggling against your table layout.

The current schema he has is commonly called EAV (entity-attribute-
value) and is generally frowned upon. Now, in his particular case it
may be justified if the "value" column values are actually all of the
same type, such as currency amounts for each category. If this is the
case, I suggest renaming the column to be more descriptive of what is
actually stored: likewise the id_variable column.

> Before struggling
> with ever more complicated queries, I'd consider restructuring your
> table(s). There are many possibilities depending on the current nature
> of your data, how you expect it to change and the queries you
> expect to
> run against it. For example:
> country_id
> data_year
> gdp
> fish_catch

This would be one way to do it. However, each time you add a new
category you'd need to add a new column to the table: not very
flexible. You can also have the same functionality by adding a new
table for each category:

> Alternately, you could have a gdp table and a fish_catch table which
> would be easily joined to give the same result.

Expanding on this:

create table fish_catches (country text not null,
data_year date not null,
primary key (country, data_year),
fish_catch numeric not null);

create table gdp (country text not null reference countries
data_year date not null,
primary key (country, data_year),
gdp numeric not null);

This makes your queries quite simple:

select country, data_year, fish_catch, gdp
from fish_catches
natural join gdp
where country = :country
order by data_year
limit 1;

or

select country, data_year, fish_catch, gdp
from fish_catches
natural join gdp
natural join (select country, min(data_year) as data_year
from gdp
natural join fish_catch
group by country) min_data_year
where country = :country;

Splitting categories into separate tables also eliminates the
necessity of worrying about NULL, which can lead to unexpected
behavior if you aren't careful.

Michael Glaesemann
grzm seespotcode net


From: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
To: Michael Glaesemann <grzm(at)seespotcode(dot)net>
Cc: Steve Crawford <scrawford(at)pinpointresearch(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-03 06:29:35
Message-ID: 22CCDB5A-D156-4EFE-AD39-A8042A8C59EB@grid.unep.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

>> As others have noted, the query *can* be written. But it appears
>> to me
>> that you are struggling against your table layout.
>
> The current schema he has is commonly called EAV (entity-attribute-
> value) and is generally frowned upon. Now, in his particular case
> it may be justified if the "value" column values are actually all
> of the same type, such as currency amounts for each category. If
> this is the case, I suggest renaming the column to be more
> descriptive of what is actually stored: likewise the id_variable
> column.

Hmmm.... I am somewhat surprised to here so. After being told in this
forum how "bad" my old table design was, I changed it to the current
(which is less than alpha). Perhaps to summarize: Having 500
statistical global national variables for about 240 countries/
territories. Need to do regional aggregations, per Capita
calculations and some completeness computations on-the-fly.

The design was a table like this for each variable:

id_country | 1970 | 1971 | ... | 2004 |
2005
-------------------------------------------------------------------
1 | NULL | 36 | ... |
42 | 45
2 ......

The new like this:

id_variable | year | value | id_country
---------------------------------------
1 | 2001 | 123 | 1
1 | 2002 | 125 | 1
1 | 2003 | 128 | 1
1 | 2004 | 132 | 1
1 | 2005 | 135 | 1

1 | 2001 | 412 | 2
1 | 2002 | 429 | 2
1 | 2003 | 456 | 2
1 | 2004 | 465 | 2
1 | 2005 | 477 | 2

....

2 | 1980 | 83 | 1
2 | 1981 | 89 | 1
....

I thought (and did ask) about the possibility to put nevertheless -
with the new table design - the variables into different tables, but
nobody really got my on a track for that. So I thought the most
"common" way would be to have this central table.

But I am at a stage where I still can change - and would very much
like to get your advice.

Thanks a lot!

Stef

____________________________________________________________________

Stefan Schwarzer

Lean Back and Relax - Enjoy some Nature Photography:
http://photoblog.la-famille-schwarzer.de

Appetite for Global Data? UNEP GEO Data Portal:
http://geodata.grid.unep.ch
____________________________________________________________________


From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: Steve Crawford <scrawford(at)pinpointresearch(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-03 12:55:46
Message-ID: 5B919087-9F93-47B5-9717-5B424A7EFA72@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


On Oct 3, 2007, at 1:29 , Stefan Schwarzer wrote:

>>> As others have noted, the query *can* be written. But it appears
>>> to me
>>> that you are struggling against your table layout.
>>
>> The current schema he has is commonly called EAV (entity-attribute-
>> value) and is generally frowned upon. Now, in his particular case
>> it may be justified if the "value" column values are actually all
>> of the same type, such as currency amounts for each category. If
>> this is the case, I suggest renaming the column to be more
>> descriptive of what is actually stored: likewise the id_variable
>> column.
>
> Having 500 statistical global national variables for about 240
> countries/territories. Need to do regional aggregations, per Capita
> calculations and some completeness computations on-the-fly.

> id_variable | year | value | id_country

Both Steve and I have given you alternatives and reasons for choosing
alternative schema. You haven't provided any additional information
to really help us guide you in any particular direction from what we
already have. For example, in the section from me which you quoted
above, I wrote that this schema may be appropriate if the "value"
column values are actually all of the same type (e..g, all currency
amounts, all masses, all counts). You haven't said whether or not
this is the case. We can't read your mind :)

Again, one thing that would help is if you use a more descriptive
column name than "value" that gives an indication of what *kind* of
values are in the column.

> I thought (and did ask) about the possibility to put nevertheless -
> with the new table design - the variables into different tables,
> but nobody really got my on a track for that.

Steve first suggested it and I provided an example of what that would
look like (using "gdp" and "fish_catches" tables) in the same post
you quoted from above.

http://archives.postgresql.org/pgsql-general/2007-10/msg00108.php

Is this not what you mean?

Michael Glaesemann
grzm seespotcode net


From: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
To: Michael Glaesemann <grzm(at)seespotcode(dot)net>
Cc: Steve Crawford <scrawford(at)pinpointresearch(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-04 11:41:20
Message-ID: 279B4760-C124-40F5-8383-77541C244CC7@grid.unep.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

>> Having 500 statistical global national variables for about 240
>> countries/territories. Need to do regional aggregations, per
>> Capita calculations and some completeness computations on-the-fly.
>
>
>> id_variable | year | value | id_country
>
> Both Steve and I have given you alternatives and reasons for
> choosing alternative schema. You haven't provided any additional
> information to really help us guide you in any particular direction
> from what we already have. For example, in the section from me
> which you quoted above, I wrote that this schema may be appropriate
> if the "value" column values are actually all of the same type
> (e..g, all currency amounts, all masses, all counts). You haven't
> said whether or not this is the case. We can't read your mind :)
>
> Again, one thing that would help is if you use a more descriptive
> column name than "value" that gives an indication of what *kind* of
> values are in the column.
>
>> I thought (and did ask) about the possibility to put nevertheless
>> - with the new table design - the variables into different tables,
>> but nobody really got my on a track for that.
>
> Steve first suggested it and I provided an example of what that
> would look like (using "gdp" and "fish_catches" tables) in the same
> post you quoted from above.
>
> http://archives.postgresql.org/pgsql-general/2007-10/msg00108.php
>
> Is this not what you mean?

Hmm... sorry that I didn't give you more info. I will see it I can do
better now.

So, as mentioned above, we have 500 different variables (such as GDP,
Fish Catch, Population, Forest Cover, Fertility Rate etc.). Many of
them have indeed different units (Metric Tons, Thousand People, Sqkm,
Persons per Year etc.). And many do cover different time periods
(1970-2005; 2000-2002; 1970-75, 1975-80,..., 2000-05; ...).

So, "value" is something general - it can be (measured in) (thousand)
Dollars, (Million) People, (Hundred) Hectares etc...

We update the variables something like once a year or every two
years, in an ongoing manner. Updating means especially adding new
years (as they are being published by World Bank, UNSD, etc.). So, no
real "hard core" editing/adding, I'd say. A slow, but steady progress.

Queries, as mentioned too, are mainly simple "SELECt * FROM
specific_variable_like_gdp". But as well on-the-fly aggregations to
subregions and regions (such as West Africa, and Africa). And some
calculations which enables to indicate the "completeness" of the
aggregations (parameters being used are "Number of countries included
in aggregation in respect to Number of Countries, Total Population,
Land Area, GDP).

Does this help?

Thanks anyway already for all your feedback. Your really great people
out there!!

Best wishes,

Stef


From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: Steve Crawford <scrawford(at)pinpointresearch(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-04 12:57:37
Message-ID: A06A0335-C403-423E-8105-3D515CBDC4CC@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


On Oct 4, 2007, at 6:41 , Stefan Schwarzer wrote:

> So, "value" is something general - it can be (measured in)
> (thousand) Dollars, (Million) People, (Hundred) Hectares etc...

Then I would make a separate table for each variable (as I described
above). For one thing, you're going to have keep track of what units
are associated with which variables are somewhere, whether that be in
your database or possibly your middleware. Separating the variables
into separate tables provides better separation of these value types.

> Queries, as mentioned too, are mainly simple "SELECt * FROM
> specific_variable_like_gdp". But as well on-the-fly aggregations to
> subregions and regions (such as West Africa, and Africa).

If these aggregations are going to be common, you might want to set
up another table that arranges the countries into hierarchies using
nested sets or some other hierarchical strategy (like the adjacency
model or contrib/ltree). I've found nested sets are convenient for
calculating aggregations. You can google or search the archives for
these methods.

> And some calculations which enables to indicate the "completeness"
> of the aggregations (parameters being used are "Number of countries
> included in aggregation in respect to Number of Countries, Total
> Population, Land Area, GDP).

You can use nested sets to handle these calculations as well.

I'm sure there are resources out there that describe database schema
for reporting things such as you describe, as it's not a new problem.
I don't know any off the top of my head, but again, Google is your
friend.

Hope this helps!

Michael Glaesemann
grzm seespotcode net


From: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-05 09:11:14
Message-ID: 6B863DEC-202B-4064-AAEF-ACB2ED369B02@grid.unep.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

>> Alternately, you could have a gdp table and a fish_catch table which
>> would be easily joined to give the same result.
>
> Expanding on this:
>
> create table fish_catches (country text not null,
> data_year date not null,
> primary key (country, data_year),
> fish_catch numeric not null);
>
> create table gdp (country text not null reference countries
> data_year date not null,
> primary key (country, data_year),
> gdp numeric not null);
>
> This makes your queries quite simple:
>
> select country, data_year, fish_catch, gdp
> from fish_catches
> natural join gdp
> where country = :country
> order by data_year
> limit 1;

Hmmm..... Don't really get that query working. My SQL looks like this
now:

SELECT
id_country,
year,
value
FROM
internet_users
NATURAL JOIN
gdp
WHERE
id_country = 8
ORDER BY
year
LIMIT
1

But there is no result.

My table looks like this (for each variable one table):

id_country year value
4 1980 6.6
4 1981 7.0
...
6 1980 5.1

Thanks for any advice!

Stef


From: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-05 12:14:11
Message-ID: CE8F0CD7-611C-43D4-90D0-BA2D68EF4C5B@grid.unep.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

> Hmmm..... Don't really get that query working. My SQL looks like
> this now:
>
> SELECT
> id_country,
> year,
> value
> FROM
> internet_users
> NATURAL JOIN
> gdp
> WHERE
> id_country = 8
> ORDER BY
> year
> LIMIT
> 1

Ok, got it working with another proposed SQL SELECT. This is a
solution (comes out of an PHP loop, so one could easily have more
than 2 subselects). Thanks for the help to everyone!!

SELECT
year0 AS common_year,
v0.val0 AS v0_value,
v1.val1 AS v1_value
FROM
(
SELECT
year AS year0,
value AS val0
FROM
gdp_capita
WHERE
id_country = 672 AND
value IS NOT NULL
) AS v0,
(
SELECT
year AS year1,
value AS val1
FROM
agri_add_gdp
WHERE
id_country = 672 AND
value IS NOT NULL
) AS v1
WHERE
v1.year1 = v0.year0 AND
v1.year1 = v1.year1
ORDER BY
v0.year0
LIMIT
1


From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Stefan Schwarzer <stefan(dot)schwarzer(at)grid(dot)unep(dot)ch>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Find min year and min value
Date: 2007-10-05 12:40:17
Message-ID: 70526839-565E-47CF-98F4-D05D02C96312@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general


On Oct 5, 2007, at 4:11 , Stefan Schwarzer wrote:

>>> Alternately, you could have a gdp table and a fish_catch table which
>>> would be easily joined to give the same result.
>>
>> Expanding on this:
>>
>> create table fish_catches (country text not null,
>> data_year date not null,
>> primary key (country, data_year),
>> fish_catch numeric not null);
>>
>> create table gdp (country text not null reference countries
>> data_year date not null,
>> primary key (country, data_year),
>> gdp numeric not null);
>>
>> This makes your queries quite simple:
>>
>> select country, data_year, fish_catch, gdp
>> from fish_catches
>> natural join gdp
>> where country = :country
>> order by data_year
>> limit 1;
>
> Hmmm..... Don't really get that query working. My SQL looks like
> this now:
>
> SELECT
> id_country,
> year,
> value
> FROM
> internet_users
> NATURAL JOIN
> gdp
> WHERE
> id_country = 8
> ORDER BY
> year
> LIMIT
> 1
>
> But there is no result.
>
> My table looks like this (for each variable one table):
>
> id_country year value

The natural join operator joins on common columns: if columns are
named "value" in both tables, the join condition is (id_country,
year, value) = (id_country, year, value). In the example I provided
above, the tables were fish_catches {country, year, fish_catch} and
gdp {country, year, gdp}: the join condition is (country, year) =
(country, year).

Also, note that there are *four* output columns in the query I used:
{country, data_year, fish_catch, gdp}. You've only got three, which
is bound to be confusing.

In your case you can use subqueries to rename the columns or an
explicit join:

-- using subqueries
select id_country, year, internet_users, gdp
from (select id_country, year, value as internet users
from internet_users) i
natural join (select id_country, year, value as gdp
from gdp) g
where id_country = 8
order by year
limit 1

-- using an explicit join

select id_country, year, internet_users.value as internet_users,
gdp.value as gdp
from internet_users
join gdp using (id_country, year)
where id_country = 8
order by year
limit 1

Michael Glaesemann
grzm seespotcode net