Query producing the wrong results?

Lists: pgsql-bugs
From: Nicholas Howell <nhowell(at)intellect(dot)co(dot)uk>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Query producing the wrong results?
Date: 2004-04-27 14:34:38
Message-ID: 1083076479.3585.36.camel@corba
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs

ebatcher=> select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 7.2.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2
20020903 (Red Hat Linux 8.0 3.2-7)
(1 row)

Just create a table with any int column and put in a bit of data:

ebatcher=> create table test (id int);
CREATE
ebatcher=> insert into test values ( 0 );
INSERT 43522 1
ebatcher=> insert into test values ( 1 );
INSERT 43523 1
ebatcher=> insert into test values ( 2 );
INSERT 43524 1
ebatcher=> insert into test values ( 3 );
INSERT 43525 1
ebatcher=> insert into test values ( 4 );
INSERT 43526 1
ebatcher=> select * from test;
id
----
0
1
2
3
4
(5 rows)

Ok so far so good, now when you run this query you get this result:

ebatcher=> select * from test where id = ((select min(id) from test) +
round(random() * 4));
id
----
0
3
4
(3 rows)

what I would expect is to get a single row returned not 3 rows.

ebatcher=> select * from test where id = ((select min(id) from test));
id
----
0
(1 row)

as expected the min is 0

ebatcher=> select round(random() * 4);
round
-------
1
(1 row)

ebatcher=> select round(random() * 4);
round
-------
3
(1 row)

as expected this always returns a random number between 0 and 4

put these together and I would expect to get a random single row not
multiple rows. 5 more runs of the query yield these results:

ebatcher=> select * from test where id = ((select min(id) from test) +
round(random() * 4));
id
----
0
3
(2 rows)

ebatcher=> select * from test where id = ((select min(id) from test) +
round(random() * 4));
id
----
1
3
(2 rows)

ebatcher=> select * from test where id = ((select min(id) from test) +
round(random() * 4));
id
----
2
4
(2 rows)

ebatcher=> select * from test where id = ((select min(id) from test) +
round(random() * 4));
id
----
(0 rows)

ebatcher=> select * from test where id = ((select min(id) from test) +
round(random() * 4));
id
----
1
(1 row)

Just tried something further and:

ebatcher=> select * from test where id = (0 + round(random() * 4));
id
----
0
(1 row)

ebatcher=> select * from test where id = (0 + round(random() * 4));
id
----
3
4
(2 rows)

and even:

ebatcher=> select * from test where id = round(random() * 4);
id
----
0
2
(2 rows)

Again I would expect to get just a single row. Is this a bug?


From: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
To: Nicholas Howell <nhowell(at)intellect(dot)co(dot)uk>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Query producing the wrong results?
Date: 2004-05-03 20:45:28
Message-ID: 20040503134310.K16514@megazone.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs


On Tue, 27 Apr 2004, Nicholas Howell wrote:

> ebatcher=> select version();
> version
> ---------------------------------------------------------------------------------------------------------
> PostgreSQL 7.2.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2
> 20020903 (Red Hat Linux 8.0 3.2-7)
> (1 row)
>
> Just create a table with any int column and put in a bit of data:
>
> ebatcher=> create table test (id int);
> CREATE
> ebatcher=> insert into test values ( 0 );
> INSERT 43522 1
> ebatcher=> insert into test values ( 1 );
> INSERT 43523 1
> ebatcher=> insert into test values ( 2 );
> INSERT 43524 1
> ebatcher=> insert into test values ( 3 );
> INSERT 43525 1
> ebatcher=> insert into test values ( 4 );
> INSERT 43526 1
> ebatcher=> select * from test;
> id
> ----
> 0
> 1
> 2
> 3
> 4
> (5 rows)
>
> Ok so far so good, now when you run this query you get this result:
>
> ebatcher=> select * from test where id = ((select min(id) from test) +
> round(random() * 4));
> id
> ----
> 0
> 3
> 4
> (3 rows)
>
> what I would expect is to get a single row returned not 3 rows.

The expression is evaluated for each row. That can be optimized out in
the cases that the expression is believed to be a stable (or immutable)
value, but random() is not which means you get different random numbers
for each row.

You can get the effect of single evaluation by hiding the random() call
inside a stable function.