Re: [HACKERS] 6.5.0 - Overflow bug in AVG( )

Lists: pgsql-hackers
From: "Gene Sokolov" <hook(at)aktrad(dot)ru>
To: <pgsql-hackers(at)postgreSQL(dot)org>
Subject: 6.5.0 - Overflow bug in AVG( )
Date: 1999-06-16 06:41:39
Message-ID: 18a101beb7c3$4a0aa570$0d8cdac3@aktrad.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

% psql test1
Welcome to the POSTGRESQL interactive sql monitor:
Please read the file COPYRIGHT for copyright terms of POSTGRESQL
[PostgreSQL 6.5.0 on i386-unknown-freebsd3.2, compiled by gcc 2.7.2.1]

type \? for help on slash commands
type \q to quit
type \g or terminate with semicolon to execute query
You are currently connected to the database: test1

test1=> select count(*), max("ID"), min("ID"), avg("ID") from "ItemsBars";
count| max| min| avg
------+-------+-----+----
677719|3075717|61854|-251
(1 row)

Overflow, perhaps?

Gene Sokolov.


From: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
To: Gene Sokolov <hook(at)aktrad(dot)ru>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] 6.5.0 - Overflow bug in AVG( )
Date: 1999-06-16 13:03:46
Message-ID: 3767A0B2.58C1DE09@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> [PostgreSQL 6.5.0 on i386-unknown-freebsd3.2, compiled by gcc 2.7.2.1]
> test1=> select count(*), max("ID"), min("ID"), avg("ID") from "ItemsBars";
> count| max| min| avg
> ------+-------+-----+----
> 677719|3075717|61854|-251
> (1 row)
> Overflow, perhaps?

Of course. These are integer fields? I've been considering changing
all accumulators (and results) for integer aggregate functions to
float8, but have not done so yet. I was sort of waiting for a v7.0
release, but am not sure why...

- Thomas

--
Thomas Lockhart lockhart(at)alumni(dot)caltech(dot)edu
South Pasadena, California


From: "Gene Sokolov" <hook(at)aktrad(dot)ru>
To: "Thomas Lockhart" <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] 6.5.0 - Overflow bug in AVG( )
Date: 1999-06-16 13:27:00
Message-ID: 1ad801beb7fb$ea60a870$0d8cdac3@aktrad.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > [PostgreSQL 6.5.0 on i386-unknown-freebsd3.2, compiled by gcc 2.7.2.1]
> > test1=> select count(*), max("ID"), min("ID"), avg("ID") from
"ItemsBars";
> > count| max| min| avg
> > ------+-------+-----+----
> > 677719|3075717|61854|-251
> > (1 row)
> > Overflow, perhaps?
>
> Of course. These are integer fields? I've been considering changing

Yes, the fields are int4

> all accumulators (and results) for integer aggregate functions to
> float8, but have not done so yet. I was sort of waiting for a v7.0
> release, but am not sure why...

Float8 accumulator seems to be a good solution if AVG is limited to
int/float types. I wonder if it could produce system dependency in AVG due
to rounding errors. Some broader solution should be considered though if you
want AVG to work on numeric/decimal as well.

Gene Sokolov.


From: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
To: Gene Sokolov <hook(at)aktrad(dot)ru>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] 6.5.0 - Overflow bug in AVG( )
Date: 1999-06-16 14:03:38
Message-ID: 3767AEBA.E7C006B9@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> Float8 accumulator seems to be a good solution if AVG is limited to
> int/float types. I wonder if it could produce system dependency in AVG due
> to rounding errors. Some broader solution should be considered though if you
> want AVG to work on numeric/decimal as well.

The implementation can be specified for each datatype individually, so
that's not a problem. afaik the way numeric/decimal work it would be
fine to use those types as their own accumulators. It's mostly the
int2/int4/int8 types which are the problem, since they silently
overflow (on most machines?).

- Thomas

--
Thomas Lockhart lockhart(at)alumni(dot)caltech(dot)edu
South Pasadena, California


From: wieck(at)debis(dot)com (Jan Wieck)
To: lockhart(at)alumni(dot)caltech(dot)edu (Thomas Lockhart)
Cc: hook(at)aktrad(dot)ru, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] 6.5.0 - Overflow bug in AVG( )
Date: 1999-06-16 14:06:58
Message-ID: m10uGLC-0003krC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thomas Lockhart wrote:

>
> > [PostgreSQL 6.5.0 on i386-unknown-freebsd3.2, compiled by gcc 2.7.2.1]
> > test1=> select count(*), max("ID"), min("ID"), avg("ID") from "ItemsBars";
> > count| max| min| avg
> > ------+-------+-----+----
> > 677719|3075717|61854|-251
> > (1 row)
> > Overflow, perhaps?
>
> Of course. These are integer fields? I've been considering changing
> all accumulators (and results) for integer aggregate functions to
> float8, but have not done so yet. I was sort of waiting for a v7.0
> release, but am not sure why...

Wouldn't it be better to use NUMERIC for the avg(int) state
values? It will never loose any significant digit.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #


From: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
To: Jan Wieck <jwieck(at)debis(dot)com>
Cc: hook(at)aktrad(dot)ru, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] 6.5.0 - Overflow bug in AVG( )
Date: 1999-06-16 14:20:58
Message-ID: 3767B2CA.9E69AB2@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> > Of course. These are integer fields? I've been considering changing
> > all accumulators (and results) for integer aggregate functions to
> > float8, but have not done so yet. I was sort of waiting for a v7.0
> > release, but am not sure why...
>
> Wouldn't it be better to use NUMERIC for the avg(int) state
> values? It will never loose any significant digit.

Sure. It would be fast, right? avg(int) is likely to be used a lot,
and should be as fast as possible.

- Thomas

--
Thomas Lockhart lockhart(at)alumni(dot)caltech(dot)edu
South Pasadena, California


From: wieck(at)debis(dot)com (Jan Wieck)
To: lockhart(at)alumni(dot)caltech(dot)edu (Thomas Lockhart)
Cc: jwieck(at)debis(dot)com, hook(at)aktrad(dot)ru, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] 6.5.0 - Overflow bug in AVG( )
Date: 1999-06-16 23:20:23
Message-ID: m10uOyl-0003kLC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thomas Lockhart wrote:

>
> > > Of course. These are integer fields? I've been considering changing
> > > all accumulators (and results) for integer aggregate functions to
> > > float8, but have not done so yet. I was sort of waiting for a v7.0
> > > release, but am not sure why...
> >
> > Wouldn't it be better to use NUMERIC for the avg(int) state
> > values? It will never loose any significant digit.
>
> Sure. It would be fast, right? avg(int) is likely to be used a lot,
> and should be as fast as possible.

I think it would be fast enough, even if I have things in
mind how to speed it up. But that would result in a total
rewrite of NUMERIC from scratch.

The only math function of NUMERIC which is time critical for
AVG() is ADD. And even for int8 the number of digits it has
to perform is relatively small. I expect the time spent on
that is negligible compared to the heap scanning required to
get all the values.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #