Re: Using 128-bit integers for sum, avg and statistics aggregates

Lists: pgsql-hackers
From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-25 14:38:41
Message-ID: 544BB5F1.50709@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

There was recently talk about if we should start using 128-bit integers
(where available) to speed up the aggregate functions over integers
which uses numeric for their internal state. So I hacked together a
patch for this to see what the performance gain would be.

Previous thread:
http://www.postgresql.org/message-id/20141017182500.GF2075@alap3.anarazel.de

What the patch does is switching from using numerics in the aggregate
state to int128 and then convert the type from the 128-bit integer in
the final function.

The functions where we can make use of int128 states are:

- sum(int8)
- avg(int8)
- var_*(int2)
- var_*(int4)
- stdev_*(int2)
- stdev_*(int4)

The initial benchmark results look very promising. When summing 10
million int8 I get a speedup of ~2.5x and similarly for var_samp() on 10
million int4 I see a speed up of ~3.7x. To me this indicates that it is
worth the extra code. What do you say? Is this worth implementing?

The current patch still requires work. I have not written the detection
of int128 support yet, and the patch needs code cleanup (for example: I
used an int16_ prefix on the added functions, suggestions for better
names are welcome). I also need to decide on what estimate to use for
the size of that state.

The patch should work and pass make check on platforms where __int128_t
is supported.

The simple benchmarks:

CREATE TABLE test_int8 AS SELECT x::int8 FROM generate_series(1,
10000000) x;

Before:

# SELECT sum(x) FROM test_int8;
sum
----------------
50000005000000
(1 row)

Time: 2521.217 ms

After:

# SELECT sum(x) FROM test_int8;
sum
----------------
50000005000000
(1 row)

Time: 1022.811 ms

CREATE TABLE test_int4 AS SELECT x::int4 FROM generate_series(1,
10000000) x;

Before:

# SELECT var_samp(x) FROM test_int4;
var_samp
--------------------
8333334166666.6667
(1 row)

Time: 3808.546 ms

After:

# SELECT var_samp(x) FROM test_int4;
var_samp
--------------------
8333334166666.6667
(1 row)

Time: 1033.243 ms

Andreas

Attachment Content-Type Size
int128-agg-v1.patch text/x-patch 29.3 KB

From: Arthur Silva <arthurprs(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 13:05:11
Message-ID: CAO_YK0X2FYVwfJ7Cv+xwk2CTH36dnyxu2VHbEEEuBg0UnxCwGA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Oct 25, 2014 at 12:38 PM, Andreas Karlsson <andreas(at)proxel(dot)se>
wrote:

> Hi,
>
> There was recently talk about if we should start using 128-bit integers
> (where available) to speed up the aggregate functions over integers which
> uses numeric for their internal state. So I hacked together a patch for
> this to see what the performance gain would be.
>
> Previous thread: http://www.postgresql.org/message-id/20141017182500.
> GF2075(at)alap3(dot)anarazel(dot)de
>
> What the patch does is switching from using numerics in the aggregate
> state to int128 and then convert the type from the 128-bit integer in the
> final function.
>
> The functions where we can make use of int128 states are:
>
> - sum(int8)
> - avg(int8)
> - var_*(int2)
> - var_*(int4)
> - stdev_*(int2)
> - stdev_*(int4)
>
> The initial benchmark results look very promising. When summing 10 million
> int8 I get a speedup of ~2.5x and similarly for var_samp() on 10 million
> int4 I see a speed up of ~3.7x. To me this indicates that it is worth the
> extra code. What do you say? Is this worth implementing?
>
> The current patch still requires work. I have not written the detection of
> int128 support yet, and the patch needs code cleanup (for example: I used
> an int16_ prefix on the added functions, suggestions for better names are
> welcome). I also need to decide on what estimate to use for the size of
> that state.
>
> The patch should work and pass make check on platforms where __int128_t is
> supported.
>
> The simple benchmarks:
>
> CREATE TABLE test_int8 AS SELECT x::int8 FROM generate_series(1, 10000000)
> x;
>
> Before:
>
> # SELECT sum(x) FROM test_int8;
> sum
> ----------------
> 50000005000000
> (1 row)
>
> Time: 2521.217 ms
>
> After:
>
> # SELECT sum(x) FROM test_int8;
> sum
> ----------------
> 50000005000000
> (1 row)
>
> Time: 1022.811 ms
>
> CREATE TABLE test_int4 AS SELECT x::int4 FROM generate_series(1, 10000000)
> x;
>
> Before:
>
> # SELECT var_samp(x) FROM test_int4;
> var_samp
> --------------------
> 8333334166666.6667
> (1 row)
>
> Time: 3808.546 ms
>
> After:
>
> # SELECT var_samp(x) FROM test_int4;
> var_samp
> --------------------
> 8333334166666.6667
> (1 row)
>
> Time: 1033.243 ms
>
> Andreas
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
These are some nice improvements.

As far as I'm aware int128 types are supported on every major compiler when
compiling for 64bit platforms. Right?


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 13:21:38
Message-ID: CAHyXU0xbDcX=OsddFxjkgTpwNRYUX0nx8P-fd_sXpJ4Hqhve5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Oct 25, 2014 at 9:38 AM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> Hi,
>
> There was recently talk about if we should start using 128-bit integers
> (where available) to speed up the aggregate functions over integers which
> uses numeric for their internal state. So I hacked together a patch for this
> to see what the performance gain would be.
>
> Previous thread:
> http://www.postgresql.org/message-id/20141017182500.GF2075@alap3.anarazel.de
>
> What the patch does is switching from using numerics in the aggregate state
> to int128 and then convert the type from the 128-bit integer in the final
> function.
>
> The functions where we can make use of int128 states are:
>
> - sum(int8)
> - avg(int8)
> - var_*(int2)
> - var_*(int4)
> - stdev_*(int2)
> - stdev_*(int4)
>
> The initial benchmark results look very promising. When summing 10 million
> int8 I get a speedup of ~2.5x and similarly for var_samp() on 10 million
> int4 I see a speed up of ~3.7x. To me this indicates that it is worth the
> extra code. What do you say? Is this worth implementing?

yes.

merlin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Arthur Silva <arthurprs(at)gmail(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 13:24:57
Message-ID: 20141028132457.GJ2639@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-10-28 11:05:11 -0200, Arthur Silva wrote:
> On Sat, Oct 25, 2014 at 12:38 PM, Andreas Karlsson <andreas(at)proxel(dot)se>
> As far as I'm aware int128 types are supported on every major compiler when
> compiling for 64bit platforms. Right?

Depends on what you call major. IIRC some not that old msvc versions
don't for example. Also, there's a couple 32 platforms with int128 bit
support. So I think we should just add a configure test defining the
type + a feature macro.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Arthur Silva <arthurprs(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 13:31:14
Message-ID: 544F9AA2.6050706@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/28/2014 02:05 PM, Arthur Silva wrote:
> As far as I'm aware int128 types are supported on every major compiler
> when compiling for 64bit platforms. Right?

Both gcc and clang support __int128_t, but I do not know about other
compilers like icc and MSVC.

Andreas


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Arthur Silva <arthurprs(at)gmail(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 13:54:30
Message-ID: 544FA016.9040304@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/28/2014 03:24 PM, Andres Freund wrote:
> On 2014-10-28 11:05:11 -0200, Arthur Silva wrote:
>> On Sat, Oct 25, 2014 at 12:38 PM, Andreas Karlsson <andreas(at)proxel(dot)se>
>> As far as I'm aware int128 types are supported on every major compiler when
>> compiling for 64bit platforms. Right?
>
> Depends on what you call major. IIRC some not that old msvc versions
> don't for example. Also, there's a couple 32 platforms with int128 bit
> support. So I think we should just add a configure test defining the
> type + a feature macro.

It wouldn't be too hard to just do:

struct {
int64 high_bits;
uint64 low_bits;
} pg_int128;

and some macros for the + - etc. operators. It might be less work than
trying to deal with the portability issues of a native C datatype for this.

- Heikki


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Arthur Silva <arthurprs(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 13:56:05
Message-ID: 20141028135605.GN2639@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-10-28 15:54:30 +0200, Heikki Linnakangas wrote:
> On 10/28/2014 03:24 PM, Andres Freund wrote:
> >On 2014-10-28 11:05:11 -0200, Arthur Silva wrote:
> >>On Sat, Oct 25, 2014 at 12:38 PM, Andreas Karlsson <andreas(at)proxel(dot)se>
> >>As far as I'm aware int128 types are supported on every major compiler when
> >>compiling for 64bit platforms. Right?
> >
> >Depends on what you call major. IIRC some not that old msvc versions
> >don't for example. Also, there's a couple 32 platforms with int128 bit
> >support. So I think we should just add a configure test defining the
> >type + a feature macro.
>
> It wouldn't be too hard to just do:
>
> struct {
> int64 high_bits;
> uint64 low_bits;
> } pg_int128;
>
> and some macros for the + - etc. operators. It might be less work than
> trying to deal with the portability issues of a native C datatype for this.

And noticeably slower. At least x86-64 does all of this in hardware...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Arthur Silva <arthurprs(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 14:06:07
Message-ID: 21134.1414505167@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> It wouldn't be too hard to just do:

> struct {
> int64 high_bits;
> uint64 low_bits;
> } pg_int128;

> and some macros for the + - etc. operators. It might be less work than
> trying to deal with the portability issues of a native C datatype for this.

-1. That's not that easy, especially for division, or if you want to
worry about overflow. The point of this patch IMO is to get some low
hanging fruit; coding our own int128 arithmetic doesn't sound like
"low hanging" to me.

Also, we've already got the configure infrastructure for detecting
whether a platform has working int64. It really shouldn't be much
work to transpose that to int128 (especially if we don't care about
printf support, which I think we don't).

regards, tom lane


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Arthur Silva <arthurprs(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 14:40:06
Message-ID: 544FAAC6.2090602@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/28/2014 04:06 PM, Tom Lane wrote:
> Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
>> It wouldn't be too hard to just do:
>
>> struct {
>> int64 high_bits;
>> uint64 low_bits;
>> } pg_int128;
>
>> and some macros for the + - etc. operators. It might be less work than
>> trying to deal with the portability issues of a native C datatype for this.
>
> -1. That's not that easy, especially for division, or if you want to
> worry about overflow.

The patch doesn't do division with the 128-bit integers. It only does
addition and multiplication. Those are pretty straightforward to implement.

> The point of this patch IMO is to get some low
> hanging fruit; coding our own int128 arithmetic doesn't sound like
> "low hanging" to me.

I wasn't thinking of writing a full-fledged 128-bit type, just the the
few operations needed for this patch.

> Also, we've already got the configure infrastructure for detecting
> whether a platform has working int64. It really shouldn't be much
> work to transpose that to int128 (especially if we don't care about
> printf support, which I think we don't).

It would be nicer to be able to use the same code on all platforms. With
a configure test, we'd still need a fallback implementation for
platforms that don't have it.

- Heikki


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Arthur Silva <arthurprs(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 14:47:35
Message-ID: 544FAC87.2040705@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/28/2014 03:40 PM, Heikki Linnakangas wrote:
> The patch doesn't do division with the 128-bit integers. It only does
> addition and multiplication. Those are pretty straightforward to implement.

The patch uses division when converting from __int128_t to Numeric.

- Andreas


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Arthur Silva <arthurprs(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 15:01:13
Message-ID: 544FAFB9.5020106@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/28/2014 04:47 PM, Andreas Karlsson wrote:
> On 10/28/2014 03:40 PM, Heikki Linnakangas wrote:
>> The patch doesn't do division with the 128-bit integers. It only does
>> addition and multiplication. Those are pretty straightforward to implement.
>
> The patch uses division when converting from __int128_t to Numeric.

Oh, I see. Hmph, looks like I'm losing an argument..

Moving on to other issues, isn't 128 bits too small to store the squares
of the processed numbers? That could overflow..

- Heikki


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Arthur Silva <arthurprs(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-10-28 15:04:34
Message-ID: 544FB082.5090102@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/28/2014 04:01 PM, Heikki Linnakangas wrote:
> Moving on to other issues, isn't 128 bits too small to store the squares
> of the processed numbers? That could overflow..

Yeah, which is why stddev_*(int8) and var_*(int8) still have to use
Numeric in the aggregate state. For the int2 and int4 versions it is
fine to use __int128_t.

Andreas


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-11-13 01:03:20
Message-ID: 54640358.8030304@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Here is version 2 of the patch which detects the presence of gcc/clang
style 128-bit integers and has been cleaned up to a reviewable state. I
have not added support for any other compilers since I found no
documentation 128-bit support with icc or MSVC. I do not have access to
any Windows machines either.

A couple of things I was not sure about was the naming of the new
functions and if I should ifdef the size of the aggregate state in the
catalog or not.

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v2.patch text/x-patch 0 bytes

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-11-13 01:32:28
Message-ID: 54640A2C.6090702@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/13/2014 02:03 AM, Andreas Karlsson wrote:
> Here is version 2 of the patch which detects the presence of gcc/clang
> style 128-bit integers and has been cleaned up to a reviewable state. I
> have not added support for any other compilers since I found no
> documentation 128-bit support with icc or MSVC. I do not have access to
> any Windows machines either.
>
> A couple of things I was not sure about was the naming of the new
> functions and if I should ifdef the size of the aggregate state in the
> catalog or not.

The correct file is attached in the message.

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v2.patch text/x-patch 33.8 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-11-13 02:38:38
Message-ID: 20141113023838.GC1791@alvin.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andreas Karlsson wrote:
> On 11/13/2014 02:03 AM, Andreas Karlsson wrote:
> >Here is version 2 of the patch which detects the presence of gcc/clang
> >style 128-bit integers and has been cleaned up to a reviewable state. I
> >have not added support for any other compilers since I found no
> >documentation 128-bit support with icc or MSVC. I do not have access to
> >any Windows machines either.
> >
> >A couple of things I was not sure about was the naming of the new
> >functions and if I should ifdef the size of the aggregate state in the
> >catalog or not.

configure is a generated file. If your patch touches it but not
configure.in, there is a problem.

> diff --git a/configure b/configure

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-11-14 00:57:16
Message-ID: 5465536C.5020401@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/13/2014 03:38 AM, Alvaro Herrera wrote:
> configure is a generated file. If your patch touches it but not
> configure.in, there is a problem.

Thanks for pointing it out, I have now fixed it.

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v3.patch text/x-patch 34.8 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-11-14 01:25:21
Message-ID: 54655A01.8030204@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/13/14 7:57 PM, Andreas Karlsson wrote:
> On 11/13/2014 03:38 AM, Alvaro Herrera wrote:
>> configure is a generated file. If your patch touches it but not
>> configure.in, there is a problem.
>
> Thanks for pointing it out, I have now fixed it.

There is something odd about your patch. I claims that all files are
new files, e.g.:

diff --git a/src/backend/utils/adt/numeric.c
b/src/backend/utils/adt/numeric.c
new file mode 100644
index d61af92..98183b4
*** a/src/backend/utils/adt/numeric.c
--- b/src/backend/utils/adt/numeric.c


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-11-14 01:41:44
Message-ID: 54655DD8.5090005@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/14/2014 02:25 AM, Peter Eisentraut wrote:
> There is something odd about your patch. I claims that all files are
> new files, e.g.:
>
> diff --git a/src/backend/utils/adt/numeric.c
> b/src/backend/utils/adt/numeric.c
> new file mode 100644
> index d61af92..98183b4
> *** a/src/backend/utils/adt/numeric.c
> --- b/src/backend/utils/adt/numeric.c

Those lines look fine to me.

It does not say I have added a new file, it only claims I have changed
the mode of the file. And that is an artifact from using the script
src/tools/git-external-diff which always outputs a line with "new file
mode".

--
Andreas Karlsson


From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-16 10:04:13
Message-ID: CAApHDvqtQxah3SAs6OeajYXYyMMkzDO5h6z6yDUM7=fekEw7aQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 14 November 2014 at 13:57, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>
> On 11/13/2014 03:38 AM, Alvaro Herrera wrote:
>
>> configure is a generated file. If your patch touches it but not
>> configure.in, there is a problem.
>>
>
> Thanks for pointing it out, I have now fixed it.
>
>
>
Hi Andreas,

These are some very promising performance increases.

I've done a quick pass of reading the patch. I currently don't have a
system with a 128bit int type, but I'm working on that.

Just a couple of things that could do with being fixed:

This fragment needs fixed to put braces on new lines
if (state) {
numstate.N = state->N;
int16_to_numericvar(state->sumX, &numstate.sumX);
int16_to_numericvar(state->sumX2, &numstate.sumX2);
} else {
numstate.N = 0;
}

It also looks like your OIDs have been nabbed by some jsonb stuff.

DETAIL: Key (oid)=(3267) is duplicated.

I'm also wondering why in numeric_int16_sum() you're doing:

#else
return numeric_sum(fcinfo);
#endif

but you're not doing return int8_accum() in the #else part
of int8_avg_accum()
The same goes for int8_accum_inv() and int8_avg_accum_inv(), though perhaps
you're doing it here because of the elog() showing the wrong function name.
Although that's a pretty much "shouldn't ever happen" case that mightn't be
worth worrying about.

Also since I don't currently have a machine with a working int128, I
decided to benchmark master vs patched to see if there was any sort of
performance regression due to numeric_int16_sum calling numeric_sum, but
I'm a bit confused with the performance results as it seems there's quite a
good increase in performance with the patch, I'd have expected there to be
no change.

CREATE TABLE t (value bigint not null);
insert into t select a.a from generate_series(1,5000000) a(a);
vacuum;

int128_bench.sql has select sum(value) from t;

Master:
D:\Postgres\installb\bin>pgbench.exe -f d:\int128_bench.sql -n -T 120
postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 92
latency average: 1304.348 ms
tps = 0.762531 (including connections establishing)
tps = 0.762642 (excluding connections establishing)

Patched:
D:\Postgres\install\bin>pgbench.exe -f d:\int128_bench.sql -n -T 120
postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 99
latency average: 1212.121 ms
tps = 0.818067 (including connections establishing)
tps = 0.818199 (excluding connections establishing)

Postgresql.conf is the same in both instances.
I've yet to discover why this is any faster.

Regards

David Rowley


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-22 01:20:03
Message-ID: CAB7nPqTsDT2A3odxiwizEBPkQeAmWj=NFph2z6Mevh6jM=bmRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 16, 2014 at 7:04 PM, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> It also looks like your OIDs have been nabbed by some jsonb stuff.
> DETAIL: Key (oid)=(3267) is duplicated.
Use src/include/catalog/unused_oids to track the OIDs not yet used in
the catalogs when adding new objects for a feature.
--
Michael


From: Oskari Saarenmaa <os(at)ohmu(dot)fi>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-22 22:47:48
Message-ID: 20141222224747.GC17054@saarenmaa.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Nov 14, 2014 at 01:57:16AM +0100, Andreas Karlsson wrote:
> *** a/configure.in
> --- b/configure.in
> *************** AC_DEFINE_UNQUOTED(MAXIMUM_ALIGNOF, $MAX
> *** 1751,1756 ****
> --- 1751,1759 ----
> AC_CHECK_TYPES([int8, uint8, int64, uint64], [], [],
> [#include <stdio.h>])
>
> + # Check if platform support gcc style 128-bit integers.
> + AC_CHECK_TYPES([__int128_t, __uint128_t], [], [], [#include <stdint.h>])
> +
> # We also check for sig_atomic_t, which *should* be defined per ANSI
> # C, but is missing on some old platforms.
> AC_CHECK_TYPES(sig_atomic_t, [], [], [#include <signal.h>])

__int128_t and __uint128_t are GCC extensions and are not related to
stdint.h.

> *** a/src/include/pg_config.h.in
> --- b/src/include/pg_config.h.in
> ***************
> *** 198,203 ****
> --- 198,209 ----
> /* Define to 1 if you have __sync_compare_and_swap(int64 *, int64, int64). */
> #undef HAVE_GCC__SYNC_INT64_CAS
>
> + /* Define to 1 if you have __int128_t */
> + #undef HAVE___INT128_T
> +
> + /* Define to 1 if you have __uint128_t */
> + #undef HAVE___UINT128_T
> +
> /* Define to 1 if you have the `getaddrinfo' function. */
> #undef HAVE_GETADDRINFO

These changes don't match what my autoconf does. Not a big deal I guess,
but if this is merged as-is the next time someone runs autoreconf it'll
write these lines differently to a different location of the file and the
change will end up as a part of an unrelated commit.

> *** a/src/backend/utils/adt/numeric.c
> --- b/src/backend/utils/adt/numeric.c
> *************** static void apply_typmod(NumericVar *var
> *** 402,407 ****
> --- 402,410 ----
> static int32 numericvar_to_int4(NumericVar *var);
> static bool numericvar_to_int8(NumericVar *var, int64 *result);
> static void int8_to_numericvar(int64 val, NumericVar *var);
> + #ifdef HAVE_INT128
> + static void int16_to_numericvar(int128 val, NumericVar *var);
> + #endif

Existing code uses int4 and int8 to refer to 32 and 64 bit integers as
they're also PG datatypes, but int16 isn't one and it looks a lot like
int16_t. I think it would make sense to just call it int128 internally
everywhere instead of int16 which isn't used anywhere else to refer to 128
bit integers.

> new file mode 100755

I guess src/tools/git-external-diff generated these bogus "new file mode"
lines? I know the project policy says that context diffs should be used,
but it seems to me that most people just use unified diffs these days so I'd
just use "git show" or "git format-patch -1" to generate the patches. The
ones generated by "git format-patch" can be easily applied by reviewers
using "git am".

/ Oskari


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Oskari Saarenmaa <os(at)ohmu(dot)fi>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-23 23:41:13
Message-ID: 5499FD99.5080109@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/22/2014 11:47 PM, Oskari Saarenmaa wrote:
> __int128_t and __uint128_t are GCC extensions and are not related to
> stdint.h.
>
>> [...]
>
> These changes don't match what my autoconf does. Not a big deal I guess,
> but if this is merged as-is the next time someone runs autoreconf it'll
> write these lines differently to a different location of the file and the
> change will end up as a part of an unrelated commit.

Thanks for the feedback. These two issues will be fixed in the next version.

>> *** a/src/backend/utils/adt/numeric.c
>> --- b/src/backend/utils/adt/numeric.c
>> *************** static void apply_typmod(NumericVar *var
>> *** 402,407 ****
>> --- 402,410 ----
>> static int32 numericvar_to_int4(NumericVar *var);
>> static bool numericvar_to_int8(NumericVar *var, int64 *result);
>> static void int8_to_numericvar(int64 val, NumericVar *var);
>> + #ifdef HAVE_INT128
>> + static void int16_to_numericvar(int128 val, NumericVar *var);
>> + #endif
>
> Existing code uses int4 and int8 to refer to 32 and 64 bit integers as
> they're also PG datatypes, but int16 isn't one and it looks a lot like
> int16_t. I think it would make sense to just call it int128 internally
> everywhere instead of int16 which isn't used anywhere else to refer to 128
> bit integers.

Perhaps. I switched opinion on this several times while coding. On one
side there is consistency, on the other there is the risk of confusing
the different meanings of int16. I am still not sure which of these I
think is the least bad.

>> new file mode 100755
>
> I guess src/tools/git-external-diff generated these bogus "new file mode"
> lines? I know the project policy says that context diffs should be used,
> but it seems to me that most people just use unified diffs these days so I'd
> just use "git show" or "git format-patch -1" to generate the patches. The
> ones generated by "git format-patch" can be easily applied by reviewers
> using "git am".

At the time of submitting my patch I had not noticed the slow change
from git-external-diff to regular git diffs. The change snuck up on me.
The new version of the patch will be submitted in the standard git
format which is what I am more used to work with.

--
Andreas Karlsson


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-24 03:04:23
Message-ID: 549A2D37.6010202@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/16/2014 11:04 AM, David Rowley wrote:> These are some very
promising performance increases.
>
> I've done a quick pass of reading the patch. I currently don't have a
> system with a 128bit int type, but I'm working on that.

Sorry for taking some time to get back. I have been busy before
Christmas. A new version of the patch is attached.

> This fragment needs fixed to put braces on new lines

Fixed!

> It also looks like your OIDs have been nabbed by some jsonb stuff.

Fixed!

> I'm also wondering why in numeric_int16_sum() you're doing:
>
> #else
> return numeric_sum(fcinfo);
> #endif
>
> but you're not doing return int8_accum() in the #else part
> of int8_avg_accum()
> The same goes for int8_accum_inv() and int8_avg_accum_inv(), though
> perhaps you're doing it here because of the elog() showing the wrong
> function name. Although that's a pretty much "shouldn't ever happen"
> case that mightn't be worth worrying about.

No strong reason. I did it for symmetry with int2_accum() and int4_accum().

> Also since I don't currently have a machine with a working int128, I
> decided to benchmark master vs patched to see if there was any sort of
> performance regression due to numeric_int16_sum calling numeric_sum, but
> I'm a bit confused with the performance results as it seems there's
> quite a good increase in performance with the patch, I'd have expected
> there to be no change.

Weird, I noticed similar results when doing my benchmarks, but given
that I did not change the accumulator function other than adding an
ifdef I am not totally sure if this difference is real.

master
tps = 1.001984 (excluding connections establishing)

Without int128
tps = 1.014511 (excluding connections establishing)

With int128
tps = 3.185956 (excluding connections establishing)

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v4.patch text/x-patch 32.8 KB

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-26 12:57:33
Message-ID: CAApHDvp9ZUC5fUur-_5nJ7VPUYe_G+pcUTb73pvw0k0rH6xgzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 December 2014 at 16:04, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:

On 12/16/2014 11:04 AM, David Rowley wrote:> These are some very promising
> performance increases.
>
>>
>> I've done a quick pass of reading the patch. I currently don't have a
>> system with a 128bit int type, but I'm working on that.
>>
>
> Sorry for taking some time to get back. I have been busy before Christmas.
> A new version of the patch is attached.
>
>
Ok I've had another look at this, and I think the only things that I have
to say have been mentioned already:

1. Do we need to keep the 128 byte aggregate state size for machines
without 128 bit ints? This has been reduced to 48 bytes in the patch, which
is in favour code being compiled with a compiler which has 128 bit ints. I
kind of think that we need to keep the 128 byte estimates for compilers
that don't support int128, but I'd like to hear any counter arguments.

2. References to int16 meaning 16 bytes. I'm really in two minds about
this, it's quite nice to keep the natural flow, int4, int8, int16, but I
can't help think that this will confuse someone one day. I think it'll be a
long time before it confused anyone if we called it int128 instead, but I'm
not that excited about seeing it renamed either. I'd like to hear what
others have to say... Is there a chance that some sql standard in the
distant future will have HUGEINT and we might regret not getting the
internal names nailed down?

I also checked the performance of some windowing function calls, since
these call the final function for each row, I thought I'd better make sure
there was no regression as the final function must perform a conversion
from int128 to numeric for each row.

It seems there's still an increase in performance:

Setup:
create table bi_win (i bigint primary key);
insert into bi_win select x.x from generate_series(1,10000) x(x);
vacuum analyze;

Query:
select sum(i) over () from bi_win;

** Master
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 6567
latency average: 9.137 ms
tps = 109.445841 (including connections establishing)
tps = 109.456941 (excluding connections establishing)

** Patched
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 7841
latency average: 7.652 ms
tps = 130.670253 (including connections establishing)
tps = 130.675743 (excluding connections establishing)

Setup:
create table i_win (i int primary key);
insert into i_win select x.x from generate_series(1,10000) x(x);
vacuum analyze;

Query:
select stddev(i) over () from i_win;

** Master
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 5084
latency average: 11.802 ms
tps = 84.730362 (including connections establishing)
tps = 84.735693 (excluding connections establishing)

** Patched
./pgbench -f ~/int128_window.sql -n -T 60 postgres
transaction type: Custom query
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 7557
latency average: 7.940 ms
tps = 125.934787 (including connections establishing)
tps = 125.943176 (excluding connections establishing)

Regards

David Rowley


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-31 05:20:24
Message-ID: CA+TgmoZJWHCqCtrEXvAo6_cVRTZMpWiqAC0eUv4MOtS1=Bb4ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Dec 26, 2014 at 7:57 AM, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> 1. Do we need to keep the 128 byte aggregate state size for machines without
> 128 bit ints? This has been reduced to 48 bytes in the patch, which is in
> favour code being compiled with a compiler which has 128 bit ints. I kind
> of think that we need to keep the 128 byte estimates for compilers that
> don't support int128, but I'd like to hear any counter arguments.

I think you're referring to the estimated state size in pg_aggregate
here, and I'd say it's probably not a big deal one way or the other.
Presumably, at some point, 128-bit arithmetic will become common
enough that we'll want to change that estimate, but I don't know
whether we've reached that point or not.

> 2. References to int16 meaning 16 bytes. I'm really in two minds about this,
> it's quite nice to keep the natural flow, int4, int8, int16, but I can't
> help think that this will confuse someone one day. I think it'll be a long
> time before it confused anyone if we called it int128 instead, but I'm not
> that excited about seeing it renamed either. I'd like to hear what others
> have to say... Is there a chance that some sql standard in the distant
> future will have HUGEINT and we might regret not getting the internal names
> nailed down?

Yeah, I think using int16 to mean 16-bytes will be confusing to
someone almost immediately.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-31 14:00:50
Message-ID: CAApHDvoshk1L1r8fCJjf_=OuCwaycnZp4W3C2R-js-sJsFr+Uw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 31 December 2014 at 18:20, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Fri, Dec 26, 2014 at 7:57 AM, David Rowley <dgrowleyml(at)gmail(dot)com>
> wrote:
> > 1. Do we need to keep the 128 byte aggregate state size for machines
> without
> > 128 bit ints? This has been reduced to 48 bytes in the patch, which is in
> > favour code being compiled with a compiler which has 128 bit ints. I
> kind
> > of think that we need to keep the 128 byte estimates for compilers that
> > don't support int128, but I'd like to hear any counter arguments.
>
> I think you're referring to the estimated state size in pg_aggregate
> here, and I'd say it's probably not a big deal one way or the other.
> Presumably, at some point, 128-bit arithmetic will become common
> enough that we'll want to change that estimate, but I don't know
> whether we've reached that point or not.
>
>
Yeah, that's what I was talking about.
I'm just looking at the code which uses this size estimate
in choose_hashed_grouping(). I'd be a bit worried giving the difference
between 48 and 128 that we'd under estimate the hash size too much and end
up going to disk during hashagg.

I think the patch should be modified so that it uses 128 bytes for the size
estimate on machines that don't support int128, but I'm not quite sure at
this stage if that causes issues for replication, if 1 machine had support
and one didn't, would it matter if the pg_aggregate row on the replica was
48 bytes instead of 128? Is this worth worrying about?

> > 2. References to int16 meaning 16 bytes. I'm really in two minds about
> this,
> > it's quite nice to keep the natural flow, int4, int8, int16, but I can't
> > help think that this will confuse someone one day. I think it'll be a
> long
> > time before it confused anyone if we called it int128 instead, but I'm
> not
> > that excited about seeing it renamed either. I'd like to hear what others
> > have to say... Is there a chance that some sql standard in the distant
> > future will have HUGEINT and we might regret not getting the internal
> names
> > nailed down?
>
> Yeah, I think using int16 to mean 16-bytes will be confusing to
> someone almost immediately.
>
>
hmm, I think it should be changed to int128 then. Pitty int4 was selected
as a name instead of int32 back in the day...

I'm going to mark the patch as waiting on author, pending those two changes.

My view with the size estimates change is that if a committer deems it
overkill, then they can rip it out, but at least it's been thought of and
considered rather than forgotten about.

Regards

David Rowley


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-31 14:13:45
Message-ID: 20141231141345.GC19836@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-01-01 03:00:50 +1300, David Rowley wrote:
> > > 2. References to int16 meaning 16 bytes. I'm really in two minds about
> > this,
> > > it's quite nice to keep the natural flow, int4, int8, int16, but I can't
> > > help think that this will confuse someone one day. I think it'll be a
> > long
> > > time before it confused anyone if we called it int128 instead, but I'm
> > not
> > > that excited about seeing it renamed either. I'd like to hear what others
> > > have to say... Is there a chance that some sql standard in the distant
> > > future will have HUGEINT and we might regret not getting the internal
> > names
> > > nailed down?
> >
> > Yeah, I think using int16 to mean 16-bytes will be confusing to
> > someone almost immediately.
> >
> >
> hmm, I think it should be changed to int128 then. Pitty int4 was selected
> as a name instead of int32 back in the day...

Note that the C datatype has been int32/int64 for a while now, it's just
the SQL datatype and the names of its support functions. Given that,
afaiu, we're talking about the C datatype it seems pretty clear that it
should be named int128.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2014-12-31 16:43:39
Message-ID: CA+TgmoaihWqe1nZmNKOw_SNK_d45YSqVqc1c+7ng0aVxeUTQdQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Dec 31, 2014 at 9:00 AM, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> Yeah, that's what I was talking about.
> I'm just looking at the code which uses this size estimate in
> choose_hashed_grouping(). I'd be a bit worried giving the difference between
> 48 and 128 that we'd under estimate the hash size too much and end up going
> to disk during hashagg.

That's an issue, but the problem is that...

> I think the patch should be modified so that it uses 128 bytes for the size
> estimate on machines that don't support int128, but I'm not quite sure at
> this stage if that causes issues for replication, if 1 machine had support
> and one didn't, would it matter if the pg_aggregate row on the replica was
> 48 bytes instead of 128? Is this worth worrying about?

...if we do this, then yes, there is an incompatibility between
binaries compiled with this option and binaries compiled without it.
They generate different system catalog contents after initdb and we
shouldn't use one set of binaries with an initdb done the other way.
That seems like serious overkill, especially since this could not
inconceivably flip between one recompile and the next if the
administrator has run 'yum update' in the meantime. So I'd argue for
picking one estimate and using it across the board, and living with
the fact that it's sometimes going to be wrong. Such is the lot in
life of an estimate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-02 21:22:08
Message-ID: 54A70C00.8070204@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/31/14, 8:13 AM, Andres Freund wrote:
> On 2015-01-01 03:00:50 +1300, David Rowley wrote:
>>>> 2. References to int16 meaning 16 bytes. I'm really in two minds about
>>> this,
>>>> it's quite nice to keep the natural flow, int4, int8, int16, but I can't
>>>> help think that this will confuse someone one day. I think it'll be a
>>> long
>>>> time before it confused anyone if we called it int128 instead, but I'm
>>> not
>>>> that excited about seeing it renamed either. I'd like to hear what others
>>>> have to say... Is there a chance that some sql standard in the distant
>>>> future will have HUGEINT and we might regret not getting the internal
>>> names
>>>> nailed down?
>>>
>>> Yeah, I think using int16 to mean 16-bytes will be confusing to
>>> someone almost immediately.
>>>
>>>
>> hmm, I think it should be changed to int128 then. Pitty int4 was selected
>> as a name instead of int32 back in the day...
>
> Note that the C datatype has been int32/int64 for a while now, it's just
> the SQL datatype and the names of its support functions. Given that,
> afaiu, we're talking about the C datatype it seems pretty clear that it
> should be named int128.

Should we start down this road with SQL too, by creating int32, 64 and 128 (if needed?), and changing docs as needed?

Presumably that would be best as a separate patch...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-02 21:41:26
Message-ID: 23854.1420234886@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> writes:
> On 12/31/14, 8:13 AM, Andres Freund wrote:
>> Note that the C datatype has been int32/int64 for a while now, it's just
>> the SQL datatype and the names of its support functions. Given that,
>> afaiu, we're talking about the C datatype it seems pretty clear that it
>> should be named int128.

I don't think there was any debate about calling the C typedef int128.
The question at hand was that there are some existing C functions whose
names follow the int2/int4/int8 convention, and it's not very clear what
to do with their 128-bit analogues. Having said that, I'm fine with
switching to int128 for the C names of the new functions; but it is
definitely less than consistent with what we're doing elsewhere.

> Should we start down this road with SQL too, by creating int32, 64 and 128 (if needed?), and changing docs as needed?

No. The situation is messy enough without changing our conventions at
the SQL level; that will introduce breakage into user applications,
for no very good reason because we've never had any inconsistency there.

What might be worth trying is establishing a hard-and-fast boundary
between C land and SQL land, with bitwise names in C and bytewise names
in SQL. This would mean, for example, that int4pl() would be renamed to
int32pl() so far as the C function goes, but the function's SQL name would
remain the same. That would introduce visible inconsistency between such
functions' pg_proc.proname and pg_proc.prosrc fields, but at least the
inconsistency would follow a very clear pattern. And I doubt that very
many user applications are depending on the contents of pg_proc.prosrc.

regards, tom lane


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-02 21:58:25
Message-ID: 54A71481.8050609@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/02/2015 11:41 PM, Tom Lane wrote:
> What might be worth trying is establishing a hard-and-fast boundary
> between C land and SQL land, with bitwise names in C and bytewise names
> in SQL. This would mean, for example, that int4pl() would be renamed to
> int32pl() so far as the C function goes, but the function's SQL name would
> remain the same.

I don't like that. I read int4pl as the function implementing plus
operator for the SQL-visible int4 datatype, so int4pl makes perfect sense.

> That would introduce visible inconsistency between such
> functions' pg_proc.proname and pg_proc.prosrc fields, but at least the
> inconsistency would follow a very clear pattern. And I doubt that very
> many user applications are depending on the contents of pg_proc.prosrc.

Someone might be doing

DirectFunctionCall2(int4pl, ...)

in an extension. Well, probably not too likely for int4pl, as you could
just use the native C + operator, but it's not inconceivable for
something like int4recv().

- Heikki


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-02 22:18:53
Message-ID: 25011.1420237133@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> On 01/02/2015 11:41 PM, Tom Lane wrote:
>> What might be worth trying is establishing a hard-and-fast boundary
>> between C land and SQL land, with bitwise names in C and bytewise names
>> in SQL. This would mean, for example, that int4pl() would be renamed to
>> int32pl() so far as the C function goes, but the function's SQL name would
>> remain the same.

> I don't like that. I read int4pl as the function implementing plus
> operator for the SQL-visible int4 datatype, so int4pl makes perfect sense.

I agree with that so far as the SQL name for the function goes, which is
part of why I don't think we should rename anything at the SQL level.
But right now at the C level, it's unclear how things should be named,
and I think we don't really want a situation where the most appropriate
name is so unclear and potentially confusing. We're surviving fine with
"int32" in C meaning "int4" in SQL so far as the type names go, so why not
copy that naming approach for function names?

>> That would introduce visible inconsistency between such
>> functions' pg_proc.proname and pg_proc.prosrc fields, but at least the
>> inconsistency would follow a very clear pattern. And I doubt that very
>> many user applications are depending on the contents of pg_proc.prosrc.

> Someone might be doing
> DirectFunctionCall2(int4pl, ...)
> in an extension. Well, probably not too likely for int4pl, as you could
> just use the native C + operator, but it's not inconceivable for
> something like int4recv().

We don't have a lot of hesitation about breaking individual function calls
in extensions, so long as (1) you'll get a compile error and (2) it's
clear how to update the code. See for instance the multitude of cases
where we've added new arguments to existing C functions. So I don't think
this objection holds a lot of water, especially when (as you note) there's
not that much reason to be calling most of these functions directly from C.

regards, tom lane


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-02 22:57:09
Message-ID: 54A72245.7060904@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 1/2/15, 4:18 PM, Tom Lane wrote:
> Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> writes:
>> >On 01/02/2015 11:41 PM, Tom Lane wrote:
>>> >>What might be worth trying is establishing a hard-and-fast boundary
>>> >>between C land and SQL land, with bitwise names in C and bytewise names
>>> >>in SQL. This would mean, for example, that int4pl() would be renamed to
>>> >>int32pl() so far as the C function goes, but the function's SQL name would
>>> >>remain the same.
>> >I don't like that. I read int4pl as the function implementing plus
>> >operator for the SQL-visible int4 datatype, so int4pl makes perfect sense.
> I agree with that so far as the SQL name for the function goes, which is
> part of why I don't think we should rename anything at the SQL level.
> But right now at the C level, it's unclear how things should be named,
> and I think we don't really want a situation where the most appropriate
> name is so unclear and potentially confusing. We're surviving fine with
> "int32" in C meaning "int4" in SQL so far as the type names go, so why not
> copy that naming approach for function names?

Realistically, how many non-developers actually use the intXX SQL names? I don't think I've ever seen it; the only places I recall seeing it done are code snippets on developer blogs. Everyone else uses smallint, etc.

I know we're all gun-shy about this after standard_conforming_strings, but that affected *everyone*. I believe this change would affect very, very few users.

Also, note that I'm not talking about removing anything yet; that would come later.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-02 23:07:41
Message-ID: CAGTBQpb8w94Sw7OJozE60gtaUQSJyir__DNpzwVzo4BrRez43w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 2, 2015 at 7:57 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> On 1/2/15, 4:18 PM, Tom Lane wrote:
>>
>> Heikki Linnakangas<hlinnakangas(at)vmware(dot)com> writes:
>>>
>>> >On 01/02/2015 11:41 PM, Tom Lane wrote:
>>>>
>>>> >>What might be worth trying is establishing a hard-and-fast boundary
>>>> >>between C land and SQL land, with bitwise names in C and bytewise
>>>> >> names
>>>> >>in SQL. This would mean, for example, that int4pl() would be renamed
>>>> >> to
>>>> >>int32pl() so far as the C function goes, but the function's SQL name
>>>> >> would
>>>> >>remain the same.
>>>
>>> >I don't like that. I read int4pl as the function implementing plus
>>> >operator for the SQL-visible int4 datatype, so int4pl makes perfect
>>> > sense.
>>
>> I agree with that so far as the SQL name for the function goes, which is
>> part of why I don't think we should rename anything at the SQL level.
>> But right now at the C level, it's unclear how things should be named,
>> and I think we don't really want a situation where the most appropriate
>> name is so unclear and potentially confusing. We're surviving fine with
>> "int32" in C meaning "int4" in SQL so far as the type names go, so why not
>> copy that naming approach for function names?
>
>
> Realistically, how many non-developers actually use the intXX SQL names? I
> don't think I've ever seen it; the only places I recall seeing it done are
> code snippets on developer blogs. Everyone else uses smallint, etc.
>
> I know we're all gun-shy about this after standard_conforming_strings, but
> that affected *everyone*. I believe this change would affect very, very few
> users.
>
> Also, note that I'm not talking about removing anything yet; that would come
> later.

I think it's naive to think the intXX names wouldn't be used just
because they're postgres-specific. Especially when pgadmin3 generates
them.

Many scripts of mine have them because pgadmin3 generated them, many
others have them because I grew accustomed to using them, especially
when I don't care being postgres-specific. float8 vs double precision
and stuff like that.

Lets not generalize anecdote (me using them), lets just pay attention
to the fact that lots of tools expose them (pgadmin3 among them).


From: Andres Freund <andres(at)anarazel(dot)de>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-11 01:36:31
Message-ID: 20150111013631.GF27519@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

> +# Check if platform support gcc style 128-bit integers.
> +AC_CHECK_TYPES([__int128_t, __uint128_t], [], [], [])

Hm, I'm not sure that's sufficent. Three things:

a) Afaics only __int128/unsigned __int128 is defined. See
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html

b) I'm doubtful that AC_CHECK_TYPES is a sufficiently good test on all
platforms. IIRC gcc will generate calls to functions to do the actual
arithmetic, and I don't think they're guranteed to be available on
platforms. That how it .e.g. behaves for atomic operations. So my
guess is that you'll need to check that a program that does actual
arithmetic still links.

c) Personally I don't see the point of testing __uint128_t. That's just
a pointless test that makes configure run for longer.

> +#ifdef HAVE_INT128
> +typedef struct Int16AggState
> +{
> + bool calcSumX2; /* if true, calculate sumX2 */
> + int64 N; /* count of processed numbers */
> + int128 sumX; /* sum of processed numbers */
> + int128 sumX2; /* sum of squares of processed numbers */
> +} Int16AggState;

Not the biggest fan of the variable naming here, but that's not your
fault, you're just consistent which is good.

> @@ -3030,6 +3139,18 @@ int8_avg_accum(PG_FUNCTION_ARGS)
> Datum
> int2_accum_inv(PG_FUNCTION_ARGS)
> {
> +#ifdef HAVE_INT128
> + Int16AggState *state;
> +
> + state = PG_ARGISNULL(0) ? NULL : (Int16AggState *) PG_GETARG_POINTER(0);
> +
> + /* Should not get here with no state */
> + if (state == NULL)
> + elog(ERROR, "int2_accum_inv called with NULL state");
> +
> + if (!PG_ARGISNULL(1))
> + do_int16_discard(state, (int128) PG_GETARG_INT16(1));
> +#else
> NumericAggState *state;
>
> state = PG_ARGISNULL(0) ? NULL : (NumericAggState *) PG_GETARG_POINTER(0);
> @@ -3049,6 +3170,7 @@ int2_accum_inv(PG_FUNCTION_ARGS)
> if (!do_numeric_discard(state, newval))
> elog(ERROR, "do_numeric_discard failed unexpectedly");
> }

Hm. It might be nicer to move the if (!state) elog() outside the ifdef,
and add curly parens inside the ifdef.

> --- a/src/include/pg_config.h.in
> +++ b/src/include/pg_config.h.in
> @@ -678,6 +678,12 @@
> /* Define to 1 if your compiler understands __VA_ARGS__ in macros. */
> #undef HAVE__VA_ARGS
>
z> +/* Define to 1 if the system has the type `__int128_t'. */
> +#undef HAVE___INT128_T
> +
> +/* Define to 1 if the system has the type `__uint128_t'. */
> +#undef HAVE___UINT128_T

pg_config.h.win32 should be updated as well.

Greetings,

Andres Freund


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-11 04:07:13
Message-ID: 54B1F6F1.7080805@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/11/2015 02:36 AM, Andres Freund wrote:
> a) Afaics only __int128/unsigned __int128 is defined. See
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html

Both GCC and Clang defines both of them. Which you use seems to just be
a matter of preference.

> b) I'm doubtful that AC_CHECK_TYPES is a sufficiently good test on all
> platforms. IIRC gcc will generate calls to functions to do the actual
> arithmetic, and I don't think they're guranteed to be available on
> platforms. That how it .e.g. behaves for atomic operations. So my
> guess is that you'll need to check that a program that does actual
> arithmetic still links.

I too thought some about this and decided to look at how other projects
handled this. The projects I have looked at seems to trust that if
__int128_t is defined it will also work.

https://github.com/python/cpython/blob/master/configure#L7778
http://cgit.freedesktop.org/cairo/tree/build/configure.ac.system#n88

But after some more searching now I notice that at least gstreamer does
not trust this to be true.

https://github.com/ensonic/gstreamer/blob/master/configure.ac#L382

Should I fix it to actually compile some code which uses the 128-bit types?

> c) Personally I don't see the point of testing __uint128_t. That's just
> a pointless test that makes configure run for longer.

Ok, will remove it in the next version of the patch.

>> @@ -3030,6 +3139,18 @@ int8_avg_accum(PG_FUNCTION_ARGS)
>> Datum
>> int2_accum_inv(PG_FUNCTION_ARGS)
>> {
>> +#ifdef HAVE_INT128
>> + Int16AggState *state;
>> +
>> + state = PG_ARGISNULL(0) ? NULL : (Int16AggState *) PG_GETARG_POINTER(0);
>> +
>> + /* Should not get here with no state */
>> + if (state == NULL)
>> + elog(ERROR, "int2_accum_inv called with NULL state");
>> +
>> + if (!PG_ARGISNULL(1))
>> + do_int16_discard(state, (int128) PG_GETARG_INT16(1));
>> +#else
>> NumericAggState *state;
>>
>> state = PG_ARGISNULL(0) ? NULL : (NumericAggState *) PG_GETARG_POINTER(0);
>> @@ -3049,6 +3170,7 @@ int2_accum_inv(PG_FUNCTION_ARGS)
>> if (!do_numeric_discard(state, newval))
>> elog(ERROR, "do_numeric_discard failed unexpectedly");
>> }
>
> Hm. It might be nicer to move the if (!state) elog() outside the ifdef,
> and add curly parens inside the ifdef.

The reason I did so was because the type of the state differs and I did
not feel like having two ifdef blocks. I have no strong opinion about
this though.

> pg_config.h.win32 should be updated as well.

Is it possible to update it without running Windows?

--
Andreas Karlsson


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Andres Freund <andres(at)anarazel(dot)de>, David Rowley <dgrowleyml(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-11 16:14:24
Message-ID: 14248.1420992864@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andreas Karlsson <andreas(at)proxel(dot)se> writes:
> On 01/11/2015 02:36 AM, Andres Freund wrote:
>> b) I'm doubtful that AC_CHECK_TYPES is a sufficiently good test on all
>> platforms.

> Should I fix it to actually compile some code which uses the 128-bit types?

We used to have code in configure to test that int64 works. Please copy
that (at least as much as necessary; perhaps we don't need to check for
sprintf support).

regards, tom lane


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-11 19:08:21
Message-ID: 54B2CA25.7090503@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/01/15 05:07, Andreas Karlsson wrote:
> On 01/11/2015 02:36 AM, Andres Freund wrote:
>>> @@ -3030,6 +3139,18 @@ int8_avg_accum(PG_FUNCTION_ARGS)
>>> Datum
>>> int2_accum_inv(PG_FUNCTION_ARGS)
>>> {
>>> +#ifdef HAVE_INT128
>>> + Int16AggState *state;
>>> +
>>> + state = PG_ARGISNULL(0) ? NULL : (Int16AggState *)
>>> PG_GETARG_POINTER(0);
>>> +
>>> + /* Should not get here with no state */
>>> + if (state == NULL)
>>> + elog(ERROR, "int2_accum_inv called with NULL state");
>>> +
>>> + if (!PG_ARGISNULL(1))
>>> + do_int16_discard(state, (int128) PG_GETARG_INT16(1));
>>> +#else
>>> NumericAggState *state;
>>>
>>> state = PG_ARGISNULL(0) ? NULL : (NumericAggState *)
>>> PG_GETARG_POINTER(0);
>>> @@ -3049,6 +3170,7 @@ int2_accum_inv(PG_FUNCTION_ARGS)
>>> if (!do_numeric_discard(state, newval))
>>> elog(ERROR, "do_numeric_discard failed unexpectedly");
>>> }
>>
>> Hm. It might be nicer to move the if (!state) elog() outside the ifdef,
>> and add curly parens inside the ifdef.
>
> The reason I did so was because the type of the state differs and I did
> not feel like having two ifdef blocks. I have no strong opinion about
> this though.
>

I think Andres means you should move the NULL check before the ifdef and
then use curly braces inside the the ifdef/else so that you can define
the state variable there. That can be done with single ifdef.

int2_accum_inv(PG_FUNCTION_ARGS)
{
... null check ...
{
#ifdef HAVE_INT128
Int16AggState *state;
...
#else
NumericAggState *state;
...
#endif
PG_RETURN_POINTER(state);
}
}

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)anarazel(dot)de>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-12 00:37:28
Message-ID: 20150112003728.GB16275@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-01-11 05:07:13 +0100, Andreas Karlsson wrote:
> On 01/11/2015 02:36 AM, Andres Freund wrote:
> >a) Afaics only __int128/unsigned __int128 is defined. See
> > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html
>
> Both GCC and Clang defines both of them. Which you use seems to just be a
> matter of preference.

Only __int128 is documented, the other stuff is just legacy (and
internally documented to be just backward compat stuff).

> >b) I'm doubtful that AC_CHECK_TYPES is a sufficiently good test on all
> > platforms. IIRC gcc will generate calls to functions to do the actual
> > arithmetic, and I don't think they're guranteed to be available on
> > platforms. That how it .e.g. behaves for atomic operations. So my
> > guess is that you'll need to check that a program that does actual
> > arithmetic still links.
>
> I too thought some about this and decided to look at how other projects
> handled this. The projects I have looked at seems to trust that if
> __int128_t is defined it will also work.
>
> https://github.com/python/cpython/blob/master/configure#L7778
> http://cgit.freedesktop.org/cairo/tree/build/configure.ac.system#n88
>
> But after some more searching now I notice that at least gstreamer does not
> trust this to be true.
>
> https://github.com/ensonic/gstreamer/blob/master/configure.ac#L382
>
> Should I fix it to actually compile some code which uses the 128-bit types?

Yes, compile + link please. As Tom said, best also test some arithmetics.

> >>@@ -3030,6 +3139,18 @@ int8_avg_accum(PG_FUNCTION_ARGS)
> >> Datum
> >> int2_accum_inv(PG_FUNCTION_ARGS)
> >> {
> >>+#ifdef HAVE_INT128
> >>+ Int16AggState *state;
> >>+
> >>+ state = PG_ARGISNULL(0) ? NULL : (Int16AggState *) PG_GETARG_POINTER(0);
> >>+
> >>+ /* Should not get here with no state */
> >>+ if (state == NULL)
> >>+ elog(ERROR, "int2_accum_inv called with NULL state");
> >>+
> >>+ if (!PG_ARGISNULL(1))
> >>+ do_int16_discard(state, (int128) PG_GETARG_INT16(1));
> >>+#else
> >> NumericAggState *state;
> >>
> >> state = PG_ARGISNULL(0) ? NULL : (NumericAggState *) PG_GETARG_POINTER(0);
> >>@@ -3049,6 +3170,7 @@ int2_accum_inv(PG_FUNCTION_ARGS)
> >> if (!do_numeric_discard(state, newval))
> >> elog(ERROR, "do_numeric_discard failed unexpectedly");
> >> }
> >
> >Hm. It might be nicer to move the if (!state) elog() outside the ifdef,
> >and add curly parens inside the ifdef.
>
> The reason I did so was because the type of the state differs and I did not
> feel like having two ifdef blocks. I have no strong opinion about this
> though.

Petr explained what I was thinking of.

> >pg_config.h.win32 should be updated as well.
>
> Is it possible to update it without running Windows?

Just copy the relevant hunks by hand. In your case the #undef's
generated are sufficient. There are others where we want to define
things unconditionally.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-18 00:37:11
Message-ID: 54BB0037.5040207@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/31/2014 03:00 PM, David Rowley wrote:
> hmm, I think it should be changed to int128 then. Pitty int4 was
> selected as a name instead of int32 back in the day...
>
> I'm going to mark the patch as waiting on author, pending those two changes.
>
> My view with the size estimates change is that if a committer deems it
> overkill, then they can rip it out, but at least it's been thought of
> and considered rather than forgotten about.

Did we come to any conclusion about naming conventions?

I am still unsure on this question. In some cases 128 is much nicer than
16, for example Int128AggState is nicer than Int16AggState and the same
is true for do_int128_accum vs do_int16_accum, but the tricky cases are
things like int16_to_numericvar where there already is a
int8_to_numericvar function and what we should call the functions in
pg_proc (currently named numeric_int16_*).

I also agree with Robert about that we should just pick one estimate and
use that. I picked the smaller one, but that might be too optimistic so
feel free to ask me to switch it back or to pick something in between
the two estimates.

--
Andreas Karlsson


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-22 23:27:16
Message-ID: 54C18754.4010509@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/11/2015 02:36 AM, Andres Freund wrote:
> a) Afaics only __int128/unsigned __int128 is defined. See
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html
>
> b) I'm doubtful that AC_CHECK_TYPES is a sufficiently good test on all
> platforms. IIRC gcc will generate calls to functions to do the actual
> arithmetic, and I don't think they're guranteed to be available on
> platforms. That how it .e.g. behaves for atomic operations. So my
> guess is that you'll need to check that a program that does actual
> arithmetic still links.
>
> c) Personally I don't see the point of testing __uint128_t. That's just
> a pointless test that makes configure run for longer.

The next version will fix all these issues.

> Hm. It might be nicer to move the if (!state) elog() outside the ifdef,
> and add curly parens inside the ifdef.

Since that change only really works for the *_inv functions I decided to
leave them all as-is for consistency.

>> --- a/src/include/pg_config.h.in
>> +++ b/src/include/pg_config.h.in
>> @@ -678,6 +678,12 @@
>> /* Define to 1 if your compiler understands __VA_ARGS__ in macros. */
>> #undef HAVE__VA_ARGS
>>
> z> +/* Define to 1 if the system has the type `__int128_t'. */
>> +#undef HAVE___INT128_T
>> +
>> +/* Define to 1 if the system has the type `__uint128_t'. */
>> +#undef HAVE___UINT128_T
>
> pg_config.h.win32 should be updated as well.

Will be fixed in the next version.

--
Andreas Karlsson


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-22 23:40:51
Message-ID: 54C18A83.1070003@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

A new version of the patch is attached, which changes the following:

- Changed from using __int128_t to __int128.
- Actually compiles and runs code in configure to see that int128 works.
- No longer tests for __uint128_t.
- Updated pg_config.h.win32
- Renamed some things from int12 to int128, there are still some places
with int16 which I am not sure what to do with.

A remaining issue is what estimate we should pick for the size of the
aggregate state. This patch uses the size of the int128 version for the
estimate, but I have no strong opinions on which we should use.

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v5.patch text/x-patch 34.0 KB

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-23 01:58:46
Message-ID: 54C1AAD6.9040708@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 23/01/15 00:40, Andreas Karlsson wrote:

> - Renamed some things from int12 to int128, there are still some places
> with int16 which I am not sure what to do with.
>

I'd vote for renaming them to int128 too, there is enough C functions
that user int16 for 16bit integer that this is going to be confusing
otherwise.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-27 07:21:57
Message-ID: 54C73C95.8080600@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/23/2015 02:58 AM, Petr Jelinek wrote:
> On 23/01/15 00:40, Andreas Karlsson wrote:
>> - Renamed some things from int12 to int128, there are still some places
>> with int16 which I am not sure what to do with.
>
> I'd vote for renaming them to int128 too, there is enough C functions
> that user int16 for 16bit integer that this is going to be confusing
> otherwise.

Do you also think the SQL functions should be named numeric_int128_sum,
numeric_int128_avg, etc?

--
Andreas Karlsson


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-27 08:05:17
Message-ID: 20150127080517.GH4655@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-01-27 08:21:57 +0100, Andreas Karlsson wrote:
> On 01/23/2015 02:58 AM, Petr Jelinek wrote:
> >On 23/01/15 00:40, Andreas Karlsson wrote:
> >>- Renamed some things from int12 to int128, there are still some places
> >>with int16 which I am not sure what to do with.
> >
> >I'd vote for renaming them to int128 too, there is enough C functions
> >that user int16 for 16bit integer that this is going to be confusing
> >otherwise.
>
> Do you also think the SQL functions should be named numeric_int128_sum,
> numeric_int128_avg, etc?

I'm pretty sure we already decided upthread that the SQL interface is
going to keep usint int4/8 and by extension int16.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-27 08:21:48
Message-ID: 54C74A9C.2090509@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/27/2015 09:05 AM, Andres Freund wrote:
> On 2015-01-27 08:21:57 +0100, Andreas Karlsson wrote:
>> On 01/23/2015 02:58 AM, Petr Jelinek wrote:
>>> On 23/01/15 00:40, Andreas Karlsson wrote:
>>>> - Renamed some things from int12 to int128, there are still some places
>>>> with int16 which I am not sure what to do with.
>>>
>>> I'd vote for renaming them to int128 too, there is enough C functions
>>> that user int16 for 16bit integer that this is going to be confusing
>>> otherwise.
>>
>> Do you also think the SQL functions should be named numeric_int128_sum,
>> numeric_int128_avg, etc?
>
> I'm pretty sure we already decided upthread that the SQL interface is
> going to keep usint int4/8 and by extension int16.

Excellent, then I will leave my latest patch as-is and let the reviewer
say what he thinks.

--
Andreas Karlsson


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-01-28 23:28:51
Message-ID: CAM3SWZSqshdM9DxFfTj=mdf6_AAwbJO8JJ2X_6M6z9XcjV6qmw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jan 26, 2015 at 11:21 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> Do you also think the SQL functions should be named numeric_int128_sum,
> numeric_int128_avg, etc?

Some quick review comments. These apply to int128-agg-v5.patch.

* Why is there no declaration of the function
numeric_int16_stddev_internal() within numeric.c?

* I concur with others - we should stick to int16 for the SQL
interface. The inconsistency there is perhaps slightly confusing, but
that's historic.

* I'm not sure about the idea of "polymorphic" catalog functions (that
return the type "internal", but the actual struct returned varying
based on build settings).

I tend to think that things would be better if there was always a
uniform return type for such "internal" type returning functions, but
that *its* structure varied according to the availability of int128
(i.e. HAVE_INT128) at compile time. What we should probably do is have
a third aggregate struct, that encapsulates this idea of (say)
int2_accum() piggybacking on one of either Int128AggState or
NumericAggState directly. Maybe that would be called PolyNumAggState.
Then this common code is all that is needed on both types of build (at
the top of int4_accum(), for example):

PolyNumAggState *state;

state = PG_ARGISNULL(0) ? NULL : (PolyNumAggState *) PG_GETARG_POINTER(0);

I'm not sure if I'd do this with a containing struct or a simple
"#ifdef HAVE_INT128 typedef #else ... ", but I think it's an
improvement either way.

* You didn't update this unequivocal comment to not be so strong:

* Integer data types all use Numeric accumulators to share code and
* avoid risk of overflow.

That's all I have for now...
--
Peter Geoghegan


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-02-13 13:07:41
Message-ID: CAB7nPqQAznaP5vD0Efe7tQZSgWch_StjWZvQFB9WExab+SgJPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 29, 2015 at 8:28 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> On Mon, Jan 26, 2015 at 11:21 PM, Andreas Karlsson <andreas(at)proxel(dot)se>
> wrote:
> > Do you also think the SQL functions should be named numeric_int128_sum,
> > numeric_int128_avg, etc?
>
> Some quick review comments. These apply to int128-agg-v5.patch.
>

Andreas, are you planning to continue working on this patch? Peter has
provided some comments that are still unanswered.
--
Michael


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-02-16 23:05:07
Message-ID: 54E277A3.2090209@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/13/2015 02:07 PM, Michael Paquier wrote:
> Andreas, are you planning to continue working on this patch? Peter has
> provided some comments that are still unanswered.

Yes, but I am quite busy right now. I will try to find time some time
later this week.

--
Andreas Karlsson


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-04 22:51:42
Message-ID: 54F78C7E.1030503@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/29/2015 12:28 AM, Peter Geoghegan wrote:
> On Mon, Jan 26, 2015 at 11:21 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>> Do you also think the SQL functions should be named numeric_int128_sum,
>> numeric_int128_avg, etc?
>
> Some quick review comments. These apply to int128-agg-v5.patch.
>
> * Why is there no declaration of the function
> numeric_int16_stddev_internal() within numeric.c?

Because there is no declaration of numeric_stddev_internal() either. If
there should be I could add declarations for both.

> * I concur with others - we should stick to int16 for the SQL
> interface. The inconsistency there is perhaps slightly confusing, but
> that's historic.

Agreed.

> * I'm not sure about the idea of "polymorphic" catalog functions (that
> return the type "internal", but the actual struct returned varying
> based on build settings).
>
> I tend to think that things would be better if there was always a
> uniform return type for such "internal" type returning functions, but
> that *its* structure varied according to the availability of int128
> (i.e. HAVE_INT128) at compile time. What we should probably do is have
> a third aggregate struct, that encapsulates this idea of (say)
> int2_accum() piggybacking on one of either Int128AggState or
> NumericAggState directly. Maybe that would be called PolyNumAggState.
> Then this common code is all that is needed on both types of build (at
> the top of int4_accum(), for example):
>
> PolyNumAggState *state;
>
> state = PG_ARGISNULL(0) ? NULL : (PolyNumAggState *) PG_GETARG_POINTER(0);
>
> I'm not sure if I'd do this with a containing struct or a simple
> "#ifdef HAVE_INT128 typedef #else ... ", but I think it's an
> improvement either way.

Not entirely sure exactly what I mean but I have changed to something
like this, relying on typedef, in the attached version of the patch. I
think it looks better after the changes.

> * You didn't update this unequivocal comment to not be so strong:
>
> * Integer data types all use Numeric accumulators to share code and
> * avoid risk of overflow.

Fixed.

I have attached a new version of the patch which fixes the issues above
plus moves the autoconf code to the right place (from configure.in to
c-compiler.m4).

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v6.patch text/x-patch 36.0 KB

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-07 18:18:45
Message-ID: 54FB4105.5000505@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/03/15 23:51, Andreas Karlsson wrote:
> On 01/29/2015 12:28 AM, Peter Geoghegan wrote:
>> * I'm not sure about the idea of "polymorphic" catalog functions (that
>> return the type "internal", but the actual struct returned varying
>> based on build settings).
>>
>> I tend to think that things would be better if there was always a
>> uniform return type for such "internal" type returning functions, but
>> that *its* structure varied according to the availability of int128
>> (i.e. HAVE_INT128) at compile time. What we should probably do is have
>> a third aggregate struct, that encapsulates this idea of (say)
>> int2_accum() piggybacking on one of either Int128AggState or
>> NumericAggState directly. Maybe that would be called PolyNumAggState.
>> Then this common code is all that is needed on both types of build (at
>> the top of int4_accum(), for example):
>>
>> PolyNumAggState *state;
>>
>> state = PG_ARGISNULL(0) ? NULL : (PolyNumAggState *)
>> PG_GETARG_POINTER(0);
>>
>> I'm not sure if I'd do this with a containing struct or a simple
>> "#ifdef HAVE_INT128 typedef #else ... ", but I think it's an
>> improvement either way.
>
> Not entirely sure exactly what I mean but I have changed to something
> like this, relying on typedef, in the attached version of the patch. I
> think it looks better after the changes.
>

I think this made the patch much better indeed.

What I am wondering is if those numeric_int16_* functions that also deal
with either the Int128AggState or NumericAggState should be renamed in
similar fashion.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-09 12:39:04
Message-ID: 54FD9468.5010906@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/07/2015 07:18 PM, Petr Jelinek wrote:

> What I am wondering is if those numeric_int16_* functions that also deal
> with either the Int128AggState or NumericAggState should be renamed in
> similar fashion.

You mean something like numeric_poly_sum instead of numeric_int16_sum? I
personally am not fond of either name. While numeric_int16_* incorrectly
implies we have a int16 SQL type numeric_poly_* does not tell us that
this is an optimized version which uses a smaller state.

The worst part of writing this patch has always been naming functions
and types. :)

Andreas


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-09 16:33:31
Message-ID: 54FDCB5B.5000900@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09/03/15 13:39, Andreas Karlsson wrote:
> On 03/07/2015 07:18 PM, Petr Jelinek wrote:
>
>> What I am wondering is if those numeric_int16_* functions that also deal
>> with either the Int128AggState or NumericAggState should be renamed in
>> similar fashion.
>
> You mean something like numeric_poly_sum instead of numeric_int16_sum? I
> personally am not fond of either name. While numeric_int16_* incorrectly
> implies we have a int16 SQL type numeric_poly_* does not tell us that
> this is an optimized version which uses a smaller state.

Yes that's what I mean, since the int16 in the name is misleading given
that in at least some builds the int16 won't be used. You could always
have numeric function, int16 function and the poly function which
decides between them but that's probably overkill.

>
> The worst part of writing this patch has always been naming functions
> and types. :)

Naming is hard :)

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: David Fetter <david(at)fetter(dot)org>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-09 17:39:50
Message-ID: 20150309173950.GA30307@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 09, 2015 at 01:39:04PM +0100, Andreas Karlsson wrote:
> On 03/07/2015 07:18 PM, Petr Jelinek wrote:
>
> >What I am wondering is if those numeric_int16_* functions that also deal
> >with either the Int128AggState or NumericAggState should be renamed in
> >similar fashion.
>
> You mean something like numeric_poly_sum instead of numeric_int16_sum? I
> personally am not fond of either name. While numeric_int16_* incorrectly
> implies we have a int16 SQL type numeric_poly_* does not tell us that this
> is an optimized version which uses a smaller state.

Would it be simpler to write a separate patch to add an int16 SQL type
so that this implication is correct?

> The worst part of writing this patch has always been naming functions and
> types. :)

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: David Fetter <david(at)fetter(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-09 17:57:28
Message-ID: 54FDDF08.9030602@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09/03/15 18:39, David Fetter wrote:
> On Mon, Mar 09, 2015 at 01:39:04PM +0100, Andreas Karlsson wrote:
>> On 03/07/2015 07:18 PM, Petr Jelinek wrote:
>>
>>> What I am wondering is if those numeric_int16_* functions that also deal
>>> with either the Int128AggState or NumericAggState should be renamed in
>>> similar fashion.
>>
>> You mean something like numeric_poly_sum instead of numeric_int16_sum? I
>> personally am not fond of either name. While numeric_int16_* incorrectly
>> implies we have a int16 SQL type numeric_poly_* does not tell us that this
>> is an optimized version which uses a smaller state.
>
> Would it be simpler to write a separate patch to add an int16 SQL type
> so that this implication is correct?
>

No, because then you'd need to emulate the type on platforms where it
does not exist and the patch would be several times more complex for no
useful reason.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-10 00:37:02
Message-ID: 54FE3CAE.5000005@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/07/2015 07:18 PM, Petr Jelinek wrote:
> What I am wondering is if those numeric_int16_* functions that also deal
> with either the Int128AggState or NumericAggState should be renamed in
> similar fashion.

Here is a patch where I have renamed the functions.

int128-agg-v7.patch

- Rename numeric_int16_* to numeric_poly_*
- Rename static functions int{8,16}_to_numericvar to
int{64,128}_to_numericvar
- Fix typo in c-compile.m4 comment

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v7.patch text/x-patch 38.0 KB

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-10 01:26:11
Message-ID: CAM3SWZRLRutM9mu9VRYwBC-Yrise7kCzf9jsUkGvgCs_r1heoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 9, 2015 at 5:37 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> int128-agg-v7.patch

I see a spelling error:

"+ * On platforms which support 128-bit integers some aggergates instead use a"

Other than that, the patch looks pretty good to me. You're
generalizing from the example of existing routines for
int128_to_numericvar(), and other such code in a fairly mechanical
way. The performance improvements are pretty real and tangible.

You say:

/*
* Integer data types use Numeric accumulators to share code and avoid
* risk of overflow. For int2 and int4 inputs, Numeric accumulation
* is overkill for the N and sum(X) values, but definitely not overkill
* for the sum(X*X) value. Hence, we use int2_accum and int4_accum only
* for stddev/variance --- there are faster special-purpose accumulator
* routines for SUM and AVG of these datatypes.
*
* Similarily we can, where available, use 128-bit integer accumulators
* for sum(X) for int8 and sum(X*X) for int2 and int4, but not sum(X*X)
* for int8.
*/

I think you should talk about the new thing first (just after the
extant, first sentence "Integer data types use Numeric..."). Refer to
where 128-bit integers are used and how, and only then the other stuff
(exceptions). After that, put the PolyNumAggState struct definition
and basic functions. Then int2_accum() and so on.
--
Peter Geoghegan


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-13 01:23:36
Message-ID: 55023C18.9060305@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/10/2015 02:26 AM, Peter Geoghegan wrote:
> On Mon, Mar 9, 2015 at 5:37 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>> int128-agg-v7.patch
>
> I see a spelling error:
>
> "+ * On platforms which support 128-bit integers some aggergates instead use a"

Fixed.

> I think you should talk about the new thing first (just after the
> extant, first sentence "Integer data types use Numeric..."). Refer to
> where 128-bit integers are used and how, and only then the other stuff
> (exceptions). After that, put the PolyNumAggState struct definition
> and basic functions. Then int2_accum() and so on.

Good idea! Do you think the rewritten comment is clear enough, or do I
need to go into more detail?

/*
* Integer data types use Numeric accumulators to share code and avoid risk
* of overflow. To speed up aggregation 128-bit integer accumulators are
* used instead where sum(X) or sum(X*X) fit into 128-bits, and there is
* platform support.
*
* For int2 and int4 inputs sum(X) will fit into a 64-bit accumulator,
hence
* we use faster special-purpose accumulator routines for SUM and AVG of
* these datatypes.
*/

#ifdef HAVE_INT128
typedef struct Int128AggState

--
Andreas Karlsson


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-13 21:22:18
Message-ID: CAM3SWZTkAUZQo21_d4_xu7a8drGaVTN5+Ts9bo4KbKUgoqDP2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 12, 2015 at 6:23 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> Fixed.

Did you intend to attach a patch here?

>> I think you should talk about the new thing first (just after the
>> extant, first sentence "Integer data types use Numeric..."). Refer to
>> where 128-bit integers are used and how, and only then the other stuff
>> (exceptions). After that, put the PolyNumAggState struct definition
>> and basic functions. Then int2_accum() and so on.
>
>
> Good idea! Do you think the rewritten comment is clear enough, or do I need
> to go into more detail?
>
> /*
> * Integer data types use Numeric accumulators to share code and avoid risk
> * of overflow. To speed up aggregation 128-bit integer accumulators are
> * used instead where sum(X) or sum(X*X) fit into 128-bits, and there is
> * platform support.
> *
> * For int2 and int4 inputs sum(X) will fit into a 64-bit accumulator, hence
> * we use faster special-purpose accumulator routines for SUM and AVG of
> * these datatypes.
> */
>
> #ifdef HAVE_INT128
> typedef struct Int128AggState

Not quite. Refer to the 128-bit integer accumulators as
"special-purpose accumulator routines" instead. Then, in the case of
the extant 64-bit accumulators, refer to them by the shorthand
"integer accumulators". Otherwise it's the wrong way around.

--
Peter Geoghegan


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-14 03:17:29
Message-ID: 5503A849.60003@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/13/2015 10:22 PM, Peter Geoghegan wrote:
> On Thu, Mar 12, 2015 at 6:23 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
>> /*
>> * Integer data types use Numeric accumulators to share code and avoid risk
>> * of overflow. To speed up aggregation 128-bit integer accumulators are
>> * used instead where sum(X) or sum(X*X) fit into 128-bits, and there is
>> * platform support.
>> *
>> * For int2 and int4 inputs sum(X) will fit into a 64-bit accumulator, hence
>> * we use faster special-purpose accumulator routines for SUM and AVG of
>> * these datatypes.
>> */
>>
>> #ifdef HAVE_INT128
>> typedef struct Int128AggState
>
> Not quite. Refer to the 128-bit integer accumulators as
> "special-purpose accumulator routines" instead. Then, in the case of
> the extant 64-bit accumulators, refer to them by the shorthand
> "integer accumulators". Otherwise it's the wrong way around.

I disagree. The term "integer accumulators" is confusing. Right now I do
not have any better ideas for how to write that comment, so I submit the
next version of the patch with the comment as I wrote it above. Feel
free to come up with a better wording, my English is not always up to
par when writing technical texts.

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v8.patch text/x-patch 38.2 KB

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-15 20:47:40
Message-ID: 5505EFEC.7000800@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 14/03/15 04:17, Andreas Karlsson wrote:
> On 03/13/2015 10:22 PM, Peter Geoghegan wrote:
>> On Thu, Mar 12, 2015 at 6:23 PM, Andreas Karlsson <andreas(at)proxel(dot)se>
>> wrote:
>>> /*
>>> * Integer data types use Numeric accumulators to share code and
>>> avoid risk
>>> * of overflow. To speed up aggregation 128-bit integer
>>> accumulators are
>>> * used instead where sum(X) or sum(X*X) fit into 128-bits, and
>>> there is
>>> * platform support.
>>> *
>>> * For int2 and int4 inputs sum(X) will fit into a 64-bit
>>> accumulator, hence
>>> * we use faster special-purpose accumulator routines for SUM and
>>> AVG of
>>> * these datatypes.
>>> */
>>>
>>> #ifdef HAVE_INT128
>>> typedef struct Int128AggState
>>
>> Not quite. Refer to the 128-bit integer accumulators as
>> "special-purpose accumulator routines" instead. Then, in the case of
>> the extant 64-bit accumulators, refer to them by the shorthand
>> "integer accumulators". Otherwise it's the wrong way around.
>
> I disagree. The term "integer accumulators" is confusing. Right now I do
> not have any better ideas for how to write that comment, so I submit the
> next version of the patch with the comment as I wrote it above. Feel
> free to come up with a better wording, my English is not always up to
> par when writing technical texts.
>

I don't like the term "integer accumulators" either as "integer" is
platform specific. I would phase it like this:

/*
* Integer data types in general use Numeric accumulators to share code
* and avoid risk of overflow.
*
* However for performance reasons some of the optimized special-purpose
* accumulator routines are used when possible.
*
* On platforms with 128-bit integer support, the 128-bit routines will be
* used when sum(X) or sum(X*X) fit into 128-bit.
*
* For int2 and int4 inputs, the N and sum(X) fit into 64-bit so the 64-bit
* accumulators will be used for SUM and AVG of these data types.
*/

It's almost the same thing as you wrote but the 128 bit and 64 bit
optimizations are put on the same "level" of optimized routines.

But this is nitpicking at this point, I am happy with the patch as it
stands right now.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-15 23:37:08
Message-ID: 550617A4.8020502@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/15/2015 09:47 PM, Petr Jelinek wrote:
> It's almost the same thing as you wrote but the 128 bit and 64 bit
> optimizations are put on the same "level" of optimized routines.
>
> But this is nitpicking at this point, I am happy with the patch as it
> stands right now.

Do you think it is ready for committer?

New version with altered comment is attached.

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v9.patch text/x-patch 38.2 KB

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-16 13:22:57
Message-ID: 5506D931.606@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 16/03/15 00:37, Andreas Karlsson wrote:
> On 03/15/2015 09:47 PM, Petr Jelinek wrote:
>> It's almost the same thing as you wrote but the 128 bit and 64 bit
>> optimizations are put on the same "level" of optimized routines.
>>
>> But this is nitpicking at this point, I am happy with the patch as it
>> stands right now.
>
> Do you think it is ready for committer?
>

In my opinion, yes.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-18 03:50:48
Message-ID: CAM3SWZSkWYKoxcg3s7Ce2x8REz8CCb88zZXqYWmrGZfZo66PpA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Mar 16, 2015 at 6:22 AM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>> Do you think it is ready for committer?
>>
>
> In my opinion, yes.

If it wasn't for the autoconf parts of this, I'd probably agree with
you. I need to go over that more carefully.

--
Peter Geoghegan


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-18 08:05:54
Message-ID: 20150318080554.GL27420@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-03-17 20:50:48 -0700, Peter Geoghegan wrote:
> On Mon, Mar 16, 2015 at 6:22 AM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
> >> Do you think it is ready for committer?
> >>
> >
> > In my opinion, yes.
>
> If it wasn't for the autoconf parts of this, I'd probably agree with
> you. I need to go over that more carefully.

I think it's a pretty direct copy of the 64bit code. I'm not entirely
sure why this needs a AC_TRY_RUN with a compile fallback (for cross) and
why a AC_TRY_LINK isn't sufficient? But then, you just copied that
decision.

Tom, it seems you have added the 64bit check to configure. All, the way
back in 1998 ;). C.f. 07ae591c87. Do you see a reason why we need to
actually run code?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-18 21:00:51
Message-ID: CAM3SWZQ1jAZ9tSfMfN3qcT6eT58cOHYPPEVnS7x+rN1wnG5nFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 18, 2015 at 1:05 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I think it's a pretty direct copy of the 64bit code. I'm not entirely
> sure why this needs a AC_TRY_RUN with a compile fallback (for cross) and
> why a AC_TRY_LINK isn't sufficient? But then, you just copied that
> decision.

Good point.

Anyway, I think that it's not quite the same. For one thing, we're
talking about a GCC extension, not a type described by C99. We don't
care about snprintf support, for example. For another, Andreas has
chosen to lump together __int128 and unsigned __int128 into one test,
where the latter really doesn't receive coverage. This only makes
sense from the point of view of this optimization, since we need both
anyway, as int128_to_numericvar() uses the uint128 type. I think it's
fine to only use int128/uint128 for this optimization, and similar
optimizations, and consequently I think it's fine to lump the two
types together, but lets be clear that that's all we mean to do.

I'm going to keep things that way, and try and test unsigned __int128
too (but as part of the same, single configure test). It's not merely
a matter of mechanically copying what we do for int64 (or uint64),
though.

--
Peter Geoghegan


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-18 22:16:14
Message-ID: 20150318221614.GB26995@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-03-18 14:00:51 -0700, Peter Geoghegan wrote:
> Anyway, I think that it's not quite the same. For one thing, we're
> talking about a GCC extension, not a type described by C99. We don't
> care about snprintf support, for example.

I don't see that that has any consequence wrt Andreas' test.

> For another, Andreas has chosen to lump together __int128 and unsigned
> __int128 into one test, where the latter really doesn't receive
> coverage.

On my urging actually. It's pretty darn unlikely that only one variant
will work. Useless configure tests just cost time. We're testing a gcc
extension here, as you point out, it'll not just stop working for
unsigned vs signed.

The reason we need a link test (vs just a compile test) is that gcc
links to helper functions to do math - even if they're not present on
the target platform. Compiling will succeed, but linking won't.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-18 22:59:52
Message-ID: CAM3SWZTNDKe3SvBG1iBU32Tz3+7Z5xgW3AxHxst761208ZvxPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 18, 2015 at 3:16 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> For another, Andreas has chosen to lump together __int128 and unsigned
>> __int128 into one test, where the latter really doesn't receive
>> coverage.
>
> On my urging actually. It's pretty darn unlikely that only one variant
> will work.

I think that makes sense. Since we're targeting GCC here, and we're
comfortable with that, I guess that's okay after all.

> The reason we need a link test (vs just a compile test) is that gcc
> links to helper functions to do math - even if they're not present on
> the target platform. Compiling will succeed, but linking won't.

Okay. Attached revision has a few tweaks that reflect the status of
int128/uint128 as specialized types that are basically only useful for
this optimization, or other similar optimizations on compilers that
either are GCC, or aim to be compatible with it. I don't think
Andreas' V9 reflected that sufficiently.

Also, I now always use PolyNumAggState (the typedef), even for #define
HAVE_INT128 code, which I think is a bit clearer. Note that I have
generated a minimal diff, without the machine generated changes that
are ordinarily included in the final commit when autoconf tests are
added, mostly because I do not have the exact version of autoconf on
my development machine required to do this without creating irrelevant
cruft.

Marked "Ready for committer".

A committer that has the exact right version of autoconf (GNU Autoconf
2.69), or perhaps Andreas can build the machine generated parts.
Andres may wish to take another look at the autoconf changes.

--
Peter Geoghegan

Attachment Content-Type Size
int128-agg-v10.patch text/x-patch 36.0 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-18 23:07:40
Message-ID: 20150318230740.GC26995@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-03-18 15:59:52 -0700, Peter Geoghegan wrote:
> Okay. Attached revision has a few tweaks that reflect the status of
> int128/uint128 as specialized types that are basically only useful for
> this optimization, or other similar optimizations on compilers that
> either are GCC, or aim to be compatible with it. I don't think
> Andreas' V9 reflected that sufficiently.

Given that we don't rely on C99, I don't think that actually
matters. Lots of our platforms build on pre C99 compilers... I think it
makes sense to say that this currently only tests for a gcc extension
and might be extended in the future. But nothing more.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-18 23:10:28
Message-ID: CAM3SWZQ6iXZ_gu_Gf_u9j3t64hcEijX_qaNbv-odaP3h77BxsA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Mar 18, 2015 at 4:07 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Given that we don't rely on C99, I don't think that actually
> matters. Lots of our platforms build on pre C99 compilers... I think it
> makes sense to say that this currently only tests for a gcc extension
> and might be extended in the future. But nothing more.

Works for me.

--
Peter Geoghegan


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Peter Geoghegan <pg(at)heroku(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-18 23:14:46
Message-ID: 550A06E6.1030906@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/18/2015 11:59 PM, Peter Geoghegan wrote:
> Okay. Attached revision has a few tweaks that reflect the status of
> int128/uint128 as specialized types that are basically only useful for
> this optimization, or other similar optimizations on compilers that
> either are GCC, or aim to be compatible with it. I don't think
> Andreas' V9 reflected that sufficiently.
>
> Also, I now always use PolyNumAggState (the typedef), even for #define
> HAVE_INT128 code, which I think is a bit clearer. Note that I have
> generated a minimal diff, without the machine generated changes that
> are ordinarily included in the final commit when autoconf tests are
> added, mostly because I do not have the exact version of autoconf on
> my development machine required to do this without creating irrelevant

Thanks,

I have attached a patch where I have ran autoconf.

--
Andreas Karlsson

Attachment Content-Type Size
int128-agg-v11.patch text/x-patch 38.4 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-19 18:08:26
Message-ID: 20150319180826.GH9495@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Working on committing this:
* Converted the configure test to AC_LINK_IFELSE

* I dislike the way the configure test and the resulting HAVE_* is
named. This imo shouldn't be so gcc specific, even if it right now
only detects gcc support. Changed.

* Furthermore does the test use 64bit literals without marking them as
such. That doesn't strike me as a great idea.

* Stuff like:
static int32 numericvar_to_int4(NumericVar *var);
static bool numericvar_to_int8(NumericVar *var, int64 *result);
static void int64_to_numericvar(int64 val, NumericVar *var);
#ifdef HAVE_INT128
static void int128_to_numericvar(int128 val, NumericVar *var);
#endif
is beyond ugly. Imnsho the only int2/4/8 functions that should keep
their name are the SQL callable ones. It's surely one to have
numericvar_to_int8 and int64_to_numericvar.

* I'm not a fan at all of the c.h comment you added. That comment seems,
besides being oddly formatted, to be designed to be outdated ;) I'll
just rip it out and replace it by something shorter. This shouldn't be
so overly specific for this patch.

* This thread is long, I'm not sure who to list as reviewers. Please
check whether those are appropriate.

* I've split off the int128 support from the aggregate changes.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Add-optional-support-for-128bit-integers.patch text/x-patch 6.3 KB
0002-Use-128-bit-math-to-accelerate-some-aggregation-func.patch text/x-patch 36.9 KB

From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-19 23:49:07
Message-ID: 550B6073.6080907@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/19/2015 07:08 PM, Andres Freund wrote:
> Working on committing this:

Nice fixes. Sorry about forgetting numericvar_to_int*.

As for the reviewers those lists look pretty much correct. David Rowley
should probably be added to the second patch for his early review and
benchmarking.

--
Andreas Karlsson


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-20 00:05:24
Message-ID: CAM3SWZRywTCK+9ERWsEmOmxXfVadBbD5NtSoUu4-ha4mBC7sMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 19, 2015 at 4:49 PM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> Nice fixes. Sorry about forgetting numericvar_to_int*.
>
> As for the reviewers those lists look pretty much correct. David Rowley
> should probably be added to the second patch for his early review and
> benchmarking.

This also seems fine to me.

--
Peter Geoghegan


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-20 09:32:06
Message-ID: 20150320093206.GA21493@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-03-20 00:49:07 +0100, Andreas Karlsson wrote:
> On 03/19/2015 07:08 PM, Andres Freund wrote:
> >Working on committing this:
>
> Nice fixes. Sorry about forgetting numericvar_to_int*.
>
> As for the reviewers those lists look pretty much correct. David Rowley
> should probably be added to the second patch for his early review and
> benchmarking.

Pushed with that additional change. Let's see if the buildfarm thinks.

Thanks for the feature.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-20 09:39:46
Message-ID: 550BEAE2.3090002@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/20/2015 10:32 AM, Andres Freund wrote:
> Pushed with that additional change. Let's see if the buildfarm thinks.
>
> Thanks for the feature.

Thanks to you and all the reviewers for helping me out with it.

Andreas


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-20 09:43:30
Message-ID: CAM3SWZSTMrmAHztwD6_3DLyTM13iQbpyHM+P5y_HVtwE4LqcyA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 20, 2015 at 2:39 AM, Andreas Karlsson <andreas(at)proxel(dot)se> wrote:
> On 03/20/2015 10:32 AM, Andres Freund wrote:
>>
>> Pushed with that additional change. Let's see if the buildfarm thinks.
>>
>> Thanks for the feature.
>
> Thanks to you and all the reviewers for helping me out with it.

Indeed. Thanks for your efforts, Andreas.

--
Peter Geoghegan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-21 15:32:58
Message-ID: 10971.1426951978@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> Pushed with that additional change. Let's see if the buildfarm thinks.

jacana, apparently alone among buildfarm members, does not like it.

regards, tom lane


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 05:17:28
Message-ID: CAB7nPqRAJ+wXrDPEydfO6x_duuPYOdJhTPyueboiwGPHwjwYJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 22, 2015 at 12:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>> Pushed with that additional change. Let's see if the buildfarm thinks.
>
> jacana, apparently alone among buildfarm members, does not like it.

All the windows nodes don't pass tests with this patch, the difference
is in the exponential precision: e+000 instead of e+00.
--
Michael


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 05:22:25
Message-ID: CAB7nPqQ_nkK=r2-P0qzCFYFggVX3gFQ9KU2DNF3u6VsqRKNNuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 22, 2015 at 2:17 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Sun, Mar 22, 2015 at 12:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>>> Pushed with that additional change. Let's see if the buildfarm thinks.
>>
>> jacana, apparently alone among buildfarm members, does not like it.
>
> All the windows nodes don't pass tests with this patch, the difference
> is in the exponential precision: e+000 instead of e+00.

Useless noise from my side, the error is in the test windows.sql like here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-21%2003%3A01%3A21
--
Michael


From: David Rowley <dgrowley(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 05:27:10
Message-ID: CAHoyFK-DQh_Uf7hb4jKBrr2nDufpFdRkskapTyzwjQV=AP7qJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22 March 2015 at 18:17, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Sun, Mar 22, 2015 at 12:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> >> Pushed with that additional change. Let's see if the buildfarm thinks.
> >
> > jacana, apparently alone among buildfarm members, does not like it.
>
> All the windows nodes don't pass tests with this patch, the difference
> is in the exponential precision: e+000 instead of e+00.
>

It looks like there's some other problems there too, but for the
exponential issue, I posted a patch here:

http://www.postgresql.org/message-id/CAApHDvoRDz8kCdfYzgRCPDEfLMqK0F6U_78nJ-JajxyJ7ufvHA@mail.gmail.com


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>,Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 09:22:30
Message-ID: 57F23B9A-A690-435A-94DD-DD8D177F26F4@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On March 22, 2015 6:17:28 AM GMT+01:00, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>On Sun, Mar 22, 2015 at 12:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>>> Pushed with that additional change. Let's see if the buildfarm
>thinks.
>>
>> jacana, apparently alone among buildfarm members, does not like it.
>
>All the windows nodes don't pass tests with this patch, the difference
>is in the exponential precision: e+000 instead of e+00.

That's due to a different patch though, right? When I checked earlier only jacana had problems due to this, and it looked like random memory was being output. It's interesting that that's on the one windows (not cygwin) critter that does the 128bit dance...

--
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 09:34:04
Message-ID: CAB7nPqQ93xNR4oLAsrBR=pkSJ75eP1ZEGq=AH_LN+8t_AgjHKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 22, 2015 at 6:22 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On March 22, 2015 6:17:28 AM GMT+01:00, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>>On Sun, Mar 22, 2015 at 12:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>>>> Pushed with that additional change. Let's see if the buildfarm
>>thinks.
>>>
>>> jacana, apparently alone among buildfarm members, does not like it.
>>
>>All the windows nodes don't pass tests with this patch, the difference
>>is in the exponential precision: e+000 instead of e+00.
>
> That's due to a different patch though, right? When I checked earlier only jacana had problems due to this, and it looked like random memory was being output. It's interesting that that's on the one windows (not cygwin) critter that does the 128bit dance...

Yes, sorry, the e+000 stuff is from 959277a. This patch has visibly broken that:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-21%2003%3A01%3A21
--
Michael


From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 09:34:26
Message-ID: CAApHDvqvx_t3FRrn34cuFkfvBQ0FMO2_yRvB3q3dpW3TXeiM0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22 March 2015 at 22:22, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On March 22, 2015 6:17:28 AM GMT+01:00, Michael Paquier <
> michael(dot)paquier(at)gmail(dot)com> wrote:
> >On Sun, Mar 22, 2015 at 12:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> >>> Pushed with that additional change. Let's see if the buildfarm
> >thinks.
> >>
> >> jacana, apparently alone among buildfarm members, does not like it.
> >
> >All the windows nodes don't pass tests with this patch, the difference
> >is in the exponential precision: e+000 instead of e+00.
>
> That's due to a different patch though, right?

Yes this is due to cc0d90b.

> When I checked earlier only jacana had problems due to this, and it looked
> like random memory was being output. It's interesting that that's on the
> one windows (not cygwin) critter that does the 128bit dance...
>

Yeah, I can't recreate the issue locally on my windows machine, but I may
try with gcc if I can get some time.

Regards

David Rowley


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 09:35:34
Message-ID: 50949B74-A45E-4D00-A4D2-373CD88B5E29@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On March 22, 2015 10:34:04 AM GMT+01:00, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>On Sun, Mar 22, 2015 at 6:22 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
>wrote:
>> On March 22, 2015 6:17:28 AM GMT+01:00, Michael Paquier
><michael(dot)paquier(at)gmail(dot)com> wrote:
>>>On Sun, Mar 22, 2015 at 12:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>>>>> Pushed with that additional change. Let's see if the buildfarm
>>>thinks.
>>>>
>>>> jacana, apparently alone among buildfarm members, does not like it.
>>>
>>>All the windows nodes don't pass tests with this patch, the
>difference
>>>is in the exponential precision: e+000 instead of e+00.
>>
>> That's due to a different patch though, right? When I checked earlier
>only jacana had problems due to this, and it looked like random memory
>was being output. It's interesting that that's on the one windows (not
>cygwin) critter that does the 128bit dance...
>
>Yes, sorry, the e+000 stuff is from 959277a. This patch has visibly
>broken that:
>http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-21%2003%3A01%3A21

That's the stuff looking like random memory that I talk about above...

--
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 09:42:51
Message-ID: CAB7nPqS+9nNTatMHw4aF-4tq4HMBgB6sCFqXokBDg0aAE2Vt3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Mar 22, 2015 at 6:34 PM, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On 22 March 2015 at 22:22, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>
>> On March 22, 2015 6:17:28 AM GMT+01:00, Michael Paquier
>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> >On Sun, Mar 22, 2015 at 12:32 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> >> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>> >>> Pushed with that additional change. Let's see if the buildfarm
>> >thinks.
>> >>
>> >> jacana, apparently alone among buildfarm members, does not like it.
>> >
>> >All the windows nodes don't pass tests with this patch, the difference
>> >is in the exponential precision: e+000 instead of e+00.
>>
>> That's due to a different patch though, right?
>
>
> Yes this is due to cc0d90b.

Err, yes. This exp error is caused by the to_char patch (and that's
what I meant X)), bad copy-paste from here...
--
Michael


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Karlsson <andreas(at)proxel(dot)se>, Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 10:47:59
Message-ID: 550E9DDF.1080701@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22/03/15 10:35, Andres Freund wrote:
> On March 22, 2015 10:34:04 AM GMT+01:00, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>> On Sun, Mar 22, 2015 at 6:22 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
>>> That's due to a different patch though, right? When I checked earlier
>> only jacana had problems due to this, and it looked like random memory
>> was being output. It's interesting that that's on the one windows (not
>> cygwin) critter that does the 128bit dance...
>>
>> Yes, sorry, the e+000 stuff is from 959277a. This patch has visibly
>> broken that:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-21%2003%3A01%3A21
>
> That's the stuff looking like random memory that I talk about above...
>

If you look at it closely, it's actually not random memory. At least in
the first 2 failing tests which are not obfuscated by aggregates on top
of aggregates. It looks like first NumericDigit is ok and the second one
is corrupted (there are only 2 NumericDigits in those numbers). Of
course the conversion to Numeric is done from the end so it looks like
only the last computation/pointer change/something stays ok while the
rest got corrupted.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 12:59:22
Message-ID: 550EBCAA.6030004@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/22/2015 11:47 AM, Petr Jelinek wrote:
> On 22/03/15 10:35, Andres Freund wrote:
>>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-21%2003%3A01%3A21
>>>
>>
>> That's the stuff looking like random memory that I talk about above...
>>
>
> If you look at it closely, it's actually not random memory. At least in
> the first 2 failing tests which are not obfuscated by aggregates on top
> of aggregates. It looks like first NumericDigit is ok and the second one
> is corrupted (there are only 2 NumericDigits in those numbers). Of
> course the conversion to Numeric is done from the end so it looks like
> only the last computation/pointer change/something stays ok while the
> rest got corrupted.

Would this mean the bug is most likely somewhere in
int128_to_numericvar()? Maybe that version of gcc has a bug in some
__int128 operator or I messed up the code there somehow.

Andreas


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Andres Freund <andres(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 21:00:13
Message-ID: 550F2D5D.9080700@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22/03/15 13:59, Andreas Karlsson wrote:
> On 03/22/2015 11:47 AM, Petr Jelinek wrote:
>> On 22/03/15 10:35, Andres Freund wrote:
>>>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-21%2003%3A01%3A21
>>>>
>>>>
>>>
>>> That's the stuff looking like random memory that I talk about above...
>>>
>>
>> If you look at it closely, it's actually not random memory. At least in
>> the first 2 failing tests which are not obfuscated by aggregates on top
>> of aggregates. It looks like first NumericDigit is ok and the second one
>> is corrupted (there are only 2 NumericDigits in those numbers). Of
>> course the conversion to Numeric is done from the end so it looks like
>> only the last computation/pointer change/something stays ok while the
>> rest got corrupted.
>
> Would this mean the bug is most likely somewhere in
> int128_to_numericvar()? Maybe that version of gcc has a bug in some
> __int128 operator or I messed up the code there somehow.
>

Yeah that's what I was thinking also, and I went through the function
and didn't find anything suspicious (besides it's same as the 64 bit
version except for the int128 use).

It really might be some combination of arithmetic + the conversion to
16bit uint bug in the compiler. Would be worth to try to produce test
case and try it standalone maybe?

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 21:20:49
Message-ID: 20150322212049.GE3241@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-03-22 22:00:13 +0100, Petr Jelinek wrote:
> On 22/03/15 13:59, Andreas Karlsson wrote:
> >Would this mean the bug is most likely somewhere in
> >int128_to_numericvar()? Maybe that version of gcc has a bug in some
> >__int128 operator or I messed up the code there somehow.

Yes, or a compiler bug. I looked through the code again and found and
fixed one minor bug, but that doesnt' explain the problem.

> Yeah that's what I was thinking also, and I went through the function and
> didn't find anything suspicious (besides it's same as the 64 bit version
> except for the int128 use).
>
> It really might be some combination of arithmetic + the conversion to 16bit
> uint bug in the compiler. Would be worth to try to produce test case and try
> it standalone maybe?

A compiler bug looks like a not unreasonable bet at this point. I've
asked Andrew to recompile without optimizations... We'll see whether
that makes a difference. Jacana is the only compiler with gcc 4.8.1 (or
is it 4.8.0? there's conflicting output). There've been a number of bugs
fixed that affect loop unrolling and such.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andreas Karlsson <andreas(at)proxel(dot)se>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 21:28:01
Message-ID: 550F33E1.9000406@proxel.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 03/22/2015 10:20 PM, Andres Freund wrote:
> Yes, or a compiler bug. I looked through the code again and found and
> fixed one minor bug, but that doesnt' explain the problem.

Strangely enough the bug looks like it has been fixed at jacana after
your fix of my copypasto. Maybe the bug is random, or your fix really
fixed it.

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-22%2020%3A41%3A54

Andreas


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andreas Karlsson <andreas(at)proxel(dot)se>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 21:32:32
Message-ID: 20150322213232.GF3241@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-03-22 22:20:49 +0100, Andres Freund wrote:
> A compiler bug looks like a not unreasonable bet at this point. I've
> asked Andrew to recompile without optimizations... We'll see whether
> that makes a difference. Jacana is the only compiler with gcc 4.8.1 (or
> is it 4.8.0? there's conflicting output). There've been a number of bugs
> fixed that affect loop unrolling and such.

And indeed, a run of jacana with -O0 fixed it. As you can see in
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-22%2020%3A41%3A54
, it still fails, but just because of the exp problem.

My guess it's
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2015-03-22%2020%3A41%3A54

Do we feel the need to find a workaround if this if it's indeed 4.8.0
and 4.8.1 (and some other releases at the same time) that are
problematic? Given that gcc 4.8.2 was released October 16, 2013 I'm
inclined not to.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Andreas Karlsson <andreas(at)proxel(dot)se>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)heroku(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using 128-bit integers for sum, avg and statistics aggregates
Date: 2015-03-22 21:34:21
Message-ID: 20150322213421.GG3241@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2015-03-22 22:28:01 +0100, Andreas Karlsson wrote:
> On 03/22/2015 10:20 PM, Andres Freund wrote:
> >Yes, or a compiler bug. I looked through the code again and found and
> >fixed one minor bug, but that doesnt' explain the problem.
>
> Strangely enough the bug looks like it has been fixed at jacana after your
> fix of my copypasto. Maybe the bug is random, or your fix really fixed it.

I bet it's actually the -O0 Andrew added at my behest.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services