review: Reduce palloc's in numeric operations

Lists: pgsql-hackers
From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp
Subject: review: Reduce palloc's in numeric operations
Date: 2012-11-19 13:17:33
Message-ID: CAFj8pRCLpJzBGEL4rZjEtgjD8esiKXtN9wzsnfA6_ESpvnFP5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello all

I tested this patch and I can confirm, so this patch can increase
speed about 16-22% (depends on IO waits, load type).

I tested real query (anonymyzed)

SELECT SUM(COALESCE((a * b * c * (d + ((1 -(d)) * (1 -(e))))),0)) m1 FROM tab;

-- patched
4493 26.5591 postgres slot_deform_tuple
1327 7.8442 postgres AllocSetAlloc
1313 7.7614 postgres ExecMakeFunctionResultNoSets
1105 6.5319 postgres set_var_from_num_nocopy
924 5.4620 postgres make_result
637 3.7654 postgres mul_var
635 3.7536 postgres numeric_mul
560 3.3103 postgres MemoryContextAlloc
405 2.3940 postgres AllocSetFree
389 2.2995 postgres ExecEvalScalarVarFast
332 1.9625 postgres slot_getsomeattrs
322 1.9034 postgres numeric_add
313 1.8502 postgres add_abs
304 1.7970 postgres pfree
238 1.4069 postgres slot_getattr
216 1.2768 postgres numeric_sub
200 1.1822 postgres heap_tuple_untoast_attr
183 1.0818 postgres strip_var
180 1.0640 postgres ExecEvalConst
173 1.0226 postgres alloc_var
172 1.0167 postgres check_stack_depth

-- origin
4419 22.8325 postgres slot_deform_tuple
2041 10.5456 postgres AllocSetAlloc
1248 6.4483 postgres set_var_from_num
1186 6.1279 postgres ExecMakeFunctionResultNoSets
886 4.5779 postgres MemoryContextAlloc
856 4.4229 postgres make_result
757 3.9113 postgres numeric_mul
731 3.7770 postgres AllocSetFree
625 3.2293 postgres mul_var
601 3.1053 postgres alloc_var
545 2.8160 postgres pfree
503 2.5989 postgres free_var
458 2.3664 postgres slot_getsomeattrs
378 1.9531 postgres numeric_add
360 1.8601 postgres add_abs
336 1.7361 postgres ExecEvalScalarVarFast
266 1.3744 postgres slot_getattr
221 1.1419 postgres numeric_sub

Review:

1) this patch was clearly applied and compilation was without warning

2) regress tests - All 133 tests passed.

4) don't see any memory leaks there (verified by following code)

CREATE OR REPLACE FUNCTION public.fx(_m integer)
RETURNS void
LANGUAGE plpgsql
AS $function$
declare m numeric = 10;
n numeric = 20022.222;
r numeric;
begin
for i in 1.._m
loop
r := m * n;
end loop;
end;
$function$

postgres=# select fx(10000000);
fx
----

(1 row)

Time: 5312.623 ms --- original ( 4798.103 ms -- patched 10% speedup)

5) it respect PostgreSQL's coding standards

6) we would to accept this patch - it can carry 10% speedup
calculations with numerics.

Regards

Pavel Stehule


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp
Subject: Re: review: Reduce palloc's in numeric operations
Date: 2012-11-20 17:00:46
Message-ID: 50ABB73E.9070702@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 19.11.2012 15:17, Pavel Stehule wrote:
> I tested this patch and I can confirm, so this patch can increase
> speed about 16-22% (depends on IO waits, load type).

Thanks for the review.

I spent some more time on this, continuing with the thought that perhaps
it would be better if get_str_from_var() didn't scribble on its input. I
ended up with the attached patch, which contains a bunch of small tweaks:

* Add init_var_from_num() function. This is the same as
set_var_from_num_nocopy in the original patch, but it doesn't require
the caller to have called init_var() first. IMHO this makes the calling
code slightly more readable. Also, it's now more evident what these vars
are: the digits array points to original array in the original Datum,
but 'buf' is NULL. This is the same arrangement that's used in the
constant NumericVars like const_zero.

* get_str_from_var() no longer scribbles on its input. I noticed that
it's always called with a dscale that comes from the input var itself.
In other words, the rounding was unnecessary to begin with. I simply
removed the dscale argument and the round_var() call from
get_str_from_var(). If a someone wants to display a string with
different dscale in the future, he can simply call round_var() before
get_str_from_var().

* numericvar_to_int8() no long scribbles on its input either. It creates
a temporary copy to avoid that. To compensate, the callers no longer
need to create a temporary copy, so the net # of pallocs is the same,
but this is nicer. (there's room for a micro-optimization to avoid
making the temporary copy numericvar_to_int8() when the argument is
already suitably rounded - I left that our for now, dunno if it would
make any difference in practice)

* use a constant for the number 10 in get_str_from_var_sci(), when
calculating 10^exponent. Saves a palloc() and some cycles to convert
integer 10 to numeric.

Comments? Assuming no-one sees some fatal flaw in this, I'll commit this
tomorrow.

- Heikki

Attachment Content-Type Size
fasternumeric_v3-heikki.patch text/x-diff 16.8 KB

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp
Subject: Re: review: Reduce palloc's in numeric operations
Date: 2012-11-20 17:32:59
Message-ID: CAFj8pRBE0uOY61Z8UJQ1Aw10=AhWH9V4Y1n3n89CHcN3PifOyA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/11/20 Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>:
> On 19.11.2012 15:17, Pavel Stehule wrote:
>>
>> I tested this patch and I can confirm, so this patch can increase
>> speed about 16-22% (depends on IO waits, load type).
>
>
> Thanks for the review.
>
> I spent some more time on this, continuing with the thought that perhaps it
> would be better if get_str_from_var() didn't scribble on its input. I ended
> up with the attached patch, which contains a bunch of small tweaks:
>
> * Add init_var_from_num() function. This is the same as
> set_var_from_num_nocopy in the original patch, but it doesn't require the
> caller to have called init_var() first. IMHO this makes the calling code
> slightly more readable. Also, it's now more evident what these vars are: the
> digits array points to original array in the original Datum, but 'buf' is
> NULL. This is the same arrangement that's used in the constant NumericVars
> like const_zero.
>
> * get_str_from_var() no longer scribbles on its input. I noticed that it's
> always called with a dscale that comes from the input var itself. In other
> words, the rounding was unnecessary to begin with. I simply removed the
> dscale argument and the round_var() call from get_str_from_var(). If a
> someone wants to display a string with different dscale in the future, he
> can simply call round_var() before get_str_from_var().
>
> * numericvar_to_int8() no long scribbles on its input either. It creates a
> temporary copy to avoid that. To compensate, the callers no longer need to
> create a temporary copy, so the net # of pallocs is the same, but this is
> nicer. (there's room for a micro-optimization to avoid making the temporary
> copy numericvar_to_int8() when the argument is already suitably rounded - I
> left that our for now, dunno if it would make any difference in practice)
>
> * use a constant for the number 10 in get_str_from_var_sci(), when
> calculating 10^exponent. Saves a palloc() and some cycles to convert integer
> 10 to numeric.
>
> Comments? Assuming no-one sees some fatal flaw in this, I'll commit this
> tomorrow.

I have no objections

all regression tests passed, no warnings - has a sense

Regards

Pavel

>
> - Heikki


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp
Subject: Re: review: Reduce palloc's in numeric operations
Date: 2012-11-20 19:44:26
Message-ID: 12022.1353440666@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> * Add init_var_from_num() function. This is the same as
> set_var_from_num_nocopy in the original patch, but it doesn't require
> the caller to have called init_var() first. IMHO this makes the calling
> code slightly more readable. Also, it's now more evident what these vars
> are: the digits array points to original array in the original Datum,
> but 'buf' is NULL. This is the same arrangement that's used in the
> constant NumericVars like const_zero.

init_var_from_num's header comment could use some more work. The
statement that one "must not modify the returned var" is false in some
sense, since for instance numeric_ceil() does that. The key is that you
have to replace the digit buffer not modify it in-place, but most
computational routines do that anyway. Also it might be worth pointing
out explicitly that free_var() is not required unless the var is
modified subsequently. (There are instances of both cases, and it might
not be clear to the reader why.)

> * get_str_from_var() no longer scribbles on its input. I noticed that
> it's always called with a dscale that comes from the input var itself.
> In other words, the rounding was unnecessary to begin with.

Hmm, it may have been necessary once upon a time, but I agree getting
rid of the rounding step is a win now. Suggest adding a comment though
that the var is displayed to the number of digits indicated by its dscale.

Looks good otherwise.

regards, tom lane


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp
Subject: Re: review: Reduce palloc's in numeric operations
Date: 2012-11-21 14:17:50
Message-ID: 50ACE28E.1060404@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 20.11.2012 21:44, Tom Lane wrote:
> init_var_from_num's header comment could use some more work. The
> statement that one "must not modify the returned var" is false in some
> sense, since for instance numeric_ceil() does that. The key is that you
> have to replace the digit buffer not modify it in-place, but most
> computational routines do that anyway. Also it might be worth pointing
> out explicitly that free_var() is not required unless the var is
> modified subsequently. (There are instances of both cases, and it might
> not be clear to the reader why.)

Added those points to the comment.

>> * get_str_from_var() no longer scribbles on its input. I noticed that
>> it's always called with a dscale that comes from the input var itself.
>> In other words, the rounding was unnecessary to begin with.
>
> Hmm, it may have been necessary once upon a time, but I agree getting
> rid of the rounding step is a win now. Suggest adding a comment though
> that the var is displayed to the number of digits indicated by its dscale.
>
> Looks good otherwise.

Committed, thanks to everyone involved.

- Heikki