Re: [PATCH] Fix conversion for Decimal arguments in plpython functions

From: Szymon Guz <mabewlun(at)gmail(dot)com>
To: Steve Singer <steve(at)ssinger(dot)info>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Fix conversion for Decimal arguments in plpython functions
Date: 2013-06-25 09:42:22
Message-ID: CAFjNrYuZL74uRAey2_7Dk3CYNfhO7mzS3a_8cPYntds3=V1HrA@mail.gmail.com
Lists: pgsql-hackers

On 25 June 2013 05:16, Steve Singer <steve(at)ssinger(dot)info> wrote:

>
> One concern I have is that this patch makes pl/python functions involving
> numerics more than 3 times as slow as before.
>
>
> create temp table b(a numeric);
> insert into b select generate_series(1,10000);
>
> create or replace function x(a numeric,b numeric) returns numeric as $$
> if a==None:
>     return b
> return a+b
> $$ language plpythonu;
> create aggregate sm(basetype=numeric, sfunc=x,stype=numeric);
>
>
> test=# select sm(a) from b;
> sm
> ----------
> 50005000
> (1 row)
>
> Time: 565.650 ms
>
> versus before the patch this was taking in the range of 80ms.
>
> Would it be faster to call numeric_send instead of numeric_out and then
> convert the sequence of Int16's to a tuple of digits that can be passed
> into the Decimal constructor? I think this is worth trying and testing,
>
>
Hi,
thanks for all the remarks.

I don't think I can do anything to speed up the code. What I've found so far is:

I cannot use the fields of NumericVar directly in my code, so before spending time on something not sensible, I tried to find out whether using the tuple constructor for decimal.Decimal would be faster. I've changed the function to something like this:

static PyObject *
PLyDecimal_FromNumeric(PLyDatumToOb *arg, Datum d)
{
    PyObject   *digits;
    PyObject   *tuple;
    PyObject   *value;

    /* digits tuple: (1, 4, 1, 4) */
    digits = PyTuple_New(4);
    PyTuple_SetItem(digits, 0, PyInt_FromLong(1));
    PyTuple_SetItem(digits, 1, PyInt_FromLong(4));
    PyTuple_SetItem(digits, 2, PyInt_FromLong(1));
    PyTuple_SetItem(digits, 3, PyInt_FromLong(4));

    /* (sign, digits, exponent) = (0, (1, 4, 1, 4), -3) */
    tuple = PyTuple_New(3);
    PyTuple_SetItem(tuple, 0, PyInt_FromLong(0));
    PyTuple_SetItem(tuple, 1, digits);      /* steals the reference */
    PyTuple_SetItem(tuple, 2, PyInt_FromLong(-3));

    value = PyObject_CallFunctionObjArgs(PLy_decimal_ctor_global,
                                         tuple, NULL);
    Py_DECREF(tuple);

    return value;
}

Yes, it returns the same value regardless of the params. The idea is to call
Python code like:

Decimal((0, (1, 4, 1, 4), -3))

which is simply:

Decimal('1.414')
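A quick sanity check in plain Python that the tuple form and the string form build the same value:

```python
from decimal import Decimal

# tuple form: (sign, digits, exponent)
tup_form = Decimal((0, (1, 4, 1, 4), -3))
str_form = Decimal('1.414')

print(tup_form)              # 1.414
print(tup_form == str_form)  # True
```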

Unfortunately this is not faster; it is as slow as with the string
constructor.
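For completeness, Steve's numeric_send suggestion would boil down to expanding PostgreSQL's base-10000 digit groups into the per-digit tuple that Decimal expects. A pure-Python sketch of that conversion (the function name and parameter layout are my own assumptions mirroring numeric's internal representation, not an actual PostgreSQL API):

```python
from decimal import Decimal

NBASE = 10000   # PostgreSQL numeric stores digits in base 10000

def parts_to_decimal(sign, weight, nbase_digits):
    # sign: 0 positive, 1 negative (same convention as Decimal's tuple form)
    # weight: power of NBASE of the first digit group
    # nbase_digits: sequence of int16 values in [0, 9999]
    dec_digits = []
    for d in nbase_digits:
        # each base-10000 group expands to four decimal digits
        dec_digits.extend((d // 1000, d // 100 % 10, d // 10 % 10, d % 10))
    # exponent of the last expanded decimal digit
    exponent = 4 * (weight - len(nbase_digits) + 1)
    return Decimal((sign, tuple(dec_digits), exponent))

# 1.414 is stored as digit groups [1, 4140] with weight 0
print(parts_to_decimal(0, 0, [1, 4140]))   # equal to Decimal('1.414')
```

Whether this beats numeric_out plus the string constructor would still need measuring; the bottleneck below suggests it would not help much.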

I've checked the speed of decimal.Decimal in pure Python. For this I used
a simple function, similar to yours:

def x(a, b):
    if a is None:
        return b
    return a + b

I've run the tests using simple ints:

def test():
    a = 0
    for i in xrange(0, 10000):
        a += x(a, i)

for a in xrange(1, 100):
    test()

Later I ran the same test, but converting the arguments to
Decimals:

from decimal import Decimal

def x(a, b):
    if a is None:
        return b
    return a + b

def test():
    a = 0
    for i in xrange(0, 10000):
        a += x(Decimal(a), Decimal(i))

for a in xrange(1, 100):
    test()
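The same comparison can be reproduced with timeit, which avoids the hand-rolled outer loop (a sketch; I scaled the repetitions down to 10, and absolute numbers will vary by machine and Python version — note that Python 3's C implementation of decimal is far faster than the pure-Python one current in 2013):

```python
import timeit
from decimal import Decimal

def x(a, b):
    if a is None:
        return b
    return a + b

def test_int():
    a = 0
    for i in range(10000):
        a += x(a, i)

def test_decimal():
    a = 0
    for i in range(10000):
        a += x(Decimal(a), Decimal(i))

# time 10 full runs of each variant
t_int = timeit.timeit(test_int, number=10)
t_dec = timeit.timeit(test_decimal, number=10)
print("int:     %.3fs" % t_int)
print("decimal: %.3fs" % t_dec)
```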

Each test was run 100 times to reduce the impact of test initialization.

The results for both files are:
int: 0.697s
decimal: 38.859s

This gives an average time per test() run of roughly:
int: 7ms
decimal: 389ms

For me the problem is slow code on Python's side: the Decimal constructor
is simply slow, and there is nothing I can do about that on the
Postgres side.

I will send a patch with the fixes later.

thanks,
Szymon
