Re: cache type info in json_agg and friends

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cache type info in json_agg and friends
Date: 2015-09-15 16:36:05
Message-ID: 55F848F5.1080603@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/14/2015 04:24 PM, Andrew Dunstan wrote:
>
>
> On 09/14/2015 03:42 PM, Teodor Sigaev wrote:
>>> Currently, json_agg, jsonb_agg, json_object_agg and jsonb_object_agg do
>>> type classification on their arguments on each call to the transition
>>> function. This is quite unnecessary, as the argument types won't
>>> change.
>>> This patch remedies the defect by caching the necessary values in the
>>> aggregate state object.
>> +1
>>
>>>
>>> While this doesn't change the performance much, since these functions
>>> are essentially dominated by other bits of the processing, I think
>>> it is
>>> nevertheless worth doing.
>> Agree
>>
>> After quick observation of your patch, why don't you use FmgrInfo
>> instead of JsonAggState.val_output_func/JsonAggState.key_category?
>> FmgrInfo could be filled by fmgr_info_cxt() in aggcontext memory
>> context. Suppose, direct usage of FmgrInfo with FunctionCall a bit
>> faster than OidFunctionCall.
>
>
> Well, we need the category to help data_to_json(b) do its work.
> Nevertheless, it might be doable to pass an FmgrInfo* to
> datum_to_json. I'll see what I can do.
>
>

The real problem about this is that in the most important cases to
improve (composite_to_json(b) and the array processing functions) we'll
still end up calling fmgr_info for every attribute of every record and
for every array, although not for every array element, which is what
OidOutputFunctionCall does for us anyway.

It's not obvious to me how to fix that, and before I put lots of effort
into it I want to do some profiling to see where the time is actually
being spent - I don't want to add a whole lot of code for a very
marginal improvement.

One thought I did have that might be worth testing is that in the case
of jsonb all the micro operations might be killing us, and that it might
well be faster to generate a JSON string in the aggregates and then
parse that into jsonb in the final function.

cheers

andrew

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message YUriy Zhuravlev 2015-09-15 16:43:28 Re: Move PinBuffer and UnpinBuffer to atomics
Previous Message Joe Conway 2015-09-15 16:28:38 Re: row_security GUC, BYPASSRLS