Lists: | pgsql-hackers |
---|
From: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Multiple calls to json_array_elements slow down nonlinearly |
Date: | 2014-02-03 01:09:11 |
Message-ID: | 52EEEC37.9040305@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi all
I ran into this performance report over the weekend:
http://stackoverflow.com/q/21507127/398670
and wanted to mention it here.
json_array_elements seems to spend about 97% of its time in
MemoryContextReset(...).
Given dummy data:
test=> create table g as select (select json_agg(random()) json from
generate_series(0, (r1*4)::int)) from (select random() r1 from
generate_series(1,20000)) aux;
Compare these two methods of producing the same result set:
test=> create table q as select json->x foo from g,
generate_series(0,json_array_length(g.json)-1) x;
SELECT 60103
Time: 157.702 ms
test=> create table p as select json_array_elements(json) foo from g;
SELECT 60103
Time: 4254.494 ms
The issue is reproducible and scales non-linearly with row count, which
is a clue. At 100k rows input, the lateral query takes 592ms vs 179959ms
(3 minutes) for json_array_elements.
Whenever I grab a backtrace it looks like:
> #0 0x000000000072dd7d in MemoryContextReset (context=0x2a02dc90) at mcxt.c:130
> #1 0x000000000072dd90 in MemoryContextResetChildren (context=<optimized out>) at mcxt.c:155
> #2 MemoryContextReset (context=0x1651220) at mcxt.c:131
> #3 0x00000000005817f9 in ExecScan (node=node(at)entry=0x164e1a0, accessMtd=accessMtd(at)entry=0x592040 <SeqNext>, recheckMtd=recheckMtd(at)entry=0x592030 <SeqRecheck>)
> at execScan.c:155
(Sorry for the quote-paste; only way to make @#$ Thunderbird not wrap
mail, I need to switch clients or fix that).
"perf top" on the process shows:
96.92% postgres [.] MemoryContextReset
0.15% [kernel] [k] cpuacct_account_field
0.09% [kernel] [k] update_cfs_rq_blocked_load
0.09% postgres [.] AllocSetAlloc
At a guess, we're looking at a case where a new child context is created
at every call, so every MemoryContextResetChildren call has to deal with
more child contexts. I'm going to take a quick look now, I just wanted
to get this written up before I got sidetracked.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: bugfix patch for json_array_elements |
Date: | 2014-02-03 01:54:32 |
Message-ID: | 52EEF6D8.4080409@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 02/03/2014 09:09 AM, Craig Ringer wrote:
> At a guess, we're looking at a case where a new child context is created
> at every call, so every MemoryContextResetChildren call has to deal with
> more child contexts.
That would be "yes". After a short run, I see 32849 lines like:
json_array_elements temporary cxt: 8192 total in 1 blocks; 8160 free (0
chunks); 32 used
under the context:
PortalMemory: 8192 total in 1 blocks
PortalHeapMemory: 7168 total in 3 blocks
ExecutorState: 65600 total in 4 blocks
ExprContext: 8192 total in 1 blocks
json_array_elements temporary cxt: 8192 total in 1 blocks;
8160 free (0 chunks); 32 used
The attached patch deletes the context after use, bringing performance
back into line. It should be backpatched to 9.3.
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-leaked-memory-context-in-json_array_each-and-jso.patch | text/x-patch | 1.4 KB |
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: bugfix patch for json_array_elements |
Date: | 2014-02-03 14:05:19 |
Message-ID: | 52EFA21F.60707@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 02/02/2014 08:54 PM, Craig Ringer wrote:
> On 02/03/2014 09:09 AM, Craig Ringer wrote:
>
>> At a guess, we're looking at a case where a new child context is created
>> at every call, so every MemoryContextResetChildren call has to deal with
>> more child contexts.
> That would be "yes". After a short run, I see 32849 lines like:
>
> json_array_elements temporary cxt: 8192 total in 1 blocks; 8160 free (0
> chunks); 32 used
>
> under the context:
>
> PortalMemory: 8192 total in 1 blocks
> PortalHeapMemory: 7168 total in 3 blocks
> ExecutorState: 65600 total in 4 blocks
> ExprContext: 8192 total in 1 blocks
> json_array_elements temporary cxt: 8192 total in 1 blocks;
> 8160 free (0 chunks); 32 used
>
>
> The attached patch deletes the context after use, bringing performance
> back into line. It should be backpatched to 9.3.
[...]
> + MemoryContextDelete(state->tmp_cxt);
> + state->tmp_cxt = NULL;
> +
> PG_RETURN_NULL();
Hmm. I guess I was assuming that the tmp_cxt would be cleaned up at the
end of the function since it's a child of the CurrentMemoryContext. But
if I got that wrong I'm happy to fix it. I don't see the point in
setting state->tmp_cxt to NULL since it's about to go out of scope anyway.
cheers
andrew
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
Cc: | Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: bugfix patch for json_array_elements |
Date: | 2014-02-03 16:12:00 |
Message-ID: | 31857.1391443920@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 02/02/2014 08:54 PM, Craig Ringer wrote:
>> The attached patch deletes the context after use, bringing performance
>> back into line. It should be backpatched to 9.3.
> Hmm. I guess I was assuming that the tmp_cxt would be cleaned up at the
> end of the function since it's a child of the CurrentMemoryContext.
The executor does MemoryContextReset, not
MemoryContextResetAndDeleteChildren, on the per-tuple context. That means
that child contexts will be reset, not deleted. I seem to recall some
discussions about changing that, or even redefining MemoryContextReset to
automatically delete child contexts; but it would take a fair amount of
research to be sure such a change was safe.
regards, tom lane
From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Craig Ringer <craig(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: bugfix patch for json_array_elements |
Date: | 2014-02-03 16:42:12 |
Message-ID: | 52EFC6E4.5080009@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 02/03/2014 11:12 AM, Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>> On 02/02/2014 08:54 PM, Craig Ringer wrote:
>>> The attached patch deletes the context after use, bringing performance
>>> back into line. It should be backpatched to 9.3.
>> Hmm. I guess I was assuming that the tmp_cxt would be cleaned up at the
>> end of the function since it's a child of the CurrentMemoryContext.
> The executor does MemoryContextReset, not
> MemoryContextResetAndDeleteChildren, on the per-tuple context. That means
> that child contexts will be reset, not deleted. I seem to recall some
> discussions about changing that, or even redefining MemoryContextReset to
> automatically delete child contexts; but it would take a fair amount of
> research to be sure such a change was safe.
>
Good to know.
Is it worth a note in src/backend/utils/mmgr/README so people are warned
against making the same mistake I did? Both of these are set-returning
functions operating in materialize mode - not sure if that makes any
difference.
cheers
andrew