Re: querying with index on jsonb slower than standard column. Why?

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tim Dudgeon <tdudgeon(dot)ml(at)gmail(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: querying with index on jsonb slower than standard column. Why?
Date: 2014-12-08 16:44:52
Message-ID: 5485D584.8080105@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance pgsql-sql

On 12/08/2014 07:50 AM, Adrian Klaver wrote:
> On 12/08/2014 07:46 AM, Tom Lane wrote:
>> Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> writes:
>>> On 12/07/2014 05:28 PM, Tom Lane wrote:
>>>> I don't see any particular difference ...
>>
>>> Running the above on my machine I do see the slow down the OP reports. I
>>> ran it several times and it stayed around 3.5x.
>>
>> Interesting. A couple of points that might be worth checking:
>>
>> * I tried this on a 64-bit build, whereas you were evidently using
>> 32-bit.
>
> My laptop is 64-bit, so when I get a chance I will setup the test there
> and run it to see what happens.
>
>>

Seems work_mem is the key:

postgres(at)test=# select version();
version

-------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.4rc1 on x86_64-unknown-linux-gnu, compiled by gcc (SUSE
Linux) 4.8.1 20130909 [gcc-4_8-branch revision 202388], 64-bit
(1 row)

The default:

postgres(at)test=# show work_mem ;
work_mem
----------
4MB
(1 row)

postgres(at)test=# \timing
Timing is on.
postgres(at)test=# explain analyze select count(*) from json_test where
(data->>'assay1_ic50')::float > 90
test-# and (data->>'assay2_ic50')::float < 10;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=198713.45..198713.46 rows=1 width=0) (actual
time=8564.799..8564.799 rows=1 loops=1)
-> Bitmap Heap Scan on json_test (cost=36841.42..198465.53
rows=99168 width=0) (actual time=1043.226..8550.183 rows=99781 loops=1)
Recheck Cond: ((((data ->> 'assay1_ic50'::text))::double
precision > 90::double precision) AND (((data ->>
'assay2_ic50'::text))::double precision < 10::double precision))
Rows Removed by Index Recheck: 7236280
Heap Blocks: exact=30252 lossy=131908
-> BitmapAnd (cost=36841.42..36841.42 rows=99168 width=0)
(actual time=1034.738..1034.738 rows=0 loops=1)
-> Bitmap Index Scan on idx_data_json_assay1_ic50
(cost=0.00..18157.96 rows=983136 width=0) (actual time=513.878..513.878
rows=1001237 loops=1)
Index Cond: (((data ->>
'assay1_ic50'::text))::double precision > 90::double precision)
-> Bitmap Index Scan on idx_data_json_assay2_ic50
(cost=0.00..18633.62 rows=1008691 width=0) (actual time=502.396..502.396
rows=1000930 loops=1)
Index Cond: (((data ->>
'assay2_ic50'::text))::double precision < 10::double precision)
Planning time: 121.962 ms
Execution time: 8565.609 ms
(12 rows)

Time: 9110.408 ms
postgres(at)test=# explain analyze select count(*) from json_test where
assay1_ic50 > 90 and assay2_ic50 < 10;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=197225.91..197225.92 rows=1 width=0) (actual
time=1848.769..1848.769 rows=1 loops=1)
-> Bitmap Heap Scan on json_test (cost=36841.41..196977.99
rows=99168 width=0) (actual time=405.110..1839.299 rows=99781 loops=1)
Recheck Cond: ((assay1_ic50 > 90::double precision) AND
(assay2_ic50 < 10::double precision))
Rows Removed by Index Recheck: 7236280
Heap Blocks: exact=30252 lossy=131908
-> BitmapAnd (cost=36841.41..36841.41 rows=99168 width=0)
(actual time=397.138..397.138 rows=0 loops=1)
-> Bitmap Index Scan on idx_data_col_assay1_ic50
(cost=0.00..18157.96 rows=983136 width=0) (actual time=196.304..196.304
rows=1001237 loops=1)
Index Cond: (assay1_ic50 > 90::double precision)
-> Bitmap Index Scan on idx_data_col_assay2_ic50
(cost=0.00..18633.62 rows=1008691 width=0) (actual time=182.845..182.845
rows=1000930 loops=1)
Index Cond: (assay2_ic50 < 10::double precision)
Planning time: 0.212 ms
Execution time: 1848.814 ms
(12 rows)

Time: 1849.570 ms

****************************************************************************

Set work_mem up:

postgres(at)test=# set work_mem='16MB';
SET
Time: 0.143 ms
postgres(at)test=# show work_mem;
work_mem
----------
16MB
(1 row)

postgres(at)test=# explain analyze select count(*) from json_test where
(data->>'assay1_ic50')::float > 90
and (data->>'assay2_ic50')::float < 10;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=198713.45..198713.46 rows=1 width=0) (actual
time=861.413..861.413 rows=1 loops=1)
-> Bitmap Heap Scan on json_test (cost=36841.42..198465.53
rows=99168 width=0) (actual time=588.969..852.720 rows=99781 loops=1)
Recheck Cond: ((((data ->> 'assay1_ic50'::text))::double
precision > 90::double precision) AND (((data ->>
'assay2_ic50'::text))::double precision < 10::double precision))
Heap Blocks: exact=77216
-> BitmapAnd (cost=36841.42..36841.42 rows=99168 width=0)
(actual time=564.927..564.927 rows=0 loops=1)
-> Bitmap Index Scan on idx_data_json_assay1_ic50
(cost=0.00..18157.96 rows=983136 width=0) (actual time=265.318..265.318
rows=1001237 loops=1)
Index Cond: (((data ->>
'assay1_ic50'::text))::double precision > 90::double precision)
-> Bitmap Index Scan on idx_data_json_assay2_ic50
(cost=0.00..18633.62 rows=1008691 width=0) (actual time=256.225..256.225
rows=1000930 loops=1)
Index Cond: (((data ->>
'assay2_ic50'::text))::double precision < 10::double precision)
Planning time: 0.126 ms
Execution time: 861.453 ms
(11 rows)

Time: 861.965 ms
postgres(at)test=# explain analyze select count(*) from json_test where
assay1_ic50 > 90 and assay2_ic50 < 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=197225.91..197225.92 rows=1 width=0) (actual
time=848.410..848.410 rows=1 loops=1)
-> Bitmap Heap Scan on json_test (cost=36841.41..196977.99
rows=99168 width=0) (actual time=578.360..839.659 rows=99781 loops=1)
Recheck Cond: ((assay1_ic50 > 90::double precision) AND
(assay2_ic50 < 10::double precision))
Heap Blocks: exact=77216
-> BitmapAnd (cost=36841.41..36841.41 rows=99168 width=0)
(actual time=554.387..554.387 rows=0 loops=1)
-> Bitmap Index Scan on idx_data_col_assay1_ic50
(cost=0.00..18157.96 rows=983136 width=0) (actual time=263.961..263.961
rows=1001237 loops=1)
Index Cond: (assay1_ic50 > 90::double precision)
-> Bitmap Index Scan on idx_data_col_assay2_ic50
(cost=0.00..18633.62 rows=1008691 width=0) (actual time=247.268..247.268
rows=1000930 loops=1)
Index Cond: (assay2_ic50 < 10::double precision)
Planning time: 0.128 ms
Execution time: 848.453 ms
(11 rows)

*****************************************************************

Then set it back:

postgres(at)test=# set work_mem='4MB';
SET
Time: 0.213 ms
postgres(at)test=# show work_mem ;
work_mem
----------
4MB
(1 row)

postgres(at)test=# explain analyze select count(*) from json_test where
(data->>'assay1_ic50')::float > 90
and (data->>'assay2_ic50')::float < 10;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=198713.45..198713.46 rows=1 width=0) (actual
time=6607.650..6607.650 rows=1 loops=1)
-> Bitmap Heap Scan on json_test (cost=36841.42..198465.53
rows=99168 width=0) (actual time=400.598..6594.442 rows=99781 loops=1)
Recheck Cond: ((((data ->> 'assay1_ic50'::text))::double
precision > 90::double precision) AND (((data ->>
'assay2_ic50'::text))::double precision < 10::double precision))
Rows Removed by Index Recheck: 7236280
Heap Blocks: exact=30252 lossy=131908
-> BitmapAnd (cost=36841.42..36841.42 rows=99168 width=0)
(actual time=392.622..392.622 rows=0 loops=1)
-> Bitmap Index Scan on idx_data_json_assay1_ic50
(cost=0.00..18157.96 rows=983136 width=0) (actual time=191.598..191.598
rows=1001237 loops=1)
Index Cond: (((data ->>
'assay1_ic50'::text))::double precision > 90::double precision)
-> Bitmap Index Scan on idx_data_json_assay2_ic50
(cost=0.00..18633.62 rows=1008691 width=0) (actual time=183.107..183.107
rows=1000930 loops=1)
Index Cond: (((data ->>
'assay2_ic50'::text))::double precision < 10::double precision)
Planning time: 0.126 ms
Execution time: 6607.692 ms
(12 rows)

Time: 6608.197 ms
postgres(at)test=# explain analyze select count(*) from json_test where
assay1_ic50 > 90 and assay2_ic50 < 10;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=197225.91..197225.92 rows=1 width=0) (actual
time=1836.383..1836.383 rows=1 loops=1)
-> Bitmap Heap Scan on json_test (cost=36841.41..196977.99
rows=99168 width=0) (actual time=396.414..1826.818 rows=99781 loops=1)
Recheck Cond: ((assay1_ic50 > 90::double precision) AND
(assay2_ic50 < 10::double precision))
Rows Removed by Index Recheck: 7236280
Heap Blocks: exact=30252 lossy=131908
-> BitmapAnd (cost=36841.41..36841.41 rows=99168 width=0)
(actual time=388.498..388.498 rows=0 loops=1)
-> Bitmap Index Scan on idx_data_col_assay1_ic50
(cost=0.00..18157.96 rows=983136 width=0) (actual time=187.928..187.928
rows=1001237 loops=1)
Index Cond: (assay1_ic50 > 90::double precision)
-> Bitmap Index Scan on idx_data_col_assay2_ic50
(cost=0.00..18633.62 rows=1008691 width=0) (actual time=182.743..182.743
rows=1000930 loops=1)
Index Cond: (assay2_ic50 < 10::double precision)
Planning time: 0.109 ms
Execution time: 1836.422 ms
(12 rows)

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2014-12-08 16:56:04 Re: querying with index on jsonb slower than standard column. Why?
Previous Message Adrian Klaver 2014-12-08 15:50:18 Re: querying with index on jsonb slower than standard column. Why?

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2014-12-08 16:56:04 Re: querying with index on jsonb slower than standard column. Why?
Previous Message Adrian Klaver 2014-12-08 15:50:18 Re: querying with index on jsonb slower than standard column. Why?