From: | "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | "'Amit Khandekar'" <amit(dot)khandekar(at)enterprisedb(dot)com> |
Cc: | "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>, "'pgsql-hackers'" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan |
Date: | 2013-11-25 08:07:34 |
Message-ID: | 00e601cee9b5$65ceff20$316cfd60$@etsuro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
From: Amit Khandekar [mailto:amit(dot)khandekar(at)enterprisedb(dot)com]
> On 1 November 2013 16:32, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> From: Fujii Masao [mailto:masao(dot)fujii(at)gmail(dot)com]
>>> I'm not sure if it's good idea to show the number of the fetches because it
>>> seems difficult to tune work_mem from that number. How can we calculate how
>>> much to increase work_mem to avoid lossy bitmap from the number of the fetches
>>> in EXPLAIN output?
>> We can calculate that from the following equation in tbm_create():
>> nbuckets = maxbytes /
>> (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
>> + sizeof(Pointer) + sizeof(Pointer)),
>> where maxbytes is the size of memory used for the hashtable in a TIDBitmap,
>> designated by work_mem, and nbuckets is the estimated number of hashtable
>> entries we can have within maxbytes. From this, the size of work_mem within
>> which we can have every hashtable entry as an exact bitmap is calculated as
>> follows:
>> work_mem = (the number of exact pages + the number of lossy pages) *
>> (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
>> + sizeof(Pointer) + sizeof(Pointer)) /
>> (1024 * 1024).
> I am yet to give more thought on the above formula
> (particularly exact_pages + lossy_pages), but I was also wondering if the user
> would indeed be able to figure out the above way to estimate the memory, or the
> explain itself should show the estimated memory required for the bitmap. For
> hash joins we do show the memory taken by the hash table in show_hash_info(). We
> can show the memory requirement in addition to the number of exact/lossy pages.
Thank you for the review!
Reconsidering that, I wish to know your opinion. The patch shows the number of exact/lossy pages that has been fetched in a bitmap heap scan. But the number varies with the fraction of tuples to be retrieved like the following.
postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on demo (cost=2187.35..101419.96 rows=102919 width=42) (actual time=23.684..1302.382 rows=99803 loops=1)
Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
Rows Removed by Index Recheck: 6279502
Heap Blocks: exact=1990 lossy=59593
-> Bitmap Index Scan on demo_col2_idx (cost=0.00..2161.62 rows=102919 width=0) (actual time=23.330..23.330 rows=99803 loops=1)
Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
Total runtime: 1311.949 ms
(7 rows)
postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02 LIMIT 5000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=2187.35..7008.26 rows=5000 width=42) (actual time=23.543..86.093 rows=5000 loops=1)
-> Bitmap Heap Scan on demo (cost=2187.35..101419.96 rows=102919 width=42) (actual time=23.542..85.196 rows=5000 loops=1)
Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
Rows Removed by Index Recheck: 312179
Heap Blocks: exact=99 lossy=2963
-> Bitmap Index Scan on demo_col2_idx (cost=0.00..2161.62 rows=102919 width=0) (actual time=23.189..23.189 rows=99803 loops=1)
Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
Total runtime: 86.626 ms
(8 rows)
So, my question is, we should show the number of exact/lossy pages in a TIDBitmap, not the number of these pages that has been fetched in the bitmap heap scan?
Thanks,
Best regards,
Etsuro Fujita
From | Date | Subject | |
---|---|---|---|
Next Message | Christian Ullrich | 2013-11-25 08:42:54 | Re: PostgreSQL Service on Windows does not start. ~ "is not a valid Win32 application" |
Previous Message | Alexey Vasiliev | 2013-11-25 05:15:57 | Re[2]: [HACKERS] Re[2]: [HACKERS] Connect from background worker thread to database |