Re: NDirectFileRead and Write

Lists: pgsql-hackers
From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: NDirectFileRead and Write
Date: 2008-08-01 07:23:54
Message-ID: 20080801155433.7776.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
for counting reads and writes in BufFile. They are defined, but not used
now. BufFile is used for tuple sorting or materializing, so we could use
NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.

We can get the values of NDirectFileRead/Write when we enable
log_statement_stats, log_parser_stats, log_planner_stats or
log_executor_stats. (They show always 0 for NDirectFile R/W now.)

I have a plan to store the values shown by log_xxx_stats into
pg_stat_statements, that is per-query statistics information collector
I'm developing. The combination of NDirectFile R/W and pg_stat_statements
have an advantage to detect which sql uses large tuple sorting.
And after we find such queries, we can turn trace_sort on and examine
how amount of work_mem is needed by such queries. We could save time
for tuning queries and reading server logs.

-- The patch would be very trivial ;-)

Index: src/backend/storage/file/buffile.c
===================================================================
--- src/backend/storage/file/buffile.c (HEAD)
+++ src/backend/storage/file/buffile.c (working copy)
@@ -238,6 +238,7 @@
file->nbytes = 0;
file->offsets[file->curFile] += file->nbytes;
/* we choose not to advance curOffset here */
+ NDirectFileRead++;
}

/*
@@ -300,6 +301,7 @@
file->offsets[file->curFile] += bytestowrite;
file->curOffset += bytestowrite;
wpos += bytestowrite;
+ NDirectFileWrite++;
}
file->dirty = false;

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NDirectFileRead and Write
Date: 2008-08-05 06:26:08
Message-ID: 20080805150620.A197.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here is a patch to user NDirectFileRead/Write counters to get I/O counts
in BufFile module. We can see the counters when log_statement_stats is on.

The information is different from trace_sort; trace_sort shows used blocks
in external sort, and log_statement_stats shows how many I/Os are submitted
during sorts.

I wrote:
> I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
> for counting reads and writes in BufFile. They are defined, but not used
> now. BufFile is used for tuple sorting or materializing, so we could use
> NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.

=# SET client_min_messages = log;
=# SET trace_sort = on;
=# SET log_statement_stats = on;
=# EXPLAIN ANALYZE SELECT * FROM generate_series(1, 1000000) AS i ORDER BY i;
LOG: begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
LOG: switching to external sort with 7 tapes: CPU 0.09s/0.26u sec elapsed 0.35 sec
LOG: performsort starting: CPU 0.48s/1.68u sec elapsed 2.20 sec
LOG: finished writing final run 1 to tape 0: CPU 0.48s/1.70u sec elapsed 2.21 sec
LOG: performsort done: CPU 0.48s/1.70u sec elapsed 2.21 sec
LOG: external sort ended, 2444 disk blocks used: CPU 0.79s/2.23u sec elapsed 3.06 sec
LOG: QUERY STATISTICS
DETAIL: ! system usage stats:
! 3.078000 elapsed 2.234375 user 0.812500 system sec
! [3.328125 user 1.281250 sys total]
! buffer usage stats:
! Shared blocks: 0 read, 0 written, buffer hit rate = 0.00%
! Local blocks: 0 read, 0 written, buffer hit rate = 0.00%
! Direct blocks: 5375 read, 5374 written
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=62.33..64.83 rows=1000 width=4) (actual time=2221.485..2743.831 rows=1000000 loops=1)
Sort Key: i
Sort Method: external sort Disk: 19552kB
-> Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=349.065..892.907 rows=1000000 loops=1)
Total runtime: 3087.305 ms
(5 rows)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
NDirectFileReadWrite.patch application/octet-stream 3.0 KB

From: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
To: "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NDirectFileRead and Write
Date: 2008-09-06 08:14:00
Message-ID: e08cc0400809060114k7adcce5cl9b3236bc8e454c3f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2008/8/5 ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>:
> Here is a patch to user NDirectFileRead/Write counters to get I/O counts
> in BufFile module. We can see the counters when log_statement_stats is on.
>
> The information is different from trace_sort; trace_sort shows used blocks
> in external sort, and log_statement_stats shows how many I/Os are submitted
> during sorts.
>
> I wrote:
>> I'd like to use NDirectFileRead and NDirectFileWrite statistics counters
>> for counting reads and writes in BufFile. They are defined, but not used
>> now. BufFile is used for tuple sorting or materializing, so we could use
>> NDirectFileRead/Write to retrieve how many I/Os are done in temp tablespace.
>
> =# SET client_min_messages = log;
> =# SET trace_sort = on;
> =# SET log_statement_stats = on;
> =# EXPLAIN ANALYZE SELECT * FROM generate_series(1, 1000000) AS i ORDER BY i;
> LOG: begin tuple sort: nkeys = 1, workMem = 1024, randomAccess = f
> LOG: switching to external sort with 7 tapes: CPU 0.09s/0.26u sec elapsed 0.35 sec
> LOG: performsort starting: CPU 0.48s/1.68u sec elapsed 2.20 sec
> LOG: finished writing final run 1 to tape 0: CPU 0.48s/1.70u sec elapsed 2.21 sec
> LOG: performsort done: CPU 0.48s/1.70u sec elapsed 2.21 sec
> LOG: external sort ended, 2444 disk blocks used: CPU 0.79s/2.23u sec elapsed 3.06 sec
> LOG: QUERY STATISTICS
> DETAIL: ! system usage stats:
> ! 3.078000 elapsed 2.234375 user 0.812500 system sec
> ! [3.328125 user 1.281250 sys total]
> ! buffer usage stats:
> ! Shared blocks: 0 read, 0 written, buffer hit rate = 0.00%
> ! Local blocks: 0 read, 0 written, buffer hit rate = 0.00%
> ! Direct blocks: 5375 read, 5374 written
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------
> Sort (cost=62.33..64.83 rows=1000 width=4) (actual time=2221.485..2743.831 rows=1000000 loops=1)
> Sort Key: i
> Sort Method: external sort Disk: 19552kB
> -> Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=349.065..892.907 rows=1000000 loops=1)
> Total runtime: 3087.305 ms
> (5 rows)
>
> Regards,
> ---
> ITAGAKI Takahiro
> NTT Open Source Software Center
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

Though I'm not quite good around buffers, I reviewed this patch and
found that it seems strange about the declaring position of the two
variables. In ResetBufferUsage() of bufmgr.c it says:

void
ResetBufferUsage(void)
{
BufferHitCount = 0;
ReadBufferCount = 0;
BufferFlushCount = 0;
LocalBufferHitCount = 0;
ReadLocalBufferCount = 0;
LocalBufferFlushCount = 0;
NDirectFileRead = 0;
NDirectFileWrite = 0;
}

so I guess all of these variables should be defined together but
actually you put the two in buffile.h while the others in
buf_iinnternals.h. Is there clear reason for that?

Regards,

--
Hitoshi Harada


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: "Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NDirectFileRead and Write
Date: 2008-09-08 09:23:17
Message-ID: 20080908180748.983F.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


"Hitoshi Harada" <umi(dot)tanuki(at)gmail(dot)com> wrote:

> so I guess all of these variables should be defined together but
> actually you put the two in buffile.h while the others in
> buf_internals.h. Is there clear reason for that?

That's because buffile.c includes buffile.h, but not buf_internals.h .
NDirectFileRead/Writes are counters for BufFile in the patch
and don't depend on bufmgr.

It might be cleaner if we had something like storage/bufstats.h
and declared all counter variables in it, but it seems to be an overkill.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NDirectFileRead and Write
Date: 2008-09-11 18:34:32
Message-ID: 6512.1221158072@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Here is a patch to user NDirectFileRead/Write counters to get I/O counts
> in BufFile module. We can see the counters when log_statement_stats is on.

Couple thoughts here:

* Let's rename them BufFileReadCount and BufFileWriteCount to reflect
their actual purpose.

* If the extern's are moved to buffile.h, I think the definitions of the
variables themselves should move to buffile.c. However, that would
imply including buffile.h in bufmgr.c which is a bit ugly from a
modularity standpoint. In any case I agree that the current arrangement
with execdebug.h declaring variables defined in bufmgr.c is just weird.

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NDirectFileRead and Write
Date: 2008-09-12 10:24:34
Message-ID: 20080912191753.5B98.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> * Let's rename them BufFileReadCount and BufFileWriteCount to reflect
> their actual purpose.

It looks good. No one uses NDirectFileRead/Write variables now,
so we can rename it freely.

BTW, we have the word 'Direct' in ShowBufferUsage().
Can I keep it as-is?

ShowBufferUsage()
! Shared blocks: ...
! Local blocks: ...
! Direct blocks: <R> read, <W> written

> * In any case I agree that the current arrangement
> with execdebug.h declaring variables defined in bufmgr.c is just weird.

Ok.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: NDirectFileRead and Write
Date: 2008-09-12 11:45:51
Message-ID: 2111.1221219951@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> BTW, we have the word 'Direct' in ShowBufferUsage().
> Can I keep it as-is?

> ShowBufferUsage()
> ! Shared blocks: ...
> ! Local blocks: ...
> ! Direct blocks: <R> read, <W> written

Good point. Seems like it should be changed, but I am not sure to what.
Ideas anyone?

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NDirectFileRead and Write
Date: 2008-09-16 01:56:29
Message-ID: 20080916104812.7E68.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Here is a revised patch to renewal NDirectFileRead/Write.

> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > * Let's rename them BufFileReadCount and BufFileWriteCount to reflect
> > their actual purpose.
> > * In any case I agree that the current arrangement
> > with execdebug.h declaring variables defined in bufmgr.c is just weird.

- NDirectFile{Read|Write} are renamed to BufFile{Read|Write}Count.
- They are still declared in execdebug.h and buffile.c includes it.

I did not touch messages in ShowBufferUsage() in the patch.
I think it still has meaning because BufFile counters are
kinds of direct block access.

> ShowBufferUsage()
> ! Shared blocks: ...
> ! Local blocks: ...
> ! Direct blocks: <R> read, <W> written

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
BufFileCount.patch application/octet-stream 3.9 KB

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: NDirectFileRead and Write
Date: 2008-09-16 09:12:03
Message-ID: 48CF7863.5080903@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>> BTW, we have the word 'Direct' in ShowBufferUsage().
>> Can I keep it as-is?
>
>> ShowBufferUsage()
>> ! Shared blocks: ...
>> ! Local blocks: ...
>> ! Direct blocks: <R> read, <W> written
>
> Good point. Seems like it should be changed, but I am not sure to what.
> Ideas anyone?

How about "Temp blocks"? It might be slightly confusing that access to
temporary tables shows up in "Local blocks", rather than "Temp blocks",
though. But then again, temp_tablespaces already controls where the
files are created, among other things, so there's some precedence to
calling those things "temp".

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Subject: Re: NDirectFileRead and Write
Date: 2008-09-17 13:21:33
Message-ID: 10637.1221657693@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> - NDirectFile{Read|Write} are renamed to BufFile{Read|Write}Count.
> - They are still declared in execdebug.h and buffile.c includes it.

After some thought I concluded that the least ugly way to do this
was to declare and define the variables in the same places as the
other counters that are printed by ShowBufferUsage. It's not any
worse from a modularity standpoint than the other alternatives we
considered, and it seems to satisfy the principle of least
surprise a little better. Committed it that way.

> I did not touch messages in ShowBufferUsage() in the patch.

I didn't change the message either. Heikki's suggestion of "temp
blocks" doesn't seem very good to me because of the likelihood
of confusion with temp tables. I don't have a better alternative
offhand.

regards, tom lane