Re: Explain analyze getrusage tracking

Lists: pgsql-hackers
From: Greg Stark <stark(at)mit(dot)edu>
To: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Explain analyze getrusage tracking
Date: 2010-11-15 02:33:27
Message-ID: AANLkTikdqGshABDgkALheNZna7HBSQwzy39hDe1cn_nf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

This is an update to my earlier patch to add getrusage resource
tracking to EXPLAIN ANALYZE.

With this patch you get something like:

QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Seq Scan on i (cost=0.00..6919.44 rows=262144 width=101) (actual
time=17.240..1123.751 rows=262144 loops=1)
Resources: sys=210.000ms user=430.000ms read=33.6MB
Buffers: shared read=4298
Total runtime: 1548.651 ms
(4 rows)

The main change is to make it work under Windows. At least I think the
changes should make it work under Windows, I haven't been able to test
it. Actually I'm not to happy with the way I did it, I would be more
inclined to hack the getrusagestub,h definition of struct rusage to
have an instr_time in it so that we can use the same macros directly.
But that's more changes than I would be happy making without being
able to compile them to test them.

--
greg

Attachment Content-Type Size
explain-getrusage-v3.diff text/x-patch 16.9 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain analyze getrusage tracking
Date: 2010-11-16 02:43:05
Message-ID: AANLkTik960v3Cpok_GYSQXiRsy4uuQbN7XO32zfEwjP3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 14, 2010 at 9:33 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> This is an update to my earlier patch to add getrusage resource
> tracking to EXPLAIN ANALYZE.
>
> With this patch you get something like:
>
>                                                  QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
>  Seq Scan on i  (cost=0.00..6919.44 rows=262144 width=101) (actual
> time=17.240..1123.751 rows=262144 loops=1)
>   Resources: sys=210.000ms user=430.000ms read=33.6MB
>   Buffers: shared read=4298
>  Total runtime: 1548.651 ms
> (4 rows)
>
> The main change is to make it work under Windows. At least I think the
> changes should make it work under Windows, I haven't been able to test
> it. Actually I'm not to happy with the way I did it, I would be more
> inclined to hack the getrusagestub,h definition of struct rusage to
> have an instr_time in it so that we can use the same macros directly.
> But that's more changes than I would be happy making without being
> able to compile them to test them.

I don't really think these changes to the INSTR macros make much
sense. The macros don't really add much notational convenience;
they're mostly wrappers to make the WIN32 and non-WIN32 cases work
similarly for the instrumentation stuff, so hacking them up to use
them for this doesn't seem like it adds anything. Just do whatever it
is you need to do, or define macros locally in explain.c.

It doesn't make much sense to me to normalize the memory for this
output to a variable unit when the other memory values we use in
explain.c are still going to be printed as kB. I think we should just
print it in kB and call it good. Alternatively, we could apply the
same normalization algorithm across the board, but I don't think
that's as good.

I continue to feel strongly that the choice of EXPLAIN format should
only affect the format, not the choice of information to be displayed.
Using the verbose option to control how much data the resource option
prints is, I think, not a good idea. If you want to have two modes,
one for partial rusage data and one for full rusage data, you can just
as easily implement EXPLAIN (RESOURCE [PARTIAL|FULL]). I believe that
the existing grammar is adequate to support that; you'd just need to
write the appropriate DefElem-parsing code. But personally I'd just
print the whole kit and kaboodle regardless.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Stark <stark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain analyze getrusage tracking
Date: 2010-11-16 17:19:53
Message-ID: AANLkTi=zXySegth60RvEMJgyKP7qg95gaVTQLovXBuTx@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 2:43 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I don't really think these changes to the INSTR macros make much
> sense.  The macros don't really add much notational convenience;
> they're mostly wrappers to make the WIN32 and non-WIN32 cases work
> similarly for the instrumentation stuff, so hacking them up to use
> them for this doesn't seem like it adds anything.  Just do whatever it
> is you need to do, or define macros locally in explain.c.

Well they're going to be just like the macros in instr_time and I
didn't really want to duplicate them. What's really goofy here is that
on Windows the getrusage portability wrapper is actually consing up
those times from FILETIMEs which I believe are actually just like the
instr_time data type that we're already using in those macros so it's
going of doing a silly dance converting to struct timeval just so it
can use macros that aren't convenient at all on Windows.

I definitely agree this is awkward. I would love to find a cleaner
solution here. I'll try copying the macros and see if that works out
more cleanly.

>
> It doesn't make much sense to me to normalize the memory for this
> output to a variable unit when the other memory values we use in
> explain.c are still going to be printed as kB.  I think we should just
> print it in kB and call it good.  Alternatively, we could apply the
> same normalization algorithm across the board, but I don't think
> that's as good.

I think we should have a project policy of always printing memory and
disk usage in kB, MB, GB etc unless they're functions returning an
integer intended for machine use. Effectively this is the dual of
accepting units on all our memory gucs. I don't know about others but
I find it pretty hard to read things like 1234567kB and compare it to
125765kB in my head.

> I continue to feel strongly that the choice of EXPLAIN format should
> only affect the format, not the choice of information to be displayed.

I don't have a good feeling for what should appear in
XML/JSON/whetever. I thought I was just implementing the consensus
which was that there wasn't much point in not including the maximum
level of detail because the consumer could decide what to display. But
having it be the same as the human readable output is also reasonable.
Whatever we're doing elsewhere I'll make this the consistent.

>  Using the verbose option to control how much data the resource option
> prints is, I think, not a good idea.  If you want to have two modes,
> one for partial rusage data and one for full rusage data, you can just
> as easily implement EXPLAIN (RESOURCE [PARTIAL|FULL]).  I believe that
> the existing grammar is adequate to support that; you'd just need to
> write the appropriate DefElem-parsing code.  But personally I'd just
> print the whole kit and kaboodle regardless.

This is a separate question. The numbers I'm outputing without VERBOSE
now are the numbers that I understand and which relate to the other
numbers in explain. My intent was that as we understand how the other
numbers relate to Postgres functioning we'll learn which ones to
include. But mostly they're hard to relate to Postgres and the query
execution in any way and just noise that are hard to explain to users.
I'm not even sure which ones it makes sense to sum over time -- is the
integral virtual memory usage over time something that it makes sense
to sum over time? What units is it in?

I don't see why VERBOSE isn't the right key to use. The other things
VERBOSE toggles are also extra detail that's usually useless but might
be useful if you're interested in the inner workings. Things like
which fields are being included at each level of the plan for example.

--
greg


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain analyze getrusage tracking
Date: 2010-11-16 19:44:17
Message-ID: AANLkTikDfL2QTfmZyuJta_Es7q396_xnutDJVmGMpFcU@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 12:19 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Tue, Nov 16, 2010 at 2:43 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I don't really think these changes to the INSTR macros make much
>> sense. The macros don't really add much notational convenience;
>> they're mostly wrappers to make the WIN32 and non-WIN32 cases work
>> similarly for the instrumentation stuff, so hacking them up to use
>> them for this doesn't seem like it adds anything. Just do whatever it
>> is you need to do, or define macros locally in explain.c.
>
> Well they're going to be just like the macros in instr_time and I
> didn't really want to duplicate them. What's really goofy here is that
> on Windows the getrusage portability wrapper is actually consing up
> those times from FILETIMEs which I believe are actually just like the
> instr_time data type that we're already using in those macros so it's
> going of doing a silly dance converting to struct timeval just so it
> can use macros that aren't convenient at all on Windows.
>
> I definitely agree this is awkward. I would love to find a cleaner
> solution here. I'll try copying the macros and see if that works out
> more cleanly.

OK.

>> It doesn't make much sense to me to normalize the memory for this
>> output to a variable unit when the other memory values we use in
>> explain.c are still going to be printed as kB. I think we should just
>> print it in kB and call it good. Alternatively, we could apply the
>> same normalization algorithm across the board, but I don't think
>> that's as good.
>
> I think we should have a project policy of always printing memory and
> disk usage in kB, MB, GB etc unless they're functions returning an
> integer intended for machine use. Effectively this is the dual of
> accepting units on all our memory gucs. I don't know about others but
> I find it pretty hard to read things like 1234567kB and compare it to
> 125765kB in my head.

We have no other place in the system that takes 12345678kB and
converts it inexactly to MB or GB.

rhaas=# set work_mem to '1048576kB';
SET
rhaas=# show work_mem;
work_mem
----------
1GB
(1 row)

rhaas=# set work_mem to '1048577kB';
SET
rhaas=# show work_mem;
work_mem
-----------
1048577kB
(1 row)

We could decide that for text-format EXPLAIN output, an inexact
conversion is OK, but it's still not OK to do it for the new fields
you're adding and not for the existing fields (see the sort and hash
instrumentation). Personally, I'm not really convinced that making
such a change has a lot of value, and I think it should be submitted
as a separate patch and discussed separately, rather than being rolled
in here. But if we are going to change it then we at least need to be
consistent.

>> Using the verbose option to control how much data the resource option
>> prints is, I think, not a good idea. If you want to have two modes,
>> one for partial rusage data and one for full rusage data, you can just
>> as easily implement EXPLAIN (RESOURCE [PARTIAL|FULL]). I believe that
>> the existing grammar is adequate to support that; you'd just need to
>> write the appropriate DefElem-parsing code. But personally I'd just
>> print the whole kit and kaboodle regardless.
>
> This is a separate question. The numbers I'm outputing without VERBOSE
> now are the numbers that I understand and which relate to the other
> numbers in explain. My intent was that as we understand how the other
> numbers relate to Postgres functioning we'll learn which ones to
> include. But mostly they're hard to relate to Postgres and the query
> execution in any way and just noise that are hard to explain to users.
> I'm not even sure which ones it makes sense to sum over time -- is the
> integral virtual memory usage over time something that it makes sense
> to sum over time? What units is it in?

I suspect that the ru_maxrss, ru_ixrss, ru_idrss, and ru_isrss values
are useless to us for EXPLAIN purposes because it sounds like they
don't increment over time. The others presumably do, so it makes
sense to show 'em if the system is collecting them.

> I don't see why VERBOSE isn't the right key to use. The other things
> VERBOSE toggles are also extra detail that's usually useless but might
> be useful if you're interested in the inner workings. Things like
> which fields are being included at each level of the plan for example.

Yeah, VERBOSE is kind of a catch-all for things that we don't have
individual flags for. But I think it's better for each piece of data
to depend on one setting, rather than a combination of two or more
settings. Otherwise you end up being forced to use VERBOSE and then
you get this deluge of output. I'd actually sort of like to remove
some things from VERBOSE and give them their own settings, rather than
adding more. The fact that VERBOSE turns on "Output:" is particularly
annoying.

...Robert


From: Greg Stark <stark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain analyze getrusage tracking
Date: 2010-11-16 19:53:53
Message-ID: AANLkTinA9scb6+RyQwAmza57f1P9KLM1e5ebAfrJ4W+M@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 11:38 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I think we should have a project policy of always printing memory and
>> disk usage in kB, MB, GB etc unless they're functions returning an
>> integer intended for machine use.
>
> rhaas=# set work_mem to '1048577kB';

Interesting. Though in this case I'll mention the raw data is in 512
byte increments so that would imply having to print .5 sometimes.

> We could decide that for text-format EXPLAIN output, an inexact
> conversion is OK, but it's still not OK to do it for the new fields
> you're adding and not for the existing fields (see the sort and hash
> instrumentation).  Personally, I'm not really convinced that making
> such a change has a lot of value, and I think it should be submitted
> as a separate patch and discussed separately, rather than being rolled
> in here.  But if we are going to change it then we at least need to be
> consistent.

Well I proposed that as a separate patch back pre-9.0 and it was
problematic. Having a policy of doing x doesn't mean we have to switch
everything to x or do nothing. We can have a policy that it's better
to do x and then go around fixing things as we find them.

But fine, I'll separate that into a separate patch.

> I suspect that the ru_maxrss, ru_ixrss, ru_idrss, and ru_isrss values
> are useless to us for EXPLAIN purposes because it sounds like they
> don't increment over time.  The others presumably do, so it makes
> sense to show 'em if the system is collecting them.

I'm not sure, the "integral" part might mean it is ok to sum. I've
never seen it explained anywhere exactly what these represent. But the
other question is whether they have anything to do with the plan
nodes. If Postgres is taking lots of memory or receiving signals or
receiving sysv messages (we don't even use sysv messages so that would
be very strange) it doesn't necessarily have anything to do with the
executor algorithm. They're probably just not very interesting stats
to present to the user.

> Yeah, VERBOSE is kind of a catch-all for things that we don't have
> individual flags for.  But I think it's better for each piece of data
> to depend on one setting, rather than a combination of two or more
> settings.  Otherwise you end up being forced to use VERBOSE and then
> you get this deluge of output.  I'd actually sort of like to remove
> some things from VERBOSE and give them their own settings, rather than
> adding more.  The fact that VERBOSE turns on "Output:" is particularly
> annoying.

I tend to think it's "don't be clever about showing me just the useful
stuff, include whatever you've got even if you think it's not useful".
I would consider it a bug if there's anything that requires VERBOSE
and a user finds is relevant to a fixing a user problem (as opposed to
debugging a postgres bug).

I'm concerned about the converse problem. The "I why do I have to
include two dozen flags to get postgres to actually include
everything?". Along with the "try turning on this flag and resending
the plan. Hm, oops forgot a flag resend it again with this too. Oh
that's still not including something we need try it a third time with
this flag" etc.

--
greg


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain analyze getrusage tracking
Date: 2010-11-16 20:07:07
Message-ID: AANLkTinZv-HHp1Sv1FcLhxW=02dz59cEt89R6svu2mP-@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 16, 2010 at 2:53 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
>> Yeah, VERBOSE is kind of a catch-all for things that we don't have
>> individual flags for.  But I think it's better for each piece of data
>> to depend on one setting, rather than a combination of two or more
>> settings.  Otherwise you end up being forced to use VERBOSE and then
>> you get this deluge of output.  I'd actually sort of like to remove
>> some things from VERBOSE and give them their own settings, rather than
>> adding more.  The fact that VERBOSE turns on "Output:" is particularly
>> annoying.
>
> I tend to think it's "don't be clever about showing me just the useful
> stuff, include whatever you've got even if you think it's not useful".
> I would consider it a bug if there's anything that requires VERBOSE
> and a user finds is relevant to a fixing a user problem (as opposed to
> debugging a postgres bug).
>
> I'm concerned about the converse problem. The "I why do I have to
> include two dozen flags to get postgres to actually include
> everything?". Along with the "try turning on this flag and resending
> the plan. Hm, oops forgot a flag resend it again with this too. Oh
> that's still not including something we need try it a third time with
> this flag" etc.

Well, the reason's pretty obvious in this case: each of ANALYZE,
BUFFERS, and RESOURCE adds a huge amount of incremental overhead. I
wouldn't object to having a flag that says "just turn on absolutely
everything you've got". Then you could just say:

EXPLAIN (THECRAPOUTOFIT) query...

In any case, this seems like an argument for making RESOURCE print
everything we think might ever be useful, rather than just some subset
of it. A rule that says "print some of the RESOURCE information when
RESOURCE is specified but the rest only if VERBOSE is also specified
or if a non-text output format is used" seems to have exactly the sort
of ickiness you're complaining about; I'd rather make it all or
nothing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain analyze getrusage tracking
Date: 2010-11-22 12:40:53
Message-ID: AANLkTinVPqBk-83ZTm8RZ7VXFvYQ6-WF17335ro7FgHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 15, 2010 at 03:33, Greg Stark <stark(at)mit(dot)edu> wrote:
> This is an update to my earlier patch to add getrusage resource
> tracking to EXPLAIN ANALYZE.
>
> With this patch you get something like:
>
>                                                  QUERY PLAN
> --------------------------------------------------------------------------------------------------------------
>  Seq Scan on i  (cost=0.00..6919.44 rows=262144 width=101) (actual
> time=17.240..1123.751 rows=262144 loops=1)
>   Resources: sys=210.000ms user=430.000ms read=33.6MB
>   Buffers: shared read=4298
>  Total runtime: 1548.651 ms
> (4 rows)
>
> The main change is to make it work under Windows. At least I think the
> changes should make it work under Windows, I haven't been able to test
> it. Actually I'm not to happy with the way I did it, I would be more
> inclined to hack the getrusagestub,h definition of struct rusage to
> have an instr_time in it so that we can use the same macros directly.
> But that's more changes than I would be happy making without being
> able to compile them to test them.

I tried building this under windows, and got a bunch of errors.

First and easiest - you need to rename IOCOUNTERS to IO_COUNTERS in
getrusage.c :P

But then I get a number of:
c:\pgsql\src\include\portability/instr_time.h(133) : error C2371: 'instr_time' :
redefinition; different basic types

and
.\src\backend\utils\adt\pgstatfuncs.c(1345) : error C2039: 'QuadPart' : is not a
member of 'timeval'
C:\Program Files\Microsoft SDKs\Windows\v6.1\include\winsock2.h(176) : s
ee declaration of 'timeval'

which believe are related to the same issue. Haven't looked close
enough to figure out what you actually intend for it to be :-)

Finally, a number of:
c:\pgsql\src\include\executor/instrument.h(19) : fatal error C1083: Cannot open
include file: 'sys/resource.h': No such file or directory

include files simply doesn't exist on Windows. Hiding it behind an
#ifdef complains about fields missing in struct rusage in some cases
and lack of existing rusage definition in others. I think you need
includes of pg_rusage.h, which will make sure it brings in
rusagestub.h when necessary and sys/resource.h when it's there?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


From: Greg Stark <stark(at)mit(dot)edu>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: "<pgsql-hackers(at)postgresql(dot)org>" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Explain analyze getrusage tracking
Date: 2010-11-22 14:38:52
Message-ID: AANLkTimypN_BCj0Bsbdwm2u1HT4tDO-0cwAo6Aqo=t-y@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 22, 2010 at 12:40 PM, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
> I tried building this under windows, and got a bunch of errors.

Thanks!

--
greg