Re: Sampling Profler for Postgres

Lists: pgsql-hackers
From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Sampling Profler for Postgres
Date: 2009-03-09 04:55:33
Message-ID: 20090309125146.913C.52131E4D@oss.ntt.co.jp

Hello,

I think we need two types of profilers: SQL-based and resource-based.
We have some SQL-based profilers like slow-query logs
(log_min_duration_statement) and contrib/pg_stat_statements in 8.4.
For resource-based profilers, we have DTrace probes[1] and continue to
extend them[2], but unfortunately DTrace only works on Solaris and limited
platforms. Also, it is not so easy for typical users to write profilers
using DTrace without performance degradation.

[1] http://developer.postgresql.org/pgdocs/postgres/dynamic-trace.html
[2] http://archives.postgresql.org/pgsql-hackers/2009-03/msg00226.php

Therefore, I'd like to propose a profiler with a sampling approach for 8.5.
The attached patch is an experimental model of the profiler.
Each backend reports its condition in PgBackendStatus.st_condition,
and the stats collector process polls them every second.
This is an extension of the st_waiting field, which reports the locking
condition in pg_stat_activity. The advantages are portability
and low overhead.

We need to consider how this coexists with DTrace. I added code to
push/pop conditions at the same places as TRACE_POSTGRESQL_*_START/DONE().
So we could merge the DTrace code and the profiler code, or implement one of
them on top of the other.
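
The sampling scheme described above can be sketched roughly as follows. This is a toy model in Python, not the patch's C code: the Backend class, push_condition/pop_condition, sample() and the default "CPU" condition are all hypothetical names chosen for illustration; only st_condition and the condition labels come from this message.

```python
from collections import Counter


class Backend:
    """Hypothetical stand-in for one backend's status slot."""

    def __init__(self):
        self._stack = []            # outer conditions saved by push_condition
        self.st_condition = "CPU"   # assumed default when nothing else is set

    def push_condition(self, condition):
        # Remember the outer condition so that it can be restored later.
        self._stack.append(self.st_condition)
        self.st_condition = condition

    def pop_condition(self):
        # Restore the condition that was active before the matching push.
        self.st_condition = self._stack.pop()


def sample(backends, counts):
    """One polling pass: record what every backend is doing right now."""
    for backend in backends:
        counts[backend.st_condition] += 1


backends = [Backend(), Backend()]
counts = Counter()

backends[0].push_condition("XLog:Write")
sample(backends, counts)             # tick 1: one WAL writer, one on CPU
backends[1].push_condition("Lock:Transaction")
sample(backends, counts)             # tick 2: one WAL writer, one lock waiter
backends[0].pop_condition()
sample(backends, counts)             # tick 3: one on CPU, one lock waiter

print(dict(counts))
```

The backends only overwrite a field; all counting happens in the poller, which is where the low overhead would come from under these assumptions.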

I would emphasize that an official profiler is required in this area
because it enables users to share knowledge and documentation;
information sharing would be difficult if they used home-made profilers.

Comments welcome.

----
Here is a sample output of the profiler with pgbench on Windows:

$ pgbench -i -s3
$ psql -c "SELECT pg_save_profiles()"
$ pgbench -c4 -T60 -n
transaction type: TPC-B (sort of)
tps = 401.510694

$ psql -c "SELECT * FROM pg_diff_profiles"
 profid |      profname      | percent
--------+--------------------+---------
     19 | XLog:Write         |   23.04  <- means WAL contention
     46 | LWLock:WALWrite    |   23.04  <- same as the above
     32 | Lock:Transaction   |   22.61  <- conflicts on row locks
     15 | Network:Recv       |    7.83
     21 | Data:Stat          |    4.35  <- lseek() is slow on Windows
      7 | CPU:Execute        |    3.91
      3 | CPU                |    3.91
      1 | Idle:InTransaction |    2.61
      5 | CPU:Rewrite        |    1.74
     16 | Network:Send       |    1.74
      6 | CPU:Plan           |    1.74
     31 | Lock:Tuple         |    1.74
      4 | CPU:Parse          |    0.87
     11 | CPU:Commit         |    0.87
(14 rows)
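
One plausible reading of the percent column, sketched in Python: each condition's share of the samples collected since pg_save_profiles() was called. This is an assumption about how pg_diff_profiles works, not something taken from the patch; diff_profiles() and the counter values are made up for illustration.

```python
def diff_profiles(saved, current):
    """Return (name, percent) rows for the samples taken since `saved`."""
    delta = {name: current[name] - saved.get(name, 0) for name in current}
    total = sum(delta.values())
    if total == 0:
        return []   # nothing was sampled since the snapshot
    rows = [(name, round(100.0 * n / total, 2))
            for name, n in delta.items() if n > 0]
    # Largest share first, like the output above.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical cumulative sample counts before and after a benchmark run.
saved = {"XLog:Write": 10, "Lock:Transaction": 5, "CPU": 5}
current = {"XLog:Write": 16, "Lock:Transaction": 8, "CPU": 6}

for name, percent in diff_profiles(saved, current):
    print(f"{name:<18} {percent:6.2f}")
```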

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment             Content-Type              Size
profiler_0309.tar.gz   application/octet-stream  13.4 KB
additional_script.sql  application/octet-stream  629 bytes

From: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-09 14:01:14
Message-ID: 1236607274.4655.18.camel@analise3.cresoltec.com.br

On Mon, 2009-03-09 at 13:55 +0900, ITAGAKI Takahiro wrote:
> Therefore, I'd like to propose a profiler with a sampling approach for 8.5.
> The attached patch is an experimental model of the profiler.
> Each backend reports its condition in PgBackendStatus.st_condition,
> and the stats collector process polls them every second.

Hi Takahiro!

Compiled and works fine here on Ubuntu 8.04 2.6.25.15-bd-mod #1 SMP
PREEMPT Thu Nov 27 10:05:44 BRST 2008 i686 GNU/Linux

dba(at)analise3:/srv/postgresql/HEAD$ ./bin/pgbench -i -s3
dba(at)analise3:/srv/postgresql/HEAD$ ./bin/pgbench -i -s3 -d postgres
transaction type: TPC-B (sort of)
scaling factor: 3
query mode: simple
number of clients: 4
duration: 60 s
number of transactions actually processed: 3730
tps = 62.090946 (including connections establishing)
tps = 62.112183 (excluding connections establishing)
dba(at)analise3:/srv/postgresql/HEAD$ ./bin/psql -c "SELECT * FROM pg_diff_profiles" -d postgres
 profid |     profname     | percent
--------+------------------+---------
     15 | Network:Recv     |   50.45
     16 | Network:Send     |   24.55
     32 | Lock:Transaction |    7.14
      3 | CPU              |    5.80
     20 | XLog:Flush       |    3.13
     31 | Lock:Tuple       |    2.68
      7 | CPU:Execute      |    1.79
      6 | CPU:Plan         |    1.79
     46 | LWLock:WALWrite  |    1.34
     11 | CPU:Commit       |    0.89
     19 | XLog:Write       |    0.45
(11 rows)

Two questions here:

1) How will this behave in a syncrep environment? I don't have one
here to test this yet.
2) I couldn't find a clear way to disable it. Is there one in this patch,
or are you planning it for the future?

Regards,
--
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-10 01:23:12
Message-ID: 20090310094751.9559.52131E4D@oss.ntt.co.jp


"Dickson S. Guedes" <listas(at)guedesoft(dot)net> wrote:

> Compiled and works fine here on Ubuntu 8.04 2.6.25.15-bd-mod #1 SMP
> PREEMPT Thu Nov 27 10:05:44 BRST 2008 i686 GNU/Linux

Thanks for testing. Network (or communication between pgbench and postgres)
seems to be a bottleneck on your machine.

> Two questions here:
>
> 1) How will this behave in a syncrep environment? I don't have one
> here to test this yet.

I think it is related to hot standby, not syncrep.
Profiling is enabled whenever the stats collector process is running.
We already run the collector during warm standby, so profiling would
also be available on log-shipping slaves.

> 2) I couldn't find a clear way to disable it. Is there one in this patch,
> or are you planning it for the future?

Ah, I forgot that sampling should be disabled when track_activities is off.
I'll fix it in the next patch. I should also measure the overhead
of the patch.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-10 01:54:11
Message-ID: 1236650051.6410.71.camel@guedes-laptop

On Tue, 2009-03-10 at 10:23 +0900, ITAGAKI Takahiro wrote:
> Thanks for testing. Network (or communication between pgbench and postgres)
> seems to be a bottleneck on your machine.

Yes, it is a very poor machine that I use for quick tests. I'll test in
other environments tomorrow.

> > Two questions here:
> >
> > 1) How will this behave in a syncrep environment? I don't have one
> > here to test this yet.
>
> I think it is related to hot standby, not syncrep.
> Profiling is enabled whenever the stats collector process is running.
> We already run the collector during warm standby, so profiling would
> also be available on log-shipping slaves.

OK. Thanks.

> > 2) I couldn't find a clear way to disable it. Is there one in this patch,
> > or are you planning it for the future?
>
> Ah, I forgot that sampling should be disabled when track_activities is off.
> I'll fix it in the next patch. I should also measure the overhead
> of the patch.

It would be very nice if I could turn it on and off. When it is done, please
send it to us. I'd like to test it in some stress scenarios, enabling and
disabling it in some environments and comparing with my old benchmarks.

Regards,
--
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-10 01:57:27
Message-ID: 19676.1236650247@sss.pgh.pa.us

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> For resource-based profilers, we have DTrace probes[1] and continue to
> extend them[2], but unfortunately DTrace only works on Solaris and limited
> platforms.

FWIW, the systemtap guys are really, really close to having a working
DTrace equivalent for Linux:
http://gnu.wildebeest.org/diary/2009/02/24/systemtap-09-markers-everywhere/

It's not *quite* there for our purposes
https://bugzilla.redhat.com/show_bug.cgi?id=488941
but I'll be surprised if I'm not dtracing on my Fedora 10 machine before
the week is out.

I'm not at all convinced that we should be putting effort into a
homegrown, partial substitute for DTrace.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-10 08:17:41
Message-ID: 1236673061.31880.339.camel@ebony.2ndQuadrant


On Mon, 2009-03-09 at 21:57 -0400, Tom Lane wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > For resource-based profilers, we have DTrace probes[1] and continue to
> > extend them[2], but unfortunately DTrace only works on Solaris and limited
> > platforms.
>
> FWIW, the systemtap guys are really, really close to having a working
> DTrace equivalent for Linux:
> http://gnu.wildebeest.org/diary/2009/02/24/systemtap-09-markers-everywhere/
>
> It's not *quite* there for our purposes
> https://bugzilla.redhat.com/show_bug.cgi?id=488941
> but I'll be surprised if I'm not dtracing on my Fedora 10 machine before
> the week is out.

After all this time, you think it will be done in a week :-)

> I'm not at all convinced that we should be putting effort into a
> homegrown, partial substitute for DTrace.

I was, but I'm not anymore.

Do you think we will be able to enable this in builds for 8.4?

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-10 12:04:35
Message-ID: 25823.1236686675@sss.pgh.pa.us

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Mon, 2009-03-09 at 21:57 -0400, Tom Lane wrote:
>> I'm not at all convinced that we should be putting effort into a
>> homegrown, partial substitute for DTrace.

> I was, but I'm not anymore.

> Do you think we will be able to enable this in builds for 8.4?

The bugzilla entry I pointed to was asking me to enable it for 8.3.
Which I did. It's certainly got some rough edges today, but I fully
expect it to be usable when Fedora 11 ships.

regards, tom lane


From: Stefan Moeding <pgsql(at)moeding(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-10 19:40:45
Message-ID: 87ocw9dv42.fsf@esprit.moeding.net

Hi!

Tom Lane writes:

> I'm not at all convinced that we should be putting effort into a
> homegrown, partial substitute for DTrace.

In my opinion providing DTrace as the only means of profiling would
exclude a number of users from the tuning benefits. DTrace seems to rely
on specific kernel options on Linux, which you might not be able to
influence if you run your business on leased virtual servers hosted
somewhere. DTrace is also not available for all platforms, most notably
Windows.

DTrace might be a great tool for the developers and should probably be
used. For the rest of the world I see a benefit in having something
like the proposed solution that could be enabled by the database
administrator on every server or maybe even be the default. I think it
would reduce the guesswork on why something might be slow and the work
on 'probable' causes and establish more of a 'tuning by numbers'
attitude.

Looking at the existing probes in HEAD, it seems to be your goal
to provide high-level resource usage patterns to the user, and I agree
that this is the right abstraction layer. With this proposal I see a
way of providing the resource usage in a (database) user-friendly way:
namely as tuples that the user can access in a familiar manner and
without using shell commands on a server that he might not even have
access to. I also see an easy way of keeping historic data by copying
the current state with a timestamp to a different table and then being
able to look at the performance problems of last night, when nobody was
there to notice them and fire up a profiler to watch them.
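
The history idea in the paragraph above could look roughly like this. It is a Python sketch with an in-memory list standing in for the history table; ProfileHistory, save() and between() are hypothetical names, and the counters are assumed to be cumulative sample counts per condition.

```python
from bisect import bisect_right


class ProfileHistory:
    """Hypothetical history table: (timestamp, cumulative counters) rows."""

    def __init__(self):
        self._times = []
        self._snapshots = []

    def save(self, timestamp, counters):
        # Timestamps are assumed to arrive in increasing order.
        self._times.append(timestamp)
        self._snapshots.append(dict(counters))

    def _latest_at(self, timestamp):
        # Last snapshot taken at or before `timestamp` (empty if none).
        index = bisect_right(self._times, timestamp)
        return self._snapshots[index - 1] if index else {}

    def between(self, start, end):
        """Samples per condition accumulated in the window (start, end]."""
        first, last = self._latest_at(start), self._latest_at(end)
        return {name: n - first.get(name, 0)
                for name, n in last.items() if n - first.get(name, 0) > 0}


history = ProfileHistory()
history.save(0, {"CPU": 10})
history.save(3600, {"CPU": 15, "Lock:Transaction": 40})
history.save(7200, {"CPU": 20, "Lock:Transaction": 41})

print(history.between(0, 3600))      # the hour with the lock waits
print(history.between(3600, 7200))   # the following, quieter hour
```

With snapshots saved on a schedule, a problem from last night can be narrowed down to the interval in which the lock waits piled up, without a profiler having been attached at the time.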

Just my 0.02€.

--
Stefan


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-11 01:42:12
Message-ID: 20090311102746.02CE.52131E4D@oss.ntt.co.jp


"Dickson S. Guedes" <listas(at)guedesoft(dot)net> wrote:

> > > 2) I couldn't find a clear way to disable it. Is there one in this patch,
> > > or are you planning it for the future?
> >
> > Ah, I forgot that sampling should be disabled when track_activities is off.
> > I'll fix it in the next patch. I should also measure the overhead
> > of the patch.
>
> It would be very nice if I could turn it on and off. When it is done, please
> send it to us. I'd like to test it in some stress scenarios, enabling and
> disabling it in some environments and comparing with my old benchmarks.

Here is a new version of the patch. I added a new GUC parameter
'profiling_interval' (ms). Profiling is disabled when the value is 0.
The default value is 1 second. You could get more granular results
if you set the value to 100-500ms, but 1 sec should be enough for
continuous regular load (like benchmarks).
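
To give a feel for what the interval means in practice, here is a small Python calculation. It assumes every backend is sampled once per interval; expected_samples() is a hypothetical helper, not part of the patch, and the 60-second, 4-client figures simply mirror the pgbench run shown earlier in the thread.

```python
def expected_samples(duration_ms, profiling_interval_ms, backends=1):
    """Approximate number of samples collected over a run."""
    if profiling_interval_ms <= 0:
        return 0    # profiling_interval = 0 disables profiling
    return (duration_ms // profiling_interval_ms) * backends


run_ms = 60 * 1000   # a 60-second benchmark run
for interval_ms in (0, 1000, 500, 100):
    print(interval_ms, expected_samples(run_ms, interval_ms, backends=4))
```

More samples mean a finer resolution in the percent column, which is why a shorter interval mainly matters for short or bursty workloads.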

I cannot see any difference between runs with profiling on and off,
so I think sampling has little overhead for now.
Please let me know if you see any trouble.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment            Content-Type              Size
profiler_0311.tar.gz  application/octet-stream  14.1 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-24 12:16:55
Message-ID: 49C8CF37.7050507@gmx.net

On March 10, 2009, Tom Lane wrote:
> FWIW, the systemtap guys are really, really close to having a working
> DTrace equivalent for Linux:
> http://gnu.wildebeest.org/diary/2009/02/24/systemtap-09-markers-everywhere/
>
> It's not *quite* there for our purposes
> https://bugzilla.redhat.com/show_bug.cgi?id=488941
> but I'll be surprised if I'm not dtracing on my Fedora 10 machine before
> the week is out.

So how is this going? Is it usable? I assume it's source compatible
with the dtrace support that we already have?


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-03-24 14:02:00
Message-ID: 26552.1237903320@sss.pgh.pa.us

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On March 10, 2009, Tom Lane wrote:
>> FWIW, the systemtap guys are really, really close to having a working
>> DTrace equivalent for Linux:
>> http://gnu.wildebeest.org/diary/2009/02/24/systemtap-09-markers-everywhere/

> So how is this going? Is it usable? I assume it's source compatible
> with the dtrace support that we already have?

Their SCM tip successfully builds our code with --enable-dtrace.
I haven't gotten any further with it than to try the sample script
linked on the page above, but that seemed to work (on a Fedora 10
x86_64 box).

The current 0.9 release does *not* work on our CVS tip (dtrace fails
on more-than-6-argument probes, and there are some other issues),
but you can pull from their git repository:

install elfutils-devel
git clone git://sources.redhat.com/git/systemtap.git
configure --prefix=SOMEWHERE
make all
sudo make install

Then build PG with

PATH=SOMEWHERE/bin:$PATH
configure --with-includes=SOMEWHERE/include --enable-dtrace

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-04-06 00:53:15
Message-ID: 5703.1238979195@sss.pgh.pa.us

I wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> On March 10, 2009, Tom Lane wrote:
>>> FWIW, the systemtap guys are really, really close to having a working
>>> DTrace equivalent for Linux:
>>> http://gnu.wildebeest.org/diary/2009/02/24/systemtap-09-markers-everywhere/

>> So how is this going? Is it usable? I assume it's source compatible
>> with the dtrace support that we already have?

> The current 0.9 release does *not* work on our CVS tip (dtrace fails
> on more-than-6-argument probes, and there are some other issues),
> but you can pull from their git repository:

BTW, systemtap 0.9.5 is now available as part of the standard Fedora 10
package set, so you don't have to install any nonstandard software
anymore. I've checked, and 0.9.5 appears to "just work" with our
CVS HEAD. You need these packages:

$ rpm -qa | grep systemtap
systemtap-sdt-devel-0.9.5-1.fc10.x86_64
systemtap-runtime-0.9.5-1.fc10.x86_64
systemtap-0.9.5-1.fc10.x86_64

Then configure --enable-dtrace, and away you go.

regards, tom lane


From: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling Profler for Postgres
Date: 2009-04-09 10:17:48
Message-ID: 20090409190819.8B3A.52131E4D@oss.ntt.co.jp

Here is an updated version of sampling profiler patch.

Condition IDs can now be discrete numbers and no longer have to be
contiguous. This lets us insert new conditions between
existing numbers if needed in the future.
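
A rough Python illustration of why non-contiguous IDs help: with a keyed lookup instead of a dense array index, a new condition can take a free number between existing ones. The IDs and names below are copied from the sample output earlier in the thread and may not match this version of the patch; register_condition() and "CPU:Hypothetical" are invented for the example.

```python
# Condition IDs with gaps, keyed by ID instead of indexed by position.
CONDITIONS = {
    3: "CPU",
    4: "CPU:Parse",
    5: "CPU:Rewrite",
    6: "CPU:Plan",
    7: "CPU:Execute",
    11: "CPU:Commit",
}


def register_condition(conditions, profid, profname):
    """Add a condition under an unused ID; existing IDs never change."""
    if profid in conditions:
        raise ValueError(f"profid {profid} is already in use")
    conditions[profid] = profname


# A new CPU-related condition fits into the gap between 7 and 11.
register_condition(CONDITIONS, 8, "CPU:Hypothetical")

for profid, profname in sorted(CONDITIONS.items()):
    print(f"{profid:>6} | {profname}")
```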

I think we need more discussion about how to reconcile this patch
with the DTrace probes, but I'll submit it to the next commit-fest
for the record.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment                Content-Type              Size
profiler-20090409.tar.gz  application/octet-stream  19.1 KB