Re: DTrace probes broken in HEAD on Solaris?

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-13 14:15:31
Message-ID: 29752.1236953731@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Most of the Solaris buildfarm members are unhappy this morning.
It looks like the ones that are busted have --enable-dtrace,
which points the finger at the dtrace probe changes I made yesterday:
http://archives.postgresql.org/pgsql-committers/2009-03/msg00079.php

Those changes work on Linux and OS X, so it's not clear what I did
wrong. Can anyone with a Solaris machine poke into it, at least
to the extent of finding out where it's dumping core?

regards, tom lane


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-17 13:49:06
Message-ID: 49BFAA52.3000403@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sorry for late answer, I was on vacation last week. I see that you
already fix a problem.

Answer why it happens when probes are disabled is, that for user
application there are piece of code which prepare DTrace probes
arguments which will be passed into kernel DTrace modul. This code has
performance penalty which depends on number of arguments.

See the code:

FlushBuffer+0x69: call +0x128c6 <smgropen>
FlushBuffer+0x6e: addl $0x10,%esp
FlushBuffer+0x71: movl %eax,0xc(%ebp)
FlushBuffer+0x74: subl $0x4,%esp

FlushBuffer+0x77: movl 0xc(%ebp),%eax
FlushBuffer+0x7a: pushl 0x8(%eax)
FlushBuffer+0x7d: pushl 0x4(%eax)
FlushBuffer+0x80: pushl (%eax)
FlushBuffer+0x82: nop
FlushBuffer+0x83: nop
FlushBuffer+0x84: nop
FlushBuffer+0x85: nop
FlushBuffer+0x86: nop
FlushBuffer+0x87: addl $0x10,%esp

nops reserve space for probe.

For better explanation see:

http://blogs.sun.com/ahl/entry/user_land_tracing_gets_better

Zdenek

Dne 13.03.09 15:15, Tom Lane napsal(a):
> Most of the Solaris buildfarm members are unhappy this morning.
> It looks like the ones that are busted have --enable-dtrace,
> which points the finger at the dtrace probe changes I made yesterday:
> http://archives.postgresql.org/pgsql-committers/2009-03/msg00079.php
>
> Those changes work on Linux and OS X, so it's not clear what I did
> wrong. Can anyone with a Solaris machine poke into it, at least
> to the extent of finding out where it's dumping core?
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-17 18:49:16
Message-ID: 24639.1237315756@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> Answer why it happens when probes are disabled is, that for user
> application there are piece of code which prepare DTrace probes
> arguments which will be passed into kernel DTrace modul. This code has
> performance penalty which depends on number of arguments.

More specifically, it seems that DTrace is designed so that it evaluates
all the arguments to a probe macro before it decides whether to actually
take the trap or not.

This seems to me to be a pretty bad/surprising behavior; it's not the
way that our Assert macros work, for example. There's a performance
issue, which would probably only be brutally obvious if you had an
expensive function in the arguments. (Before you say no one would do
that, note the relpath() calls I was complaining about last week.)
But we've been muttering about dropping probes into some extremely
hot hot-spots, like spinlock acquisition, and even a few more
instructions to copy local variables could make a difference there.

The other thing I don't like is that this implementation exposes people
to any bugs that may exist in the probe arguments, *even when they don't
have any tracing turned on*. (Again, we had two different instances of
that last week, so don't bother arguing that it doesn't happen.)

Both of these considerations are strong arguments for not building
production installations with --enable-dtrace, just as we don't
encourage people to build for production with --enable-cassert.

But of course part of the argument for dtrace support is that people
would like to have such probing available in production installations.

What I've found out about this is that for each probe macro, DTrace
also defines a foo_ENABLED() macro that can be used like this:

if (foo_ENABLED())
foo(...);

I think what we should do about these considerations is fix our macro
definitions so that the if(ENABLED()) test is built into the macros.
I'm not sure what this will require ... probably some post-processing
of the probes.h file ... but if we don't do it we're going to keep
getting bit.

Comments?

regards, tom lane


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-18 15:34:10
Message-ID: 49C11472.6050608@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dne 17.03.09 19:49, Tom Lane napsal(a):
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
>> Answer why it happens when probes are disabled is, that for user
>> application there are piece of code which prepare DTrace probes
>> arguments which will be passed into kernel DTrace modul. This code has
>> performance penalty which depends on number of arguments.
>
> More specifically, it seems that DTrace is designed so that it evaluates
> all the arguments to a probe macro before it decides whether to actually
> take the trap or not.

yeah, it is valid for USDT. I guess main reason for this is
implementation is that DTrace does not have idea how compiler generates
code and where he can find arguments. Only what it knows is how to call
a probe and this call is "noped".

> This seems to me to be a pretty bad/surprising behavior; it's not the
> way that our Assert macros work, for example. There's a performance
> issue, which would probably only be brutally obvious if you had an
> expensive function in the arguments. (Before you say no one would do
> that, note the relpath() calls I was complaining about last week.)
> But we've been muttering about dropping probes into some extremely
> hot hot-spots, like spinlock acquisition, and even a few more
> instructions to copy local variables could make a difference there.

yeah, spinlock is problem, but on other side the probes are only in the
branch when spinlock is not free and sleep anyway. And LWLOCK_ACQUIRE
probe has minimal impact if you compare it with rest of LWLockAcquire
function.

> The other thing I don't like is that this implementation exposes people
> to any bugs that may exist in the probe arguments, *even when they don't
> have any tracing turned on*. (Again, we had two different instances of
> that last week, so don't bother arguing that it doesn't happen.)

I don't accept your argument here. Better if it fails every time not
only when you enable the probe. DTrace is designated to be safe on
production machine. It is not good to shut down postgresql server in
production...

> Both of these considerations are strong arguments for not building
> production installations with --enable-dtrace, just as we don't
> encourage people to build for production with --enable-cassert.
>
> But of course part of the argument for dtrace support is that people
> would like to have such probing available in production installations.
>
> What I've found out about this is that for each probe macro, DTrace
> also defines a foo_ENABLED() macro that can be used like this:
>
> if (foo_ENABLED())
> foo(...);
>
> I think what we should do about these considerations is fix our macro
> definitions so that the if(ENABLED()) test is built into the macros.
> I'm not sure what this will require ... probably some post-processing
> of the probes.h file ... but if we don't do it we're going to keep
> getting bit.

I like this idea to used if(foo_ENABLED), but maybe we should used it
only when args evaluation is more complicated?

I'm not sure if it is good idea to modify macros definition.

See how macro definition look like:

#define TRACE_POSTGRESQL_LWLOCK_ACQUIRE(arg0, arg1) \
__dtrace_postgresql___lwlock__acquire(arg0, arg1)
#define TRACE_POSTGRESQL_LWLOCK_ACQUIRE_ENABLED() \
__dtraceenabled_postgresql___lwlock__acquire()

Maybe add something like this at the end of probe.h:

#define TRACE_POSTGRESQL_LWLOCK_ACQUIRE_COND(arg0, arg1) \
if( TRACE_POSTGRESQL_LWLOCK_ACQUIRE_ENABLED() ) \
TRACE_POSTGRESQL_LWLOCK_ACQUIRE(arg0, arg1)

Zdenek


From: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-24 21:31:43
Message-ID: 49C9513F.6050907@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
>
>> Answer why it happens when probes are disabled is, that for user
>> application there are piece of code which prepare DTrace probes
>> arguments which will be passed into kernel DTrace modul. This code has
>> performance penalty which depends on number of arguments.
>
>
> The other thing I don't like is that this implementation exposes people
> to any bugs that may exist in the probe arguments, *even when they don't
> have any tracing turned on*. (Again, we had two different instances of
> that last week, so don't bother arguing that it doesn't happen.)
>

Yes, the above scenario can occur when compiling with DTrace (passing
--enable-dtrace to configure) even when the probes are not enabled. It
won't be an issue if you don't compile with DTrace though as the macros
are converted to no-ops. Hopefully, any bugs in the probe arguments
would be caught during testing ;-)

> Both of these considerations are strong arguments for not building
> production installations with --enable-dtrace, just as we don't
> encourage people to build for production with --enable-cassert.
>
> But of course part of the argument for dtrace support is that people
> would like to have such probing available in production installations.
>
> What I've found out about this is that for each probe macro, DTrace
> also defines a foo_ENABLED() macro that can be used like this:
>
> if (foo_ENABLED())
> foo(...);
>
> I think what we should do about these considerations is fix our macro
> definitions so that the if(ENABLED()) test is built into the macros.
> I'm not sure what this will require ... probably some post-processing
> of the probes.h file ... but if we don't do it we're going to keep
> getting bit.
>
I was contemplating on using the is-enabled test, but when checking the
arguments to all the probes, they were quite simple (except the
relpath() call which is now removed).

I think the is-enabled test will address the issues you encountered. I
see a few alternative to fixing this:

1) Only use if (foo_ENABLED()) test for probes with expensive function
call/computation in arguments. This will keep the code clean, and we can
document this in the "Definine New Probes" section in the online doc.

2) Add the if(foo_ENABLED()) test to all probes manually and make this a
requirement for all future probes. This makes the check explicit and
avoid confusion.

3) Post-process probes.h so if(foo_ENABLED()) test is added to every
probe. We're doing some post-processing now by pre-pending TRACE_ to the
macros with a sed script. Personally, I don't like doing complex
post-processing of output from another tool because the script can break
if for some reason DTrace's output is changed.

I prefer option 1 the most and 3 the least.

-Robert


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Robert Lor <Robert(dot)Lor(at)sun(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-24 21:49:50
Message-ID: 4136ffa0903241449l2dc20420nf4b380490fe928c6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 24, 2009 at 9:31 PM, Robert Lor <Robert(dot)Lor(at)sun(dot)com> wrote:
>
> I think the is-enabled test will address the issues you encountered. I see a
> few alternative to fixing this:

Another option is to impose a policy that all arguments to probes must
be simple local variables -- no expressions.

I'm starting to think that would be the better option and more in tune
with the dtrace way. Introducing the _ENABLED check defeats the whole
performance aim of dtrace that when it's disabled it introduces no
overhead. But using the _ENABLED macro only sparsely risks missing
some expression somewhere which looks innocuous but isn't.

I wonder if there's a gcc extension that would let us check if a macro
parameter is a simple variable or expression. That would let us
enforce the polilcy strictly at build-time.

--
greg


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Robert Lor <Robert(dot)Lor(at)sun(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-24 22:12:09
Message-ID: 20090324221209.GB24546@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 24, 2009 at 09:49:50PM +0000, Greg Stark wrote:
> I wonder if there's a gcc extension that would let us check if a macro
> parameter is a simple variable or expression. That would let us
> enforce the polilcy strictly at build-time.

Not really a GCC extension, but you could have the macro check that it
can take the address of the arguments, which amounts to almost the same
thing. It only doesn't work on constants.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Robert Lor <Robert(dot)Lor(at)sun(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-24 22:18:08
Message-ID: 4136ffa0903241518h9a2a442qfaaa7b8741a0fa65@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 24, 2009 at 10:12 PM, Martijn van Oosterhout
<kleptog(at)svana(dot)org> wrote:
> Not really a GCC extension, but you could have the macro check that it
> can take the address of the arguments, which amounts to almost the same
> thing. It only doesn't work on constants.

No, there are all kinds of complex expressions which are lvalues --
anything ending in a pointer dereference for example, which is
precisely the most likely kind of expression to introduce seg fault if
something is unexpectedly null
.

--
greg


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Robert Lor <Robert(dot)Lor(at)sun(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-24 22:53:18
Message-ID: 20090324225317.GC24546@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 24, 2009 at 10:18:08PM +0000, Greg Stark wrote:
> On Tue, Mar 24, 2009 at 10:12 PM, Martijn van Oosterhout
> <kleptog(at)svana(dot)org> wrote:
> > Not really a GCC extension, but you could have the macro check that it
> > can take the address of the arguments, which amounts to almost the same
> > thing. It only doesn't work on constants.
>
> No, there are all kinds of complex expressions which are lvalues --
> anything ending in a pointer dereference for example, which is
> precisely the most likely kind of expression to introduce seg fault if
> something is unexpectedly null

Sorry, I meant to say that the compiler could determine the address at
compile time, something like:

__builtin_constant_p( !&(__x)) )

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.


From: Greg Stark <stark(at)enterprisedb(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Robert Lor <Robert(dot)Lor(at)sun(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-24 23:09:51
Message-ID: 4136ffa0903241609w2324ae33x4a694426fb605a30@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Mar 24, 2009 at 10:53 PM, Martijn van Oosterhout
<kleptog(at)svana(dot)org> wrote:
> Sorry, I meant to say that the compiler could determine the address at
> compile time, something like:
>
> __builtin_constant_p( !&(__x)) )

Hm, that's a better idea.

How about something like

(__builtin_constant_p(__x) || __builtin_constant_p(!&(__x)))

This would still pass expressions like foo[x] even if x might be
out-of-bounds or foo might be pointed to a freed object or something
like that though.

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Robert Lor <Robert(dot)Lor(at)sun(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-25 00:55:31
Message-ID: 13769.1237942531@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> writes:
> On Tue, Mar 24, 2009 at 9:31 PM, Robert Lor <Robert(dot)Lor(at)sun(dot)com> wrote:
>> I think the is-enabled test will address the issues you encountered. I see a
>> few alternative to fixing this:

> Another option is to impose a policy that all arguments to probes must
> be simple local variables -- no expressions.

IOW, if you need to trace on an expression, you have to calculate it
whether or not ENABLE_DTRACE is even defined? This doesn't seem to
me that it solves anything. The cases that are interesting are where a
probe needs a value that otherwise wouldn't be needed.

regards, tom lane


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-25 12:54:05
Message-ID: 49CA296D.20102@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dne 24.03.09 22:31, Robert Lor napsal(a):

> I think the is-enabled test will address the issues you encountered. I
> see a few alternative to fixing this:
>
> 1) Only use if (foo_ENABLED()) test for probes with expensive function
> call/computation in arguments. This will keep the code clean, and we can
> document this in the "Definine New Probes" section in the online doc.
>
> 2) Add the if(foo_ENABLED()) test to all probes manually and make this a
> requirement for all future probes. This makes the check explicit and
> avoid confusion.
>
> 3) Post-process probes.h so if(foo_ENABLED()) test is added to every
> probe. We're doing some post-processing now by pre-pending TRACE_ to the
> macros with a sed script. Personally, I don't like doing complex
> post-processing of output from another tool because the script can break
> if for some reason DTrace's output is changed.
>
> I prefer option 1 the most and 3 the least.

I prefer also option 1. In many cases if(foo_ENABLED) has same or bigger
performance penalty like disabled probe itself.

Zdenek


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Lor <Robert(dot)Lor(at)Sun(dot)COM>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: DTrace probes broken in HEAD on Solaris?
Date: 2009-03-28 00:14:48
Message-ID: 6913.1238199288@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Lor <Robert(dot)Lor(at)Sun(dot)COM> writes:
> Tom Lane wrote:
>> [ complaining about disabled probes not being no-ops ]

> 1) Only use if (foo_ENABLED()) test for probes with expensive function
> call/computation in arguments. This will keep the code clean, and we can
> document this in the "Definine New Probes" section in the online doc.
> ...
> I prefer option 1 the most and 3 the least.

I got the same advice from the systemtap people, so we'll do it that
way.

regards, tom lane