From: Sam Mason <sam(at)samason(dot)me(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review: Revise parallel pg_restore's scheduling heuristic
Date: 2009-08-07 15:33:07
Message-ID: 20090807153307.GI5407@samason.me.uk
Lists: pgsql-hackers

On Fri, Aug 07, 2009 at 10:19:20AM -0500, Kevin Grittner wrote:
> Sam Mason <sam(at)samason(dot)me(dot)uk> wrote:
>
> > What do people do when testing this? I think I'd look to something
> > like Student's t-test to check for statistical significance. My
> > working would go something like:
> >
> > I assume the variance is the same because it's being tested on the
> > same machine.
> >
> > samples = 20
> > stddev = 144.26
> > avg1 = 4783.13
> > avg2 = 4758.46
> > t = 0.54 ((avg1 - avg2) / (stddev * sqrt(2/samples)))
> >
> > We then have to choose how certain we want to be that they're
> > actually different, 90% is a reasonably easy level to hit (i.e. one
> > part in ten, with 95% being more commonly quoted). For 20 samples
> > we have 19 degrees of freedom--giving us a cut-off[1] of 1.328.
> > 0.54 is obviously well below this allowing us to say that there's no
> > "statistical significance" between the two samples at a 90% level.
>
> Thanks for the link; that looks useful. To confirm that I understand
> what this has established (or get a bit of help putting it in
> perspective), what this says to me, in the least technical jargon I
> can muster, is "With this many samples and this degree of standard
> deviation, the average difference is not large enough to have a 90%
> confidence level that the difference is significant." In fact,
> looking at the chart, it isn't enough to reach a 75% confidence level
> that the difference is significant. Significance here would seem to
> mean that at least the given percentage of the time, picking this many
> samples from an infinite set with an average difference that really
> was this big or bigger would generate a value for t this big or
> bigger.
>
> Am I close?

Yes, all that sounds as though you've got it. Note that running the
test more times will shrink the sqrt(2/n) factor (the standard error
of the difference), so the same gap could become significant with more
samples. In this case it's unlikely to change the conclusion much,
though.
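For reference, the arithmetic quoted above fits in a few lines of
Python. The numbers are taken directly from the thread, and the 90%
cutoff of 1.328 comes from a t-table as linked earlier; this is only a
sketch of the equal-variance, equal-sample-size calculation, not a
general t-test implementation:

```python
import math

# Figures quoted in the thread.
samples = 20        # runs per version
stddev = 144.26     # standard deviation, assumed equal for both versions
avg1 = 4783.13      # mean time, version 1
avg2 = 4758.46      # mean time, version 2

# Two-sample t statistic with equal variances and equal sample sizes.
t = (avg1 - avg2) / (stddev * math.sqrt(2.0 / samples))

cutoff_90 = 1.328   # from a t-table, as quoted in the thread
print("t = %.2f, significant at 90%%: %s" % (t, t > cutoff_90))
```

This prints t = 0.54, well below the cutoff, matching the conclusion
above that the difference isn't statistically significant at the 90%
level.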

> I like to be clear, because it's easy to get confused and take the
> above to mean that there's a 90% confidence that there is no actual
> significant difference in performance based on that sampling. (Given
> Tom's assurance that this version of the patch should have similar
> performance to the last, and the samples from the prior patch went the
> other direction, I'm convinced there is not a significant difference,
> but if I'm going to use the referenced calculations, I want to be
> clear how to interpret the results.)

All we're saying is that we're less than 90% confident that there's
something "significant" going on. All the fiddling with standard
deviations and sample sizes is just the easiest way (that I know of)
that statistics gives us to determine this more formally than a
hand-wavy "it looks OK to me". Science tells us that humans are liable
to say things are OK when they're not, as well as vice versa; statistics
gives us a way to work past these limitations in some common and useful
situations.

--
Sam http://samason.me.uk/
