Re: XLogInsert

Lists: pgsql-hackers
From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: XLogInsert
Date: 2009-08-19 16:14:08
Message-ID: f67928030908190914g60397727s5a685636ac73a3f7@mail.gmail.com

In XLogInsert (src/backend/access/transam/xlog.c), the part that adds back-up
blocks into the rdata chain is described:

/*
 * Make additional rdata chain entries for the backup blocks, so that we
 * don't need to special-case them in the write loop.  Note that we have
 * now irrevocably changed the input rdata chain.
 */
If I read the code correctly, the only thing that is irrevocable is that it
writes into rdt->next, and if it saved an old copy of rdt first, then it
could revoke the changes just by doing rdt_old->next = NULL. If that were
done, then I think this code could be moved out of the section holding the
WALInsertLock. It could probably be folded into the part of the code that
computes the CRC.
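For illustration, the save-and-revoke idea can be sketched with a toy chain
(RdData and the helper functions here are hypothetical stand-ins, not the
actual XLogRecData code in xlog.c):

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for XLogRecData: a singly linked chain of record fragments. */
typedef struct RdData
{
    int            len;
    struct RdData *next;
} RdData;

/*
 * Append backup-block entries by writing into the old tail's next pointer.
 * This is the "irrevocable" write -- unless we remember the old tail.
 */
static RdData *
append_backup_blocks(RdData *tail, RdData *bkp)
{
    tail->next = bkp;
    return tail;                /* caller keeps this to allow a revoke */
}

/* Revoke the change: restore the saved tail's next pointer. */
static void
revoke(RdData *saved_tail)
{
    saved_tail->next = NULL;
}
```

With that saved tail pointer, the chain can be restored to its original
shape before retrying, which is what would let the work move outside the
lock.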

I don't think this would be a big performance win, as that part of the code
is pretty fast, but every bit helps in a highly contended lock, and I think
the code would be simpler as well. Do you think it would be worth me giving
this a try?

Cheers,

Jeff


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-08-19 16:49:07
Message-ID: 24638.1250700547@sss.pgh.pa.us

Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> If I read the code correctly, the only thing that is irrevocable is
> that it writes into
> rdt->next, and if it saved an old copy of rdt first, then it could
> revoke the changes just
> by doing rdt_old->next=NULL. If that were done, then I think this
> code could be
> moved out of the section holding the WALInsertLock.

Hmm, I recall that the changes are ... or were ... more complex.
The tricky case I think is where we have to go back and redo the
block-backup decisions after discovering that the checkpoint REDO
pointer has just moved.

If you can get the work out of the WALInsertLock section for just a
few more instructions, it would definitely be worth doing.

regards, tom lane


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-09-14 03:42:11
Message-ID: f67928030909132042v432f7194le9e0eab2d1156b21@mail.gmail.com

On Wed, Aug 19, 2009 at 9:49 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> > If I read the code correctly, the only thing that is irrevocable is
> > that it writes into
> > rdt->next, and if it saved an old copy of rdt first, then it could
> > revoke the changes just
> > by doing rdt_old->next=NULL. If that were done, then I think this
> > code could be
> > moved out of the section holding the WALInsertLock.
>
> Hmm, I recall that the changes are ... or were ... more complex.
> The tricky case I think is where we have to go back and redo the
> block-backup decisions after discovering that the checkpoint REDO
> pointer has just moved.
>
> If you can get the work out of the WALInsertLock section for just a
> few more instructions, it would definitely be worth doing.
>

I've attached a patch which removes the iteration over the blocks to be
backed-up from the critical section of XLogInsert. Now those blocks are
only looped over in one piece of code which both computes the CRC and builds
the linked list, rather than having parallel loops.
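A minimal sketch of the folded loop, assuming toy structures and a toy
rolling checksum in place of the real XLogRecData and COMP_CRC32:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Rd
{
    const uint8_t *data;
    int            len;
    struct Rd     *next;
} Rd;

/* Toy rolling checksum standing in for COMP_CRC32. */
static uint32_t
mix(uint32_t crc, const uint8_t *data, int len)
{
    for (int i = 0; i < len; i++)
        crc = crc * 31 + data[i];
    return crc;
}

/*
 * One pass over the main chain plus the backup blocks: link each backup
 * entry onto the tail *and* fold it into the checksum, instead of doing
 * those two things in separate, parallel loops.
 */
static uint32_t
link_and_checksum(Rd *chain, Rd *bkp, int nbkp)
{
    uint32_t crc = 0;
    Rd      *tail = chain;

    for (Rd *r = chain; r != NULL; r = r->next)
    {
        crc = mix(crc, r->data, r->len);
        tail = r;
    }
    for (int i = 0; i < nbkp; i++)
    {
        tail->next = &bkp[i];
        tail = &bkp[i];
        crc = mix(crc, bkp[i].data, bkp[i].len);
    }
    return crc;
}
```

The single pass yields the same checksum a second full traversal of the
completed chain would, which is what makes the parallel loop redundant.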

I've used an elog statement (not shown in the patch) to demonstrate that the
"goto begin;" after detecting the REDO race actually does get executed under
a standard workload (pgbench -c10). Two to four out of ten backends execute
that code path for each checkpoint on my single-CPU machine. By doing a
kill -9 on a process to simulate a crash, during the period after the goto
begin is exercised but before the precipitating checkpoint completes, I can
force it to use the written WAL records in recovery. The database
automatically recovers and the results are self-consistent.

I cannot imagine any other races, rare events, or action at a distance that
could come into play with this code change, so I cannot think of anything
else to test at the moment.

I could not detect a speed difference with pgbench, but as I cannot get
pgbench to be XLogInsert-bound, that is not surprising. Using the only
XLogInsert-bound test case I know of, parallel COPY into a skinny, unindexed
table, using 8 parallel copies on a 4 x dual-core x86_64 and with fsync
turned off (to approximately simulate an SSD, which I do not have), I get a
speed improvement of 2-4% with the patch over unpatched head. Maybe with
more CPUs the benefit would be greater.

That small improvement is probably not very attractive; however, I think the
patch makes the overall code a bit cleaner, so it may be warranted on that
ground. Indeed, my motivation for working on this is that I kept beating my
head against the complexity of the old code, and thought that simplifying it
would make future work easier.

Cheers,

Jeff

Attachment Content-Type Size
XLogInsert_rdt2_v1.patch text/x-diff 6.0 KB

From: Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>
To: "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Patch LWlocks instrumentation
Date: 2009-09-14 10:22:22
Message-ID: op.uz8sfkxycke6l8@soyouz


A little bit of a reply to Jeff's email about WALInsertLock.

This patch instruments LWLocks; it is controlled with the following
#defines in lwlock.c:

LWLOCK_STATS
LWLOCK_TIMING_STATS

It is an upgrade of the current lwlock stats.
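As a rough illustration of how #define-gated instrumentation like this
works (toy counters and a toy lock only; the real patch lives in lwlock.c):

```c
#include <assert.h>

#define LWLOCK_STATS            /* comment out to compile the counters away */

#ifdef LWLOCK_STATS
static long ex_acquires = 0;    /* exclusive acquisitions */
static long ex_waits = 0;       /* acquisitions that found the lock busy */
#endif

static int lock_busy = 0;       /* toy lock state, no real concurrency */

static void
lw_lock_acquire(void)
{
#ifdef LWLOCK_STATS
    ex_acquires++;
    if (lock_busy)
        ex_waits++;             /* in the real patch, a wait would be timed */
#endif
    lock_busy = 1;
}

static void
lw_lock_release(void)
{
    lock_busy = 0;
}
```

With LWLOCK_STATS undefined, the counters and their updates vanish at
compile time, so the production fast path pays nothing.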

When active, at backend exit, it will display stats as shown below (here,
we have a parallel COPY with 4 concurrent processes into the same table,
on a 4 core machine).
If the (rather wide) sample output is mangled in your mail client, I've
attached it as a separate text file.

-------- Lock stats for PID 22403
PID    Lock  ShAcq  ShWait  ShWaitT  ShHeldT  ExAcq    ExWait  ExWaitT           ExHeldT          Name
22403  7     0      0       0.00     0.00     2500002  730338  24.02 ( 53.49 %)  7.25 ( 16.14 %)  WALInsert
22403  8     0      0       0.00     0.00     19501    73       3.48 (  7.75 %)  0.40 (  0.88 %)  WALWrite

-------- Lock stats for PID 22404
PID    Lock  ShAcq  ShWait  ShWaitT  ShHeldT  ExAcq    ExWait  ExWaitT           ExHeldT          Name
22404  7     0      0       0.00     0.00     2500002  724683  23.34 ( 51.59 %)  8.24 ( 18.20 %)  WALInsert
22404  8     0      0       0.00     0.00     19418    90       4.37 (  9.67 %)  0.44 (  0.97 %)  WALWrite

-------- Lock stats for PID 22402
PID    Lock  ShAcq  ShWait  ShWaitT  ShHeldT  ExAcq    ExWait  ExWaitT           ExHeldT          Name
22402  7     0      0       0.00     0.00     2500002  735958  24.06 ( 52.73 %)  8.05 ( 17.63 %)  WALInsert
22402  8     0      0       0.00     0.00     19154    97       4.21 (  9.22 %)  0.39 (  0.85 %)  WALWrite

-------- Lock stats for PID 22400
PID    Lock  ShAcq  ShWait  ShWaitT  ShHeldT  ExAcq    ExWait  ExWaitT           ExHeldT          Name
22400  7     0      0       0.00     0.00     2500002  736265  25.50 ( 55.59 %)  6.74 ( 14.70 %)  WALInsert
22400  8     0      0       0.00     0.00     19391    66       2.95 (  6.42 %)  0.39 (  0.85 %)  WALWrite

Here we see that PID 22400 spent:

25.50 s waiting to get exclusive on WALInsert
6.74 s while holding exclusive on WALInsert

The percentages represent the fraction of time relative to the backend
process' lifetime. Here, I've exited the processes right after committing
the transactions, but if you use psql and want accurate percentages, you'll
need to exit quickly after the query you want to benchmark.

Here, for example, backends spend more than 50% of their time waiting on
WALInsert...

Attachment Content-Type Size
lwlock_instrumentation.patch application/octet-stream 10.8 KB
lwlock_instrumentation.sample application/octet-stream 1.8 KB

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch LWlocks instrumentation
Date: 2009-09-14 15:08:27
Message-ID: f67928030909140808n13e6733cxf00a8d24916f0186@mail.gmail.com

2009/9/14 Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>

>
> A little bit of a reply to Jeff's email about WALInsertLock.
>
> This patch instruments LWLocks, it is controlled with the following
> #define's in lwlock.c :
>
> LWLOCK_STATS
> LWLOCK_TIMING_STATS
>
> It is an upgrade of current lwlocks stats.
>

Hi Pierre,

Have you looked at the total execution time with and without the
LWLOCK_TIMING_STATS?

I've implemented something similar to this myself (only without attempting
to make it portable and otherwise worthy of submitting as a general-interest
patch). What I found is that attempting to time every "hold" substantially
increased the overall run time (which I worry distorts the reported times;
cue bad Heisenberg analogies). The problem is that gettimeofday is slow, and
on some multi-processor systems it is a global point of serialization,
making it even slower. I decided to time only the time spent waiting on a
block, and not the time spent holding the lock. This way you only call
gettimeofday twice if you actually need to block, and not at all if you
immediately get the lock. This had a much smaller effect on runtime, and the
info produced was sufficient for my purposes.

Not that this changes your conclusion. With or without that distortion, I
completely believe that WALInsertLock is the bottleneck for parallel bulk
copy into unindexed tables. I just can't find anything else it is the
primary bottleneck on. I think the only real solution for bulk copy is to
call XLogInsert less often. For example, it could build blocks in local
memory, then when done copy them into the shared buffers and toss each
entire block into WAL in one call. Easier said than implemented, of course.
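The batching idea could be sketched like this (toy names throughout; this is
not the actual COPY code path, just a count of how often the lock would be
taken):

```c
#include <assert.h>

#define TUPLES_PER_PAGE 10

static int wal_calls = 0;

/* Stand-in for XLogInsert: in reality each call takes the WALInsertLock. */
static void
wal_insert(int nbytes)
{
    (void) nbytes;
    wal_calls++;
}

/* Per-tuple logging: one WAL call for every tuple. */
static void
copy_per_tuple(int ntuples, int tuple_size)
{
    for (int i = 0; i < ntuples; i++)
        wal_insert(tuple_size);
}

/*
 * Page-at-a-time logging: fill a local page, then emit the whole page
 * as a single WAL record, so the lock is taken once per page.
 */
static void
copy_per_page(int ntuples, int tuple_size)
{
    int filled = 0;

    for (int i = 0; i < ntuples; i++)
    {
        filled++;
        if (filled == TUPLES_PER_PAGE)
        {
            wal_insert(filled * tuple_size);
            filled = 0;
        }
    }
    if (filled > 0)
        wal_insert(filled * tuple_size);
}
```

The per-page variant contends for the insert lock an order of magnitude
less often for the same data volume, which is the whole point of the idea.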

Cheers,

Jeff


From: Pierre Frédéric Caillaud <lists(at)peufeu(dot)com>
To: "Jeff Janes" <jeff(dot)janes(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch LWlocks instrumentation
Date: 2009-09-14 19:17:01
Message-ID: op.uz9g6najcke6l8@soyouz


> Have you looked at the total execution time with and without the
> LWLOCK_TIMING_STATS?

It didn't show any significant overhead on the little COPY test I made.
On selects it probably does (just like EXPLAIN ANALYZE), but I didn't test.
It is not meant to be always active; since it's a #define, I guess that
would be OK though.

I'm going to modify it according to your suggestions and repost it (why
didn't I do that first?...)

> Not that this changes your conclusion. With or without that distortion I
> completely believe that WALInsertLock is the bottleneck of parallel bulk
> copy into unindexed tables. I just can't find anything else it is a
> primary
> bottleneck on. I think the only real solution for bulk copy is to call
> XLogInsert less often. For example, it could build blocks in local
> memory,
> then when done copy it into the shared buffers and then toss the entire
> block into WAL in one call. Easier said than implemented, of course.

Actually,

http://archives.postgresql.org/pgsql-hackers/2009-09/msg00806.php


From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-12-09 07:04:16
Message-ID: 3073cc9b0912082304p62bf94f1i95d03ac2c1741932@mail.gmail.com

On Sun, Sep 13, 2009 at 10:42 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>
> I've attached a patch which removes the iteration over the blocks to be
> backed-up from the critical section of XLogInsert.  Now those blocks are
> only looped over in one piece of code which both computes the CRC and builds
> the linked list, rather than having parallel loops.
>

Ok, I started to review this one...

What the patch is doing seems very obvious and doesn't break the original
logic, AFAIUI (I could be wrong; this seems like Chinese to me ;)

I made some tests archiving WAL and recovering, crashing the server and
restarting... every test was done while running the regression tests.
I don't know what else to do; ideas?

I haven't made any performance tests, but it should gain something :) --
maybe someone can run those tests?

If not, I will mark the patch as ready for committer some time in the
next hours...

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-12-09 14:39:45
Message-ID: 1524.1260369585@sss.pgh.pa.us

Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec> writes:
> i haven't made any performance tests but it should gain something :),
> maybe someone can make those tests?

The argument for changing this at all is only that it will improve
performance, so I'd like some independent confirmation that it does.

regards, tom lane


From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-12-09 18:18:01
Message-ID: 4B1FE9D9.6000106@2ndquadrant.com

Tom Lane wrote:
> Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec> writes:
>
>> i haven't made any performance tests but it should gain something :),
>> maybe someone can make those tests?
>>
>
> The argument for changing this at all is only that it will improve
> performance, so I'd like some independent confirmation that it does.
I've done a little review of this myself, and I'm not quite happy with how this patch was delivered to us. The bar for committing something that touches the WAL is really high--it needs to be an unquestionable win to touch that code. The justification of "the patch makes the overall code a bit cleaner" is a hard sell on something that's hard to debug (triggering bad WAL situations at will isn't easy) and critical to the system. If there's a clear performance improvement, that helps justify why it's worth working on. Here's the original performance justification:

"Using the only XLogInsert-bound test case I know of, parallel COPY into a skinny, unindexed table, using 8 parallel copies on a 4 x dual-core x86_64 and with fsync turned off (to approxiamately simulate SSD, which I do not have), I get a speed improvement of 2-4% with the patch over unpatched head."

That makes sense, and using this spec I could probably come up with the test program to reproduce this. But I'm getting tired of doing that. It's hard enough to reproduce performance changes when someone gives the exact configuration and test program they used. If we're working with a verbal spec for how to reproduce the issues, forget about it--that's more than we can expect a reviewer to handle, and the odds of that whole thing ending well are low.

Jeff: before we do anything else with your patch, I'd like to see a script of some sort that runs the test you describe above, everything changed in the postgresql.conf from the defaults, and the resulting raw numbers that come from it on your system that prove an improvement there--not just a summary of the change. That's really mandatory for a performance patch. If any reviewer who's interested can't just run something and get a report suggesting whether the patch helped or harmed results in five minutes, unless we really, really want your patch it's just going to stall at that point. And unfortunately, in the case of something that touches the WAL path, we really don't want to change anything unless there's a quite good reason to do so.

I've also realized that "Patch LWlocks instrumentation" at http://archives.postgresql.org/message-id/op.uz8sfkxycke6l8@soyouz should have been evaluated as its own patch altogether. I think that the test program you're suggesting also proves its utility though, so for now I'll keep them roped together.

Sorry this ended up so late in this CommitFest, just a series of unexpected stuff rippled down to you. On the bright side, had you submitted this before the whole organized CF process started, you could have waited months longer to get the same feedback.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-12-10 20:48:19
Message-ID: 3073cc9b0912101248o554bf638pfaec199278b96987@mail.gmail.com

On Wed, Dec 9, 2009 at 9:39 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> so I'd like some independent confirmation that it does.
>

What kind of tests could show that? Or would simply running pgbench several
times, for 15 minutes each run, show any benefit?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-12-10 20:58:55
Message-ID: 20091210205855.GP15554@alvh.no-ip.org

Jaime Casanova escribió:
> On Wed, Dec 9, 2009 at 9:39 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > so I'd like some independent confirmation that it does.
> >
>
> what kind of tests could show that? or simply running pgbench several
> times for 15 minutes each run could show any benefit?

Isn't the supposed performance improvement in this patch linked strongly
to backup blocks? If that's really the case, I think it would show more
prominently if you had very frequent checkpoints.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-12-13 07:18:02
Message-ID: 3073cc9b0912122318v2a7edc47se0378e7335daaf5f@mail.gmail.com

On Thu, Dec 10, 2009 at 3:58 PM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Jaime Casanova escribió:
>> On Wed, Dec 9, 2009 at 9:39 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> > so I'd like some independent confirmation that it does.
>> >
>>
>> what kind of tests could show that? or simply running pgbench several
>> times for 15 minutes each run could show any benefit?
>
> Isn't the supposed performance improvement in this patch linked strongly
> to backup blocks?  If that's really the case, I think it would show more
> prominently if you had very frequent checkpoints.
>

Ah! Ok, I was only following the logic that it was eliminating the need to
execute a loop twice...
But you are right: while the loop always executes, it only does something
meaningful after a checkpoint, and the for statement only makes 3
iterations each time, because XLR_MAX_BKP_BLOCKS is defined as 3 in
src/include/access/xlog.h.

Looked at that way, it seems like the benefit could be only marginal.

To prove that, I compiled with and without the patch, and set
checkpoint_segments = 1 and checkpoint_timeout = 1min to force
frequent checkpoints (actually they occurred a few seconds apart).

I initialized the pgbench database in each installation with:
pgbench -i -s 200 -F 90 test

and executed 6 runs with:
pgbench -n -c 50 -j 5 -l -T 900 test

Results are:

           Unpatched (incl. conn. est.)  Unpatched (excl.)  Patched (incl. conn. est.)  Patched (excl.)
Min (tps)  133.046069                    133.085274         139.567284                  139.591229
Max (tps)  147.082181                    147.108752         151.276416                  151.311745
Avg (tps)  140.750998                    140.790336         146.383735                  146.411039

So in this extreme case the average tps is just about 6 transactions better

PS: I'm attaching the files I used for the tests

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157

Attachment Content-Type Size
test_patched.sh application/x-sh 647 bytes
test_unpatched.sh application/x-sh 647 bytes

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: XLogInsert
Date: 2009-12-15 06:16:00
Message-ID: 4B2729A0.2090807@2ndquadrant.com

Jaime Casanova wrote:
> So in this extreme case avg tps is just 6 transactions better
>
Great job trying to find the spot where the code worked better. I'm not
so sure I trust pgbench results where the TPS was so low though. Which
leads us right back to exactly how Jeff measured his original results.

As I said already, I think we need more insight into Jeff's performance
report, a way to replicate that test, to look a bit at the latency as
reported by the updated LWLock patch that Pierre submitted. Tweaking
your test to give more useful results is a nice second opinion on top of
that. But we're out of time for now, so this patch is getting returned
with feedback. I encourage Jeff to resubmit the same patch or a better
one with a little more data on performance measurements to our final 8.5
CommitFest, in hopes we can confirm this is an improvement worth committing.

--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.com


From: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: XLogInsert
Date: 2009-12-16 19:07:07
Message-ID: 65937bea0912161107s2359e15enaca65cf980bc5d69@mail.gmail.com

2009/12/15 Greg Smith <greg(at)2ndquadrant(dot)com>

> Jaime Casanova wrote:
>
>> So in this extreme case avg tps is just 6 transactions better
>>
>>
> Great job trying to find the spot where the code worked better. I'm not so
> sure I trust pgbench results where the TPS was so low though. Which leads
> us right back to exactly how Jeff measured his original results.
>
> As I said already, I think we need more insight into Jeff's performance
> report, a way to replicate that test, to look a bit at the latency as
> reported by the updated LWLock patch that Pierre submitted. Tweaking your
> test to give more useful results is a nice second opinion on top of that.
> But we're out of time for now, so this patch is getting returned with
> feedback. I encourage Jeff to resubmit the same patch or a better one with
> a little more data on performance measurements to our final 8.5 CommitFest
> in hopes we can confirm this an improvement worth committing.
>
>
>
Last week I worked on a FUSE-based filesystem, which I call BlackholeFS.
It's similar to /dev/null, but for directories: it simply returns success
for all writes, but doesn't actually write anything to the files under it.

Would moving the pg_xlog/ (and possibly table data too) to such a filesystem
exercise this patch better?

Best regards,
--
Lets call it Postgres

EnterpriseDB http://www.enterprisedb.com

gurjeet[(dot)singh](at)EnterpriseDB(dot)com

singh(dot)gurjeet(at){ gmail | hotmail | indiatimes | yahoo }.com
Twitter: singh_gurjeet
Skype: singh_gurjeet

Mail sent from my BlackLaptop device


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: XLogInsert
Date: 2009-12-16 19:14:21
Message-ID: 200912162014.22461.andres@anarazel.de

On Wednesday 16 December 2009 20:07:07 Gurjeet Singh wrote:
> 2009/12/15 Greg Smith <greg(at)2ndquadrant(dot)com>
>
> > Jaime Casanova wrote:
> >> So in this extreme case avg tps is just 6 transactions better
> >
> > Great job trying to find the spot where the code worked better. I'm not
> > so sure I trust pgbench results where the TPS was so low though. Which
> > leads us right back to exactly how Jeff measured his original results.
> >
> > As I said already, I think we need more insight into Jeff's performance
> > report, a way to replicate that test, to look a bit at the latency as
> > reported by the updated LWLock patch that Pierre submitted. Tweaking
> > your test to give more useful results is a nice second opinion on top of
> > that. But we're out of time for now, so this patch is getting returned
> > with feedback. I encourage Jeff to resubmit the same patch or a better
> > one with a little more data on performance measurements to our final 8.5
> > CommitFest in hopes we can confirm this an improvement worth committing.
>
> Last week I worked on a FUSE based filesystem, which I call BlackholeFS.
> Its similar to /dev/null, but for directories. Basically it simply returns
> success for all the writes, but doesn't do any writes on the files under
> it.
I doubt that it will be faster than a tmpfs - the additional context
switches et al. will probably already hurt.
If you constrain checkpoint_segments to something sensible, it shouldn't
use too much memory.

Andres