some PITR performance data with DBT-2

From: Mark Wong <markw(at)osdl(dot)org>
To: simon(at)2ndquadrant(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: some PITR performance data with DBT-2
Date: 2004-09-15 16:28:32
Message-ID: 20040915092832.A4016@osdl.org
Lists: pgsql-hackers

Hi Simon,

Sorry it has taken so long. Among other things, I doubled the controllers
and drives on the system I was testing this on. But now I have some data
against PostgreSQL-8.0beta2.

Here is the test run with archiving enabled:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/158/

Here is the test run with archiving disabled:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/159/

Here is sar/iostat/vmstat and oprofile data during the first hour of
recovery. Total recovery time took about 6.5 hours:
http://www.developer.osdl.org/markw/pitr/

The overall throughput difference between the two runs with archiving
enabled/disabled was within 1%.

I ran the test over a duration of 3 hours (including a 2 hour rampup of
the driver), as opposed to the 6 hours you originally requested. I
hope that is ok.

System details, which you may be interested in:

4 x 1.5 GHz Itanium 2
16GB RAM
6 x Compaq Computer Corporation Smart Array 64xx
6 x 14-disk enclosures of 15K RPM drives (split bus)

The database and archive directory were put onto a single LVM volume
across all 84 drives.
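
(In case the exact layout matters: that is just the stock LVM recipe - roughly

    pvcreate /dev/sd??                             # initialize each of the 84 drives
    vgcreate dbt2 /dev/sd?? ...                    # one volume group over all of them
    lvcreate -i 84 -I 64 -L 500G -n pgdata dbt2    # logical volume striped across the PVs

where the device names, size, and stripe settings are placeholders and not
the commands actually used.)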

Let me know if I left anything out.

--
Mark Wong - - markw(at)osdl(dot)org
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436 (fax)
http://developer.osdl.org/markw/


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Mark Wong" <markw(at)osdl(dot)org>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: some PITR performance data with DBT-2
Date: 2004-09-15 20:50:17
Message-ID: NOEFLCFHBPDAFHEIPGBOGEIPCEAA.simon@2ndquadrant.com
Lists: pgsql-hackers

>Mark Wong wrote
> Hi Simon,
>
> Sorry it has taken so long. Among other things, I doubled the controllers
> and drives on the system I was testing this on. But now I have some data
> against PostgreSQL-8.0beta2.
>

Thanks very much.

> Here is the test run with archiving enabled:
> http://www.osdl.org/projects/dbt2dev/results/dev4-010/158/
>
> Here is the test run with archiving disabled:
> http://www.osdl.org/projects/dbt2dev/results/dev4-010/159/
>

> The overall throughput difference between the two runs with archiving
> enabled/disabled was within 1%.
>

Excellent. I hoped it was that low - my target was < 5%.

Stats check out with no weirdness in the results. TGFT.

Also, I notice the tpm figures have gone up some more - have you got new
hardware, or has the PostgreSQL setup been tuned more? Or can it be that
rel8.0 really is that much faster??

> Here is sar/iostat/vmstat and oprofile data during the first hour of
> recovery. Total recovery time took about 6.5 hours:
> http://www.developer.osdl.org/markw/pitr/
>

That's bad news. My own recovery performance estimates would lead me to hope
that it's possible to get the recovery to be quicker than the processes that
wrote the logs, even on a very quick 4 CPU system. I'd be hoping for ~1
hour, or at least <= 4 hours.

> I ran the test over a duration of 3 hours (including a 2 hour rampup of
> the driver), as opposed to the 6 hours you originally requested. I
> hope that is ok.
>
> System details, which you may be interested in:
>
> 4 x 1.5 GHz Itanium 2
> 16GB RAM
> 6 x Compaq Computer Corporation Smart Array 64xx
> 6 x 14-disk enclosures of 15K RPM drives (split bus)
>
> The database and archive directory were put onto a single LVM volume
> across all 84 drives.
>
> Let me know if I left anything out.
>

First off, thank you again.

I've had a look at all the results, but I found a few things:

- couldn't find postgresql.conf or recovery.conf anywhere, so not sure what
OS command you are using
- log files were very large indeed due to the SPI error messages, so I
haven't been able to download those properly for analysis - any chance you
could grep out the SPI stuff, so I can see the archive and restore commands?
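
(Something along the lines of

    gzip -dc server_log.gz | grep -v SPI | gzip > log-sans-spi.txt.gz

would be plenty - the file name and the exact pattern to filter on are only
guesses on my part, so adjust them to whatever the real messages look like.)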

Stats I'd be interested in for analysing recovery performance would be:
- how many log files in total were archived/restored
- where were they archived to
- what was the archive/recovery command?

Best Regards, Simon Riggs


From: Mark Wong <markw(at)osdl(dot)org>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: some PITR performance data with DBT-2
Date: 2004-09-15 22:16:13
Message-ID: 20040915151613.A27518@osdl.org
Lists: pgsql-hackers

On Wed, Sep 15, 2004 at 09:50:17PM +0100, Simon Riggs wrote:
> >Mark Wong wrote
> > Hi Simon,
> >
> > Sorry it has taken so long. Among other things, I doubled the controllers
> > and drives on the system I was testing this on. But now I have some data
> > against PostgreSQL-8.0beta2.
> >
>
> Thanks very much.
>
> > Here is the test run with archiving enabled:
> > http://www.osdl.org/projects/dbt2dev/results/dev4-010/158/
> >
> > Here is the test run with archiving disabled:
> > http://www.osdl.org/projects/dbt2dev/results/dev4-010/159/
> >
>
> > The overall throughput difference between the two runs with archiving
> > enabled/disabled was within 1%.
> >
>
> Excellent. I hoped it was that low - my target was < 5%.
>
> Stats check out with no weirdness in the results. TGFT.
>
> Also, I notice the tpm figures have gone up some more - have you got new
> hardware, or has the PostgreSQL setup been tuned more? Or can it be that
> rel8.0 really is that much faster??

It's actually lower than where I was when I started breaking tables out
onto separate volumes. I suspect you may be looking at data from a
different (and slower) system. Slightly old data from the same system are
here:
http://www.osdl.org/projects/dbt2dev/results/fs-64bit.html

> > Here is sar/iostat/vmstat and oprofile data during the first hour of
> > recovery. Total recovery time took about 6.5 hours:
> > http://www.developer.osdl.org/markw/pitr/
> >
>
> That's bad news. My own recovery performance estimates would lead me to hope
> that it's possible to get the recovery to be quicker than the processes that
> wrote the logs, even on a very quick 4 CPU system. I'd be hoping for ~1
> hour, or at least <= 4 hours.
>
> > I ran the test over a duration of 3 hours (including a 2 hour rampup of
> > the driver), as opposed to the 6 hours you originally requested. I
> > hope that is ok.
> >
> > System details, which you may be interested in:
> >
> > 4 x 1.5 GHz Itanium 2
> > 16GB RAM
> > 6 x Compaq Computer Corporation Smart Array 64xx
> > 6 x 14-disk enclosures of 15K RPM drives (split bus)
> >
> > The database and archive directory were put onto a single LVM volume
> > across all 84 drives.
> >
> > Let me know if I left anything out.
> >
>
> First off, thank you again.
>
> I've had a look at all the results, but I found a few things:
>
> - couldn't find postgresql.conf or recovery.conf anywhere, so not sure what
> OS command you are using

For the postgresql.conf parameters, I added a "database parameters" link to a
"SHOW ALL" command a little late, but it's there now and shows:
archive_command | cp %p /opt/misc/archive/%f

I've already lost the recovery.done file, but I used the command:
restore_command = 'cp /opt/misc/archive/%f %p'
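
Putting those together, the relevant bits looked roughly like this (the
paths are the ones above; the rest is reconstructed from memory since the
recovery.conf itself is gone):

    # postgresql.conf on the test system
    archive_command = 'cp %p /opt/misc/archive/%f'

    # recovery.conf placed in the data directory before restarting for recovery
    restore_command = 'cp /opt/misc/archive/%f %p'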

> - log files were very large indeed due to the SPI error messages, so I
> haven't been able to download those properly for analysis - any chance you
> could grep out the SPI stuff, so I can see the archive and restore commands?

Ok, there should be a log-sans-spi.txt.gz available now.

> Stats I'd be interested in for analysing recovery performance would be:
> - how many log files in total were archived/restored

I did a line count of "archived transaction log file" and got 7604.
Unfortunately I don't have the output for the restore anymore.
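
(The count is just from something like

    grep -c "archived transaction log file" server_log

against the backend log, so treat the log file name as approximate.)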

> - where were they archived to

Into a separate directory on the same volume as the rest of the database.
I'm starting to break things out into separate volumes again.

Mark