Re: determining random_page_cost value

Lists: pgsql-hackers
From: Yohanes Santoso <pgsql-hackers(at)microjet(dot)ath(dot)cx>
To: pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>
Subject: determining random_page_cost value
Date: 2005-10-25 16:29:21
Message-ID: 87irvlef1a.fsf@microjet.ath.cx
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

[To admin: this message was posted earlier via google group. needless
to say, it was stalled waiting for approval, please ignore that
one. Thanks.]

Hi,

Yesterday in #pgsql, I was talking with neilc about determining rpc
value in a more concrete way. So I created a program that compares
exhaustive (all blocks are eventually read) random reads with
sequential reads. The full source is attached.

I tested the db files residing on a software RAID-1 composed of 2 IDE
7200rpm drives on linux 2.6.12.

What I discovered is:

<quote>
random_page_cost (floating point)

Sets the planner's estimate of the cost of a nonsequentially
fetched disk page. This is measured as a multiple of the cost of a
sequential page fetch. A higher value makes it more likely a
sequential scan will be used, a lower value makes it more likely
an index scan will be used. The default is four.
</quote>

is not precise enough. Which pages? Those that belong to the dbase
file or sequential pages on the media?

On dbases smaller (calculated from du <dbase_dir>)than 500M, I got a
ratio (random over sequential time) of 4.5:1. A 3.0GB dbase has a
ratio of 10:1. On a 3GB contiguous file, the ratio is about 4:1.

If, in fact, the pages meant in the quotation are pages occupied by
the dbase files, then does that mean the RPC config should be changed
over time to reflect the varying ratio (which I guess is due to file
fragmentation)? If that's the case, isn't RPC config actually a
per-database config rather than a per-cluster config?

Thanks,
YS (gnome)

Attachment Content-Type Size
determine_rpc.c text/x-csrc 7.1 KB

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Yohanes Santoso <pgsql-hackers(at)microjet(dot)ath(dot)cx>
Subject: Re: determining random_page_cost value
Date: 2005-10-25 18:08:08
Message-ID: 200510251108.09043.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Yohanes,

> Yesterday in #pgsql, I was talking with neilc about determining rpc
> value in a more concrete way. So I created a program that compares
> exhaustive (all blocks are eventually read) random reads with
> sequential reads. The full source is attached.

Thanks for code.

> I tested the db files residing on a software RAID-1 composed of 2 IDE
> 7200rpm drives on linux 2.6.12.

FWIW, most performance-conscious users will be using a SCSI RAID array.

> is not precise enough. Which pages? Those that belong to the dbase
> file or sequential pages on the media?

Well, it's actually calculating the cost ratio of pulling non-sequential
random *rows* from the db files against pulling sequential blocks.

> On dbases smaller (calculated from du <dbase_dir>)than 500M, I got a
> ratio (random over sequential time) of 4.5:1. A 3.0GB dbase has a
> ratio of 10:1. On a 3GB contiguous file, the ratio is about 4:1.

All of this goes to uphold Tom's general assertion that the default of 4 is
more or less correct but the calculation in which we're using that number is
not.

--
Josh Berkus
Aglio Database Solutions
San Francisco


From: Yohanes Santoso <pgsql-hackers(at)microjet(dot)ath(dot)cx>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: determining random_page_cost value
Date: 2005-10-25 20:37:34
Message-ID: 87d5ltcoz5.fsf@microjet.ath.cx
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:

>> I tested the db files residing on a software RAID-1 composed of 2 IDE
>> 7200rpm drives on linux 2.6.12.
>
> FWIW, most performance-conscious users will be using a SCSI RAID
> array.

No worry, I'm not out to squeeze every little juice from a particular
installation, which in this case is my home computer. I am interested
in automating estimation of a suitable RPC value for a given
installation.

> Well, it's actually calculating the cost ratio of pulling non-sequential
> random *rows* from the db files against pulling sequential blocks.

Then running it against the db files should yield better estimation
than on sequential pages.

>> On dbases smaller (calculated from du <dbase_dir>)than 500M, I got a
>> ratio (random over sequential time) of 4.5:1. A 3.0GB dbase has a
>> ratio of 10:1. On a 3GB contiguous file, the ratio is about 4:1.
>
> All of this goes to uphold Tom's general assertion that the default of 4 is
> more or less correct

Doesn't this show that 4:1 is a pretty optimistic value considering
that no long-running db files are fragmentation-free?

>but the calculation in which we're using that number is
> not.

The calculation inside the planner, IOW, how the planner uses the RPC
value?

Thanks,
YS.


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Yohanes Santoso <pgsql-hackers(at)microjet(dot)ath(dot)cx>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: determining random_page_cost value
Date: 2005-10-26 21:56:11
Message-ID: 20051026215611.GJ16682@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Oct 25, 2005 at 04:37:34PM -0400, Yohanes Santoso wrote:
> > All of this goes to uphold Tom's general assertion that the default of 4 is
> > more or less correct
>
> Doesn't this show that 4:1 is a pretty optimistic value considering
> that no long-running db files are fragmentation-free?
>
> >but the calculation in which we're using that number is
> > not.
>
> The calculation inside the planner, IOW, how the planner uses the RPC
> value?

The problem with RPC is that the estimator functions are sub-optimal in
many cases and tend to favor seqscan when they shouldn't. This is why
many people run with RPC set unrealistically low, such as 2.

IMHO until the estimator algorithms improve worrying about RPC is
pointless.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461