Re: Script to compute random page cost

From: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
To: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Script to compute random page cost
Date: 2002-09-09 07:20:42
Message-ID: GNELIHDDFBOCMGBFGEFOMECFCEAA.chriskl@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I got:

random_page_cost = 0.807018

For FreeBSD 4.4/i386

With 512MB RAM & SCSI HDD

Chris

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org]On Behalf Of Bruce Momjian
> Sent: Monday, 9 September 2002 2:14 PM
> To: PostgreSQL-development
> Subject: Re: [HACKERS] Script to compute random page cost
>
>
>
> OK, turns out that the loop for sequential scan ran fewer times and was
> skewing the numbers. I have a new version at:
>
> ftp://candle.pha.pa.us/pub/postgresql/randcost
>
> I get _much_ lower numbers now for random_page_cost.
>
> ------------------------------------------------------------------
> ---------
>
> Bruce Momjian wrote:
> > Because we have seen many complains about sequential vs index scans, I
> > wrote a script which computes the value for your OS/hardware
> > combination.
> >
> > Under BSD/OS on one SCSI disk, I get a random_page_cost around 60. Our
> > current postgresql.conf default is 4.
> >
> > What do other people get for this value?
> >
> > Keep in mind if we increase this value, we will get a more sequential
> > scans vs. index scans.
> >
> > One flaw in this test is that it randomly reads blocks from different
> > files rather than randomly reading from the same file. Do people have a
> > suggestion on how to correct this? Does it matter?
> >
> > --
> > Bruce Momjian | http://candle.pha.pa.us
> > pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> > + If your life is a hard drive, | 13 Roberts Road
> > + Christ can be your backup. | Newtown Square,
> Pennsylvania 19073
>
> > #!/bin/bash
> >
> > trap "rm -f /tmp/$$" 0 1 2 3 15
> >
> > BLCKSZ=8192
> >
> > if [ "$RANDOM" = "$RANDOM" ]
> > then echo "Your shell does not support \$RANDOM. Try
> using bash." 1>&2
> > exit 1
> > fi
> >
> > # XXX We assume 0 <= random <= 32767
> >
> > echo "Collecting sizing information ..."
> >
> > TEMPLATE1=`du -s "$PGDATA/base/1" | awk '{print $1}'`
> > FULL=`du -s "$PGDATA/base" | awk '{print $1}'`
> > if [ "$FULL" -lt `expr "$TEMPLATE1" \* 4` ]
> > then echo "Your installation should have at least four
> times the data stored in template1 to yield meaningful results" 1>&2
> > exit 1
> > fi
> >
> > find "$PGDATA/base" -type f -exec ls -ld {} \; |
> > awk '$5 % '"$BLCKSZ"' == 0 {print $5 / '"$BLCKSZ"', $9}' |
> > grep -v '^0 ' > /tmp/$$
> >
> > TOTAL=`awk 'BEGIN {sum=0}
> > {sum += $1}
> > END {print sum}' /tmp/$$`
> >
> > echo "Running random access timing test ..."
> >
> > START=`date '+%s'`
> > PAGES=1000
> >
> > while [ "$PAGES" -ne 0 ]
> > do
> > BIGRAND=`expr "$RANDOM" \* 32768 + "$RANDOM"`
> >
> > OFFSET=`awk 'BEGIN{printf "%d\n", ('"$BIGRAND"' / 2^30) *
> '"$TOTAL"'}'`
> >
> > RESULT=`awk ' BEGIN {offset = 0}
> > offset + $1 > '"$OFFSET"' \
> > {print $2, '"$OFFSET"' - offset ; exit}
> > {offset += $1}' /tmp/$$`
> > FILE=`echo "$RESULT" | awk '{print $1}'`
> > OFFSET=`echo "$RESULT" | awk '{print $2}'`
> >
> > dd bs="$BLCKSZ" seek="$OFFSET" count=1 if="$FILE"
> of="/dev/null" >/dev/null 2>&1
> > PAGES=`expr "$PAGES" - 1`
> > done
> >
> > STOP=`date '+%s'`
> > RANDTIME=`expr "$STOP" - "$START"`
> >
> > echo "Running sequential access timing test ..."
> >
> > START=`date '+%s'`
> > # We run the random test 10 times more because it is quicker and
> > # we need it to run for a while to get accurate results.
> > PAGES=10000
> >
> > while [ "$PAGES" -ne 0 ]
> > do
> > BIGRAND=`expr "$RANDOM" \* 32768 + "$RANDOM"`
> >
> > OFFSET=`awk 'BEGIN{printf "%d\n", ('"$BIGRAND"' / 2^30) *
> '"$TOTAL"'}'`
> >
> > RESULT=`awk ' BEGIN {offset = 0}
> > offset + $1 > '"$OFFSET"' \
> > {print $2, $1; exit}
> > {offset += $1}' /tmp/$$`
> > FILE=`echo "$RESULT" | awk '{print $1}'`
> > FILEPAGES=`echo "$RESULT" | awk '{print $2}'`
> >
> > if [ "$FILEPAGES" -gt "$PAGES" ]
> > then FILEPAGES="$PAGES"
> > fi
> >
> > dd bs="$BLCKSZ" count="$FILEPAGES" if="$FILE"
> of="/dev/null" >/dev/null 2>&1
> > PAGES=`expr "$PAGES" - "$FILEPAGES"`
> > done
> >
> > STOP=`date '+%s'`
> > SEQTIME=`expr "$STOP" - "$START"`
> >
> > echo
> > awk 'BEGIN {printf "random_page_cost = %f\n", ('"$RANDTIME"' /
> '"$SEQTIME"') * 10}'
>
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> > message can get through to the mailing list cleanly
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square,
> Pennsylvania 19073
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Page 2002-09-09 07:37:06 Re: Script to compute random page cost
Previous Message Bruce Momjian 2002-09-09 06:13:54 Re: Script to compute random page cost