Re: Index Scans become Seq Scans after VACUUM ANALYSE

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Luis Alberto Amigo Navarro <lamigo(at)atc(dot)unican(dot)es>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date: 2002-04-17 17:15:40
Message-ID: 1019063745.1924.14.camel@taru.tm.ee
Lists: pgsql-hackers

On Wed, 2002-04-17 at 17:16, Tom Lane wrote:
>
> It's entirely possible that the default value of random_page_cost is too
> high, at least for many modern machines. The experiments I did to get
> the 4.0 figure were done a couple years ago, on hardware that wasn't
> exactly new at the time. I have not heard of anyone else trying to
> measure it though.
>
> I don't think I have the source code I used anymore, but the principle
> is simple enough:
>
> 1. Make a large file (several times the size of your machine's RAM, to
> ensure you swamp out kernel disk buffering effects). Fill with random
> data. (NB: do not fill with zeroes, some filesystems optimize this away.)

People running postgres often already have large files of random data
under the $PGDATA directory :)

> 2. Time reading the file sequentially, 8K per read request.
> Repeat enough to get a statistically trustworthy number.
>
> 3. Time reading randomly-chosen 8K pages from the file. Repeat
> enough to get a trustworthy number (the total volume of pages read
> should be several times the size of your RAM).
>
> 4. Divide.
>
> The only tricky thing about this is making sure you are measuring disk
> access times and not being fooled by re-accessing pages the kernel still
> has cached from a previous access. (The PG planner does try to account
> for caching effects, but that's a separate estimate; the value of
> random_page_cost isn't supposed to include caching effects.) AFAIK the
> only good way to do that is to use a large test, which means it takes
> awhile to run; and you need enough spare disk space for a big test file.
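The four steps above can be sketched roughly as follows. This is a minimal Python sketch, not Tom's original program; the function name and parameters are mine. As the quoted text warns, the file must be several times the size of RAM for the ratio to mean anything, since on a small file both passes hit the kernel cache and the ratio collapses toward 1.0.

```python
import os
import random
import time

BLOCK = 8192  # 8K per read request, matching PostgreSQL's page size

def measure_random_page_cost(path, file_blocks):
    """Estimate random_page_cost as (random-read time) / (sequential-read time).

    For a trustworthy number, file_blocks * 8K must be several times RAM.
    """
    # Step 1: fill the file with random data (not zeroes; some
    # filesystems optimize zero blocks away).
    with open(path, "wb") as f:
        for _ in range(file_blocks):
            f.write(os.urandom(BLOCK))

    fd = os.open(path, os.O_RDONLY)
    try:
        # Step 2: read the file sequentially, 8K per request.
        start = time.perf_counter()
        while os.read(fd, BLOCK):
            pass
        seq_time = time.perf_counter() - start

        # Step 3: read the same number of randomly chosen 8K pages.
        start = time.perf_counter()
        for _ in range(file_blocks):
            os.lseek(fd, random.randrange(file_blocks) * BLOCK, os.SEEK_SET)
            os.read(fd, BLOCK)
        rand_time = time.perf_counter() - start
    finally:
        os.close(fd)

    # Step 4: divide.
    return rand_time / seq_time
```

In a real run one would also repeat each pass enough times to get a statistically stable number, as step 2 and step 3 both say.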

If you have the machine all to yourself, you can usually tell it to use
less RAM at boot time.

On Linux this is the append="mem=32M" switch in lilo.conf, or just
mem=32M on the LILO boot command line.
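For illustration, the lilo.conf stanza would look something like this (the image path and label are assumptions; only the append line matters here):

```
# Cap the kernel at 32 MB of RAM so a test file easily exceeds it
image=/boot/vmlinuz
    label=linux
    append="mem=32M"
```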

> It'd be interesting to get some numbers for this across a range of
> hardware, filesystems, etc ...

---------------
Hannu
