Re: Running a query twice to ensure cached results.

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Running a query twice to ensure cached results.
Date: 2006-06-13 12:42:08
Message-ID: 20060613124208.GC19212@svana.org
Lists: pgsql-hackers

On Tue, Jun 13, 2006 at 04:54:05AM -0700, Luke Lonergan wrote:
> > Experimental results here suggest that for larger tables Linux seems
> > to detect a seq-scan and not bother caching. It's very reproducible
> > for me here to do a reboot and not see the full speedup on a seq_scan
> > until the third time I run a query.
>
> What you are seeing is the now infamous "Postgres writes a table one more
> time after loading" behavior.
>
> Simon Riggs once dug into it to find the root cause, and I no longer recall
> exactly why, but after you've loaded data, the first seq scan will re-write
> some large portion of the data while doing the initial scan. This wreaks
> havoc on normal benchmarking practices.

Is it possible it may have something to do with the hint bits? There
are a bunch of bits in the header that speed up MVCC tests. Maybe
changing those bits marks the page dirty and forces a write?
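
To make the hypothesis concrete, here is a toy Python model of it. It is
only a sketch of the idea, not PostgreSQL's actual code: the names
XMIN_COMMITTED, HeapTuple, Page, is_visible and seq_scan are all made up
for illustration, and the commit-log lookup is replaced by a plain set.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical flag value chosen for this sketch only.
XMIN_COMMITTED = 0x01


@dataclass
class HeapTuple:
    xmin: int          # id of the transaction that inserted the row
    infomask: int = 0  # header bits, including the hint bit


@dataclass
class Page:
    tuples: List[HeapTuple]
    dirty: bool = False


def is_visible(tup: HeapTuple, page: Page, committed: Set[int]) -> bool:
    """Toy visibility test that caches the commit status in a hint bit."""
    if tup.infomask & XMIN_COMMITTED:
        return True  # fast path: hint bit already set, nothing is modified
    if tup.xmin in committed:  # slow path: stand-in for a commit-log lookup
        tup.infomask |= XMIN_COMMITTED
        page.dirty = True  # the header changed, so the page must be written
        return True
    return False


def seq_scan(pages: List[Page], committed: Set[int]) -> int:
    """Scan every tuple and return how many pages this scan dirtied."""
    dirtied = 0
    for page in pages:
        page.dirty = False  # assume the page was flushed before this scan
        for tup in page.tuples:
            is_visible(tup, page, committed)
        if page.dirty:
            dirtied += 1
    return dirtied
```

In this model, the first scan over freshly loaded pages dirties every
page, and a second scan dirties none, which would match the "written one
more time after loading" symptom if the hypothesis holds.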

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
