Synchronized Scan preliminary results

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Synchronized Scan preliminary results
Date: 2006-12-10 22:18:49
Message-ID: 457C87C9.9010100@j-davis.com
Lists: pgsql-hackers

I posted a patch on -patches for reference.

The preliminary results I got against that patch are mixed.

I did 4 runs, each against an 11GB table on a machine with 1GB of RAM.
Yes, I know this is low-end, consumer-grade hardware, but it controls
for variables in the I/O layer (like the controller's read-ahead or I/O
scheduling) and simplifies the test. Each test ran 4 threads, each
doing two iterations of a COUNT(*), with the threads started 1 minute
apart. For reference, a single-threaded COUNT(*) takes about 334
seconds.

My findings:
(1) With shared_buffers=24MB, plain 8.2 took 2227 seconds to complete
all threads (each thread finished its 2 scans in about 2100 seconds),
whereas with my patch it took 901 seconds (about 721 seconds per
thread).
(2) With shared_buffers=128MB, plain 8.2 took 2842 seconds (about 2600
seconds per thread), whereas with my patch it took 899 seconds (about
718 seconds per thread).

Conclusions from the tests: First, my patch can be effective: with the
patch, each thread finished its two scans in roughly 720 seconds, or
about 360 seconds per scan, which is close to the 334-second
single-scan baseline. Second, the normal behavior is itself quite
unpredictable: why did increasing shared_buffers destroy performance?
It didn't seem like a big enough increase to wipe out the OS buffer
cache. Also, the scans were quite stable within a test; there weren't
wild variations from one scan to the next.

Luke also provided results of his own, from a test on much better
hardware (the patch he used is identical from a technical standpoint,
but may have some cosmetic differences):

"It uses five simultaneous scans as before, but this time the table is
120GB on a machine with 8GB of RAM. The data is stored on one non-raid
disk, so a single scan should take about 33 minutes.

The scans are started 5 minutes apart, so the last scan would end in
about 3 hours if they were independent (they're not). 8.2 unmodified
runs the test in about 4 hours and 20 minutes. With the first patch, it
runs in about 5 hours and 30 minutes."

So, Luke experienced an actual slowdown.

I think there is a lot of room for improvement over the normal
behavior: first, we can make it more predictable, and second, we can
make the average case better when the scans are larger than the
available memory.

However, my patch has a long way to go. I need to figure out what causes
the performance degradation in Luke's case. My plan is:

(1) Try to add some better instrumentation to the patch, as Simon suggested.
(2) I'll allocate a new machine, put Solaris on it, and try to use
DTrace. Maybe that will tell me something useful. I have limited
experience with DTrace, so if someone else wants to help, let me know.
(3) I'll make sure the patch only turns on if the table size is
greater than some multiple of effective_cache_size (see the first
sketch after this list).
(4) Jim had the idea to start the scan before the hint, to take
advantage of the already-existing cache trail. This can be done by
only storing the hint once the scan's current position is past its
start location plus the amount we subtract from the hint; that amount
could be some fraction of effective_cache_size (also covered in the
first sketch below).
(5) Should I issue a warning if there is a collision in the table?
Florian Pflug raised the concern that collisions could cause
mysterious performance regressions.
(6) Heikki suggested that each backend hold, in local memory, a bitmap
of the blocks it has read. I like this idea a lot, but there are many
considerations (see the second sketch below).

Others on this list have suggested that some level of enforced
synchronization might help. Tom was interested in packs of scans
moving together, and there was a lot of discussion along those lines.
I think the results show that my patch needs to do something similar,
because we don't want to actually lose performance in any case.
However, I think it's too early to say what we need to do without more
tests. The challenge for me is that each of these tests takes hours,
and I can't necessarily reproduce problems.

Thanks to everyone for your input so far.

Regards,
Jeff Davis
