Re: Cost limited statements RFC

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cost limited statements RFC
Date: 2013-06-08 21:45:27
Message-ID: 51B3A5F7.7070904@2ndQuadrant.com
Lists: pgsql-hackers

On 6/8/13 5:17 PM, Jeff Janes wrote:

> But my gut feeling is that if autovacuum is trying to read faster than
> the hardware will support, it will just automatically get throttled, by
> inherent IO waits, at a level which can be comfortably supported. And
> this will cause minimal interference with other processes.

If this were true all the time, autovacuum tuning would be a lot
easier. You can easily make a whole server unresponsive by letting
loose one rogue process doing a lot of reads. Right now this isn't a
problem for autovacuum, because any one process running at 7.8MB/s is
usually not a big deal; it doesn't take much in the way of read-ahead
logic and throughput to satisfy that. But I've seen people try to
push the read rate upward, and they didn't get very far beyond that
before it was way too obtrusive.
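
For anyone who wants to check my math, the 7.8MB/s number is just
what falls out of the cost parameters at their current defaults
(vacuum_cost_limit=200, which autovacuum inherits, a 20ms cost delay,
vacuum_cost_page_miss=10, 8kB pages), assuming every page read is a
miss. A quick Python version:

    cost_limit = 200    # vacuum_cost_limit default
    cost_delay = 0.020  # autovacuum_vacuum_cost_delay default, seconds
    page_miss = 10      # vacuum_cost_page_miss default
    page_dirty = 20     # vacuum_cost_page_dirty default
    block = 8192        # bytes per page

    # 20 page misses exhaust the budget, then vacuum sleeps for the delay
    read_rate = (cost_limit / page_miss) / cost_delay * block
    write_rate = (cost_limit / page_dirty) / cost_delay * block
    print(read_rate / 2**20, write_rate / 2**20)  # ~7.8 and ~3.9 MiB/s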

I could collect some data from troubled servers to see how high I can
push the read rate before they suffer. Maybe there's a case there for
increasing the default read rate, with the write limit acting as a
good enough secondary throttle. I'd be surprised if we could get away
with more than a 2 or 3X increase though, and the idea of going
unlimited is really scary. It took me a year of before/after data
collection before I was confident that it's OK to run unrestricted in
all cache hit situations.

> Why is there so much random IO? Do your systems have
> autovacuum_vacuum_scale_factor set far below the default? Unless they
> do, most of the IO (both read and write) should be sequential.

Insert one process doing sequential reads into a stream of other
activity and you can easily get random I/O against the disks out of the
mix. You don't necessarily need the other activity to be random to get
that. N sequential readers eventually act like N random readers for
high enough values of N. On busy servers, autovacuum is normally
competing against multiple random I/O processes though.
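
To make the "sequential turns into random" point concrete, here's a
toy model (mine, nothing the database actually does): round-robin N
perfectly sequential readers and count how often the block order the
disk sees still advances by one:

    def seeks(n_readers, pages_each):
        # Each reader scans its own distant region sequentially
        streams = [iter(range(r * 100000, r * 100000 + pages_each))
                   for r in range(n_readers)]
        order = []
        for _ in range(pages_each):   # crude round-robin scheduler
            for s in streams:
                order.append(next(s))
        return sum(1 for a, b in zip(order, order[1:]) if b != a + 1)

    print(seeks(1, 1000))  # 0: a lone reader stays sequential
    print(seeks(8, 1000))  # 7999: nearly every transfer is a seek

Real schedulers and OS read-ahead soften this, but not by enough once
N gets large.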

Also, the database's theoretical model that block number correlates
directly with location on disk can break down. I haven't measured it
directly, but systems with vacuum problems seem more likely to have
noticeable filesystem-level fragmentation. I've
been thinking about collecting data from a few systems with filefrag to
see if I'm right about that.
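
The survey I have in mind is roughly this, if anyone wants to beat me
to it (a sketch assuming Linux with filefrag from e2fsprogs; the data
directory path is just an example):

    import glob, os, subprocess

    datadir = "/var/lib/postgresql/9.2/main"  # adjust to your install
    for path in glob.glob(os.path.join(datadir, "base", "*", "*")):
        if not os.path.isfile(path):
            continue
        out = subprocess.run(["filefrag", path], capture_output=True,
                             text=True).stdout
        print(out.strip())  # e.g. "<path>: 12 extents found"

Relations that vacuum keeps rewriting should show up with many more
extents than ones that rarely change, if I'm right about this.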

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
