Re: Backup throttling

From: Antonin Houska <antonin(dot)houska(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Backup throttling
Date: 2014-01-20 16:10:15
Message-ID: 52DD4A67.8040606@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/15/2014 10:52 PM, Alvaro Herrera wrote:
> I gave this patch a look. There was a bug that the final bounds check
> for int32 range was not done when there was no suffix, so in effect you
> could pass numbers larger than UINT_MAX and pg_basebackup would not
> complain until the number reached the server via BASE_BACKUP. Maybe
> that's fine, given that numbers above 1G will cause a failure on the
> server side anyway, but it looked like a bug to me. I tweaked the parse
> routine slightly; other than fix the bug, I made it accept fractional
> numbers, so you can say 0.5M for instance.

Thanks.

> Perhaps we should make sure pg_basebackup rejects numbers larger than 1G
> as well.

Is there a good place to define the constant, so that both backend and
client can use it? I'd say 'include/common' but no existing file seems
to be appropriate. I'm not sure if it's worth to add a new file there.

> Another thing I found a bit strange was the use of the latch. What this
> patch does is create a separate latch which is used for the throttling.
> This means that if the walsender process receives a signal, it will not
> wake up if it's sleeping in throttling. Perhaps this is okay: as Andres
> was quoted upthread as saying, maybe this is not a problem because the
> sleep times are typically short anyway. But we're pretty much used to
> the idea that whenever a signal is sent, processes act on it
> *immediately*. Maybe some admin will not feel comfortable about waiting
> some extra 20ms when they cancel their base backups. In any case,
> having a secondary latch to sleep on in a process seems weird. Maybe
> this should be using MyWalSnd->latch somehow.

o.k., MyWalSnd->latch is used now.

> You have this interesting THROTTLING_MAX_FREQUENCY constant defined to
> 128, with the comment "check this many times per second".
> Let's see: if the user requests 1MB/s, this value results in
> throttling_sample = 1MB / 128 = 8192. So for every 8kB transferred, we
> would stop, check the wall clock time, and if less time has lapsed than
> we were supposed to spend transferring those 8kB then we sleep. Isn't a
> check every 8kB a bit too frequent? This doesn't seem sensible to me.
> I think we should be checking perhaps every tenth of the requested
> maximum rate, or something like that, not every 1/128th.
>
> Now, what the code actually does is not necessarily that, because the
> sampling value is clamped to a minimum of 32 kB. But then if we're
> going to do that, why use such a large divisor value in the first place?
> I propose we set that constant to a smaller value such as 8.

I tried to use THROTTLING_SAMPLE_MIN and THROTTLING_MAX_FREQUENCY to
control both the minimum and maximum chunk size. It was probably too
generic, THROTTLING_SAMPLE_MIN is no longer there.

New patch version is attached.

// Antonin Houska (Tony)

Attachment Content-Type Size
backup_throttling_v7.patch text/x-patch 16.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-01-20 16:19:21 Re: Performance Improvement by reducing WAL for Update Operation
Previous Message Stephen Frost 2014-01-20 16:02:53 Re: ALTER TABLESPACE ... MOVE ALL TO ...