Re: bg worker: general purpose requirements

From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bg worker: general purpose requirements
Date: 2010-09-21 15:31:25
Message-ID: 4C98CFCD.9070403@bluegap.ch
Lists: pgsql-hackers

On 09/21/2010 03:46 PM, Robert Haas wrote:
> Wait, are we in violent agreement here? An overall limit on the
> number of parallel jobs is exactly what I think *does* make sense.
> It's the other knobs I find odd.

Note that the max setting I've been talking about here is the maximum
number of *idle* workers allowed. It does not include busy bgworkers.
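
To illustrate what I mean, a minimal sketch (all names hypothetical,
this is not the actual Postgres-R code):

  /* hypothetical GUCs and helpers, for illustration only */
  extern int  min_idle_workers;
  extern int  max_idle_workers;
  extern void start_workers(int n);
  extern void stop_workers(int n);

  /*
   * Keep the number of *idle* workers within [min, max].  Busy
   * workers are not counted against the limit at all.
   */
  static void
  adjust_idle_pool(int idle_count)
  {
      if (idle_count < min_idle_workers)
          start_workers(min_idle_workers - idle_count);
      else if (idle_count > max_idle_workers)
          stop_workers(idle_count - max_idle_workers);
  }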

> I guess we differ on the meaning of "cope well"... being able to spin
> up 18 workers in one second seems very fast to me.

Well, it's obviously use-case dependent. With Postgres-R (and
synchronous replication in general), people are very sensitive to
latency. There's network latency already; adding another 50 ms on top
for no good reason is not going to make these people happy.

> How many do you expect to ever need?!!

Again, very different. For Postgres-R, easily a couple dozen. The same
applies to parallel querying with multiple concurrent parallel queries.

> Possibly, but I'm still having a hard time understanding why you need
> all the complexity you already have.

To make sure we only pay the startup cost on very rare occasions, and
not every time the workload changes a bit (or fails to conform to some
arbitrary timeout).

(BTW, the min/max approach is hardly any more complex than a timeout.
It doesn't even need a syscall.)
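
To compare (again just a sketch; apart from GetCurrentTimestamp() and
TimestampDifferenceExceeds(), which are the stock backend calls, all
names are made up):

  /* min/max: a single comparison against an in-memory counter */
  if (idle_count > max_idle_workers)
      stop_workers(idle_count - max_idle_workers);

  /* timeout: needs the current time, i.e. a gettimeofday() somewhere */
  if (TimestampDifferenceExceeds(last_activity_time,
                                 GetCurrentTimestamp(),
                                 idle_timeout_ms))
      stop_workers(1);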

> It seems (to me) like your design is being driven by start-up latency,
> which I just don't understand. Sure, 50 ms to start up a worker isn't
> fantastic, but the idea is that it won't happen much because there
> will probably already be a worker in that database from previous
> activity. The only exception is when there's a sudden surge of
> activity.

I'm less optimistic about the consistency of the workload.

> But I don't think that's the case to optimize for. If a
> database hasn't had any activity in a while, I think it's better to
> reclaim the memory and file descriptors and ProcArray slots that we're
> spending on it so that the rest of the system can run faster.

Absolutely. That's what I call a change in workload. The min/max
approach certainly reclaims unused workers faster, but (depending on
the max setting) it doesn't necessarily ever go down to zero.
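
To put numbers on it (made up for illustration): with max_idle = 5, a
burst that leaves 20 workers idle gets trimmed back to 5 right away,
but those 5 then stick around forever. A pure timeout eventually
reclaims them as well, it just takes the full timeout to get there.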

Regards

Markus Wanner
