From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: dynamic background workers
Date: 2013-06-14 21:00:13
Message-ID: CA+TgmoYtQQ-JqAJPxZg3Mjg7EqugzqQ+ZBrpnXo95chWMCZsXw@mail.gmail.com
Lists: pgsql-hackers

Parallel query, or any subset of that project such as parallel sort,
will require a way to start background workers on demand. Thanks to
Alvaro's work on 9.3, we now have the ability to configure background
workers via shared_preload_libraries. But if you don't have the right
library loaded at startup time, and subsequently wish to add a
background worker while the server is running, you are out of luck.
Even if you do have the right library loaded, but want to start
workers in response to user activity, rather than when the database
comes on-line, you are also out of luck. Relaxing these restrictions
is essential for parallel query (or parallel processing of any kind),
and useful apart from that. Two patches are attached.

The first patch, max-worker-processes-v1.patch, adds a new GUC
max_worker_processes, which defaults to 8. This fixes the problem
discussed here:

http://www.postgresql.org/message-id/CA+TgmobguVO+qHnHvxBA2TFkDhw67Y=4Bp405FVABEc_EtO4VQ@mail.gmail.com

Apart from fixing that problem, it's a pretty boring patch.

The second patch, dynamic-bgworkers-v1.patch, revises the background
worker API to allow background workers to be started dynamically.
This requires some communication channel from ordinary workers to the
postmaster, because it is the postmaster that must ultimately start
the newly-registered workers. However, that communication channel has
to be designed pretty carefully, lest a shared memory corruption take
out the postmaster and lead to inadvertent failure to restart after a
crash. Here's how I implemented that: there's an array in shared
memory of a size equal to max_worker_processes. This array is
separate from the backend-private list of workers maintained by the
postmaster, but the two are kept in sync. When a new background
worker registration is added to the shared data structure, the backend
adding it uses the existing pmsignal mechanism to kick the postmaster,
which then scans the array for new registrations. I have attempted to
make the code that transfers the shared memory state into the
postmaster's private state as paranoid as humanly possible. The
precautions taken are documented in the comments. Conversely, when a
background worker flagged as BGW_NEVER_RESTART is considered for
restart (and we decide against it), the corresponding slot in the
shared memory array is marked as no longer in use, allowing it to be
reused for a new registration.
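
In rough outline, the backend side of this looks something like the
sketch below. (A simplification for illustration only: the names, the
slot layout, and the pmsignal reason are placeholders rather than the
exact patch contents, and the real code does a good deal more
validation.)

typedef struct BackgroundWorkerSlot
{
	bool				in_use;	/* the postmaster reads this without locks */
	BackgroundWorker	worker;	/* copy of the registration data */
} BackgroundWorkerSlot;

bool
RegisterDynamicBackgroundWorker(BackgroundWorker *worker)
{
	bool	found = false;
	int		slotno;

	LWLockAcquire(BackgroundWorkerLock, LW_EXCLUSIVE);
	for (slotno = 0; slotno < max_worker_processes; slotno++)
	{
		BackgroundWorkerSlot *slot = &BackgroundWorkerArray[slotno];

		if (!slot->in_use)
		{
			memcpy(&slot->worker, worker, sizeof(BackgroundWorker));
			pg_write_barrier();		/* worker data before in_use flag */
			slot->in_use = true;
			found = true;
			break;
		}
	}
	LWLockRelease(BackgroundWorkerLock);

	/* kick the postmaster so that it rescans the array */
	if (found)
		SendPostmasterSignal(PMSIGNAL_BACKGROUND_WORKER_CHANGE);

	return found;
}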

Since the postmaster cannot take locks, synchronization between the
postmaster and other backends using the shared memory segment has to
be lockless. This mechanism is also documented in the comments. An
lwlock is used to prevent two backends that are both registering a new
worker at about the same time from stomping on each other, but the
postmaster need not care about that lwlock.
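
The postmaster's side of that protocol, in the same simplified style:
it scans the array without ever touching the lwlock, pairing a read
barrier with the write barrier in the sketch above. (Again illustrative
only; the helper shown is hypothetical.)

void
BackgroundWorkerStateChange(void)
{
	int		slotno;

	for (slotno = 0; slotno < max_worker_processes; slotno++)
	{
		BackgroundWorkerSlot *slot = &BackgroundWorkerArray[slotno];
		RegisteredBgWorker *rw;

		if (!slot->in_use)
			continue;			/* free slot, nothing to do */

		pg_read_barrier();		/* in_use flag before worker data */

		if (slot_already_transferred(slotno))	/* hypothetical check */
			continue;

		/*
		 * Copy into postmaster-private memory and sanity-check every
		 * field of the copy; the postmaster must never chase pointers
		 * it found in shared memory, which a backend could corrupt.
		 */
		rw = malloc(sizeof(RegisteredBgWorker));
		if (rw == NULL)
			continue;
		memcpy(&rw->rw_worker, &slot->worker, sizeof(BackgroundWorker));
		/* ... validate rw, then add it to the private worker list ... */
	}
}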

This patch also extends worker_spi as a demonstration of the new
interface. With this patch, you can CREATE EXTENSION worker_spi and
then call worker_spi_launch(int4) to launch a new background worker,
or combine it with generate_series() to launch a bunch at once. Then
you can kill them off with pg_terminate_backend() and start some new
ones. That, in my humble opinion, is pretty cool.
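
For example, a quick session might look like this (the PID in the last
line is of course a placeholder for whatever pg_stat_activity or ps
reports for one of the workers):

CREATE EXTENSION worker_spi;

-- launch one worker ...
SELECT worker_spi_launch(1);

-- ... or a batch of them in one go
SELECT worker_spi_launch(i) FROM generate_series(2, 4) AS i;

-- later, kill one off again
SELECT pg_terminate_backend(12345);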

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment                     Content-Type              Size
max-worker-processes-v1.patch  application/octet-stream  14.5 KB
dynamic-bgworkers-v1.patch     application/octet-stream  35.0 KB

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic background workers
Date: 2013-06-15 01:28:39
Message-ID: CAB7nPqR8pySQFcsrfVrC2fHwWtaLQQ4UMzHyH5UM594pKd4kxA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 15, 2013 at 6:00 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> The second patch, dynamic-bgworkers-v1.patch, revises the background
> worker API to allow background workers to be started dynamically.
> This requires some communication channel from ordinary workers to the
> postmaster, because it is the postmaster that must ultimately start
> the newly-registered workers. However, that communication channel has
> to be designed pretty carefully, lest a shared memory corruption take
> out the postmaster and lead to inadvertent failure to restart after a
> crash. Here's how I implemented that: there's an array in shared
> memory of a size equal to max_worker_processes. This array is
> separate from the backend-private list of workers maintained by the
> postmaster, but the two are kept in sync. When a new background
> worker registration is added to the shared data structure, the backend
> adding it uses the existing pmsignal mechanism to kick the postmaster,
> which then scans the array for new registrations. I have attempted to
> make the code that transfers the shared memory state into the
> postmaster's private state as paranoid as humanly possible. The
> precautions taken are documented in the comments. Conversely, when a
> background worker flagged as BGW_NEVER_RESTART is considered for
> restart (and we decide against it), the corresponding slot in the
> shared memory array is marked as no longer in use, allowing it to be
> reused for a new registration.
>
> Since the postmaster cannot take locks, synchronization between the
> postmaster and other backends using the shared memory segment has to
> be lockless. This mechanism is also documented in the comments. An
> lwlock is used to prevent two backends that are both registering a new
> worker at about the same time from stomping on each other, but the
> postmaster need not care about that lwlock.
>
> This patch also extends worker_spi as a demonstration of the new
> interface. With this patch, you can CREATE EXTENSION worker_spi and
> then call worker_spi_launch(int4) to launch a new background worker,
> or combine it with generate_series() to launch a bunch at once. Then
> you can kill them off with pg_terminate_backend() and start some new
> ones. That, in my humble opinion, is pretty cool.
>
This looks really interesting, +1. I'll test the patch if possible next
week.
--
Michael


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic background workers
Date: 2013-06-16 12:21:06
Message-ID: CA+U5nMLVKbDzRBYK_n3G3CToza23jG7mzTPqMqVddiKCHgcMkw@mail.gmail.com
Lists: pgsql-hackers

On 14 June 2013 22:00, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Parallel query, or any subset of that project such as parallel sort,
> will require a way to start background workers on demand. Thanks to
> Alvaro's work on 9.3, we now have the ability to configure background
> workers via shared_preload_libraries. But if you don't have the right
> library loaded at startup time, and subsequently wish to add a
> background worker while the server is running, you are out of luck.
> Even if you do have the right library loaded, but want to start
> workers in response to user activity, rather than when the database
> comes on-line, you are also out of luck. Relaxing these restrictions
> is essential for parallel query (or parallel processing of any kind),
> and useful apart from that. Two patches are attached.

Your proposal is exactly what we envisaged, and parallel query was
always a target for background workers. The restrictions were only there
to ensure we got the feature into 9.3, rather than trying to implement
everything and then having it pushed back a release.

So +1.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Mailing Lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic background workers
Date: 2013-06-17 22:46:13
Message-ID: CAFNqd5UBrZDv9ko532aH3=keY58MmkaM878HLc_n012yRjCHvw@mail.gmail.com
Lists: pgsql-hackers

BTW, one of the ideas that popped up in the unConference session on
replication was "why couldn't we use a background worker as a replication
agent?"

The main reason pointed out was 'because that means you have to restart
the postmaster to add a replication agent' (e.g. like a Slony "slon"
process).

There may well be other, better reasons not to do so, but it would be
nice to eliminate this one. It seems seriously limiting to the bgworker
concept for workers to be thus restricted.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-06-18 02:45:07
Message-ID: 1371523507.13762.9.camel@vanquo.pezone.net
Lists: pgsql-hackers

On Fri, 2013-06-14 at 17:00 -0400, Robert Haas wrote:
> Alvaro's work on 9.3, we now have the ability to configure background
> workers via shared_preload_libraries. But if you don't have the right
> library loaded at startup time, and subsequently wish to add a
> background worker while the server is running, you are out of luck.

We could tweak shared_preload_libraries so that it reacts sensibly to
reloads. I basically gave up on that by writing
session_preload_libraries, but if there is more general use for that, we
could try.

(That doesn't invalidate your work, but it's a thought.)


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic background workers
Date: 2013-06-18 07:53:50
Message-ID: CAB7nPqSRBSQOcQfneWwnd5PGeK8t+m6tsPy04quL=9V_5cCRcQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Sat, Jun 15, 2013 at 6:00 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The first patch, max-worker-processes-v1.patch, adds a new GUC
> max_worker_processes, which defaults to 8. This fixes the problem
> discussed here:
>
> http://www.postgresql.org/message-id/CA+TgmobguVO+qHnHvxBA2TFkDhw67Y=4Bp405FVABEc_EtO4VQ@mail.gmail.com
>
> Apart from fixing that problem, it's a pretty boring patch.
I just had a look at the first patch, which is pretty simple, before
looking in detail at the 2nd patch.

Here are some minor comments:
1) Correction of postgresql.conf.sample
Putting the new parameter in the resource-usage section is appropriate,
but why not add a new sub-section of this type, with comments like
below?
# - Background workers -
#max_worker_processes = 8		# Maximum number of background worker subprocesses
					# (change requires restart)
2) Perhaps it would be better to specify in the docs that if the number
of bgworkers the server tries to start is higher than
max_worker_processes, startup of the extra bgworkers will fail but the
server will continue running as if nothing had happened. This is
something users should be made aware of.
3) In InitProcGlobal:proc.c, when assigning new slots in PGPROC,
wouldn't it be more consistent to do this:
else if (i < MaxConnections + autovacuum_max_workers + max_worker_processes + 1)
{
	/* PGPROC for bgworker, add to bgworkerFreeProcs list */
	procs[i].links.next = (SHM_QUEUE *) ProcGlobal->bgworkerFreeProcs;
	ProcGlobal->bgworkerFreeProcs = &procs[i];
}
instead of this?
else if (i < MaxBackends)
{
	/* PGPROC for bgworker, add to bgworkerFreeProcs list */
	procs[i].links.next = (SHM_QUEUE *) ProcGlobal->bgworkerFreeProcs;
	ProcGlobal->bgworkerFreeProcs = &procs[i];
}

I have also done many tests with worker_spi and some home-made
bgworkers, and the patch works as expected: the extra bgworkers are not
started once the maximum number set is reached.
I'll try to look at the other patch soon, but I think the real
discussion on this topic is just beginning... Btw, IMHO, this first
patch can safely be committed, as it would give us a nice base for
future discussions/reviews.
Regards,
--
Michael


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-06-20 13:53:36
Message-ID: CA+TgmoYsNCxkUBnyWpv1086t+Ea9z+yaLN0gsi6vAa7NyjmjsQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 17, 2013 at 10:45 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On Fri, 2013-06-14 at 17:00 -0400, Robert Haas wrote:
>> Alvaro's work on 9.3, we now have the ability to configure background
>> workers via shared_preload_libraries. But if you don't have the right
>> library loaded at startup time, and subsequently wish to add a
>> background worker while the server is running, you are out of luck.
>
> We could tweak shared_preload_libraries so that it reacts sensibly to
> reloads. I basically gave up on that by writing
> session_preload_libraries, but if there is more general use for that, we
> could try.
>
> (That doesn't invalidate your work, but it's a thought.)

Yeah, I thought about that. But it doesn't seem possible to do
anything all that sane. You can't unload libraries if they've been
removed; you can potentially load new ones if they've been added. But
that's a bit confusing, if the config file says that what's loaded is
bar, but what's actually loaded is foo, bar, baz, bletch, and quux.

Some variant of this might still be worth doing, but figuring out the
details sounded like more than I wanted to get into, so I punted. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-06-20 13:57:03
Message-ID: 51C30A2F.2090203@bluegap.ch
Lists: pgsql-hackers

Robert,

On 06/14/2013 11:00 PM, Robert Haas wrote:
> Parallel query, or any subset of that project such as parallel sort,
> will require a way to start background workers on demand.

thanks for continuing this, very much appreciated. Postgres-R, and thus
TransLattice, has been using a similar approach successfully for years now.

I have only had a quick glance over the patch so far. Some comments on
the design:

> This requires some communication channel from ordinary workers to the
> postmaster, because it is the postmaster that must ultimately start
> the newly-registered workers. However, that communication channel has
> to be designed pretty carefully, lest a shared memory corruption take
> out the postmaster and lead to inadvertent failure to restart after a
> crash. Here's how I implemented that: there's an array in shared
> memory of a size equal to max_worker_processes. This array is
> separate from the backend-private list of workers maintained by the
> postmaster, but the two are kept in sync. When a new background
> worker registration is added to the shared data structure, the backend
> adding it uses the existing pmsignal mechanism to kick the postmaster,
> which then scans the array for new registrations.

That sounds like a good simplification. Even if it's an O(n) operation,
the n in question here has relatively low practical limits. It's
unlikely to be much of a concern, I guess.

Back then, I solved it by having a "fork request slot". After starting,
the bgworker then had to clear that slot and register with a coordinator
process (i.e. the original requestor), so that one learned its fork
request was successful. At some point I expanded that to multiple
request slots to better handle multiple concurrent fork requests.
However, it was difficult to get right and requires more IPC than your
approach.

On the pro side: the shared memory area used by the postmaster was very
small and read-only to the postmaster. Those were my main goals, though
I'm not sure they were the best ones, now that I read your concept.

> I have attempted to
> make the code that transfers the shared memory state into the
> postmaster's private state as paranoid as humanly possible. The
> precautions taken are documented in the comments. Conversely, when a
> background worker flagged as BGW_NEVER_RESTART is considered for
> restart (and we decide against it), the corresponding slot in the
> shared memory array is marked as no longer in use, allowing it to be
> reused for a new registration.

Sounds like the postmaster is writing to shared memory. Not sure why
I've been trying so hard to avoid that, though. After all, it can hardly
hurt itself *writing* to shared memory.

> Since the postmaster cannot take locks, synchronization between the
> postmaster and other backends using the shared memory segment has to
> be lockless. This mechanism is also documented in the comments. An
> lwlock is used to prevent two backends that are both registering a new
> worker at about the same time from stomping on each other, but the
> postmaster need not care about that lwlock.
>
> This patch also extends worker_spi as a demonstration of the new
> interface. With this patch, you can CREATE EXTENSION worker_spi and
> then call worker_spi_launch(int4) to launch a new background worker,
> or combine it with generate_series() to launch a bunch at once. Then
> you can kill them off with pg_terminate_backend() and start some new
> ones. That, in my humble opinion, is pretty cool.

It definitely is. Thanks again.

Regards

Markus Wanner


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-06-20 14:41:01
Message-ID: CA+TgmoZj=S3E05jPodMiNjddOoLBN5Bzfyy2kk7jD+ZEMBQZ_w@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 20, 2013 at 9:57 AM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
> That sounds like a good simplification. Even if it's an O(n) operation,
> the n in question here has relatively low practical limits. It's
> unlikely to be much of a concern, I guess.

The constant factor is also very small. Generally, I would expect
num_worker_processes <~ # CPUs, and scanning a 32, 64, or even 128
element array is not a terribly time-consuming operation. We might
need to re-think this when systems with 4096 processors become
commonplace, but considering how many other things would also need to
be fixed to work well in that universe, I'm not too concerned about it
just yet.

One thing I think we probably want to explore in the future, for both
worker backends and regular backends, is pre-forking. We could avoid
some of the latency associated with starting up a new backend or
opening a new connection in that way. However, there are quite a few
details to be thought through there, so I'm not eager to pursue that
just yet. Once we have enough infrastructure to implement meaningful
parallelism, we can benchmark it and find out where the bottlenecks
are, and which solutions actually help most.

> Back then, I solved it by having a "fork request slot". After starting,
> the bgworker then had to clear that slot and register with a coordinator
> process (i.e. the original requestor), so that one learned its fork
> request was successful. At some point I expanded that to multiple
> request slots to better handle multiple concurrent fork requests.
> However, it was difficult to get right and requires more IPC than your
> approach.

I do think we need a mechanism to allow the backend that requested the
bgworker to know whether or not the bgworker got started, and whether
it unexpectedly died. Once we get to the point of calling user code
within the bgworker process, it can use any number of existing
mechanisms to make sure that it won't die without notifying the
backend that started it (short of a PANIC, in which case it won't
matter anyway). But we need a way to report failures that happen
before that point. I have some ideas about that, but decided to leave
them for future passes. The remit of this patch is just to make it
possible to dynamically register bgworkers. Allowing a bgworker to be
"tied" to the session that requested it via some sort of feedback loop
is a separate project - which I intend to tackle before CF2, assuming
this gets committed (and so far nobody is objecting to that).

>> I have attempted to
>> make the code that transfers the shared memory state into the
>> postmaster's private state as paranoid as humanly possible. The
>> precautions taken are documented in the comments. Conversely, when a
>> background worker flagged as BGW_NEVER_RESTART is considered for
>> restart (and we decide against it), the corresponding slot in the
>> shared memory array is marked as no longer in use, allowing it to be
>> reused for a new registration.
>
> Sounds like the postmaster is writing to shared memory. Not sure why
> I've been trying so hard to avoid that, though. After all, it can hardly
> hurt itself *writing* to shared memory.

I think there's ample room for paranoia about postmaster interaction
with shared memory, but all it's doing is setting a flag, which is no
different from what CheckPostmasterSignal() already does.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Markus Wanner <markus(at)bluegap(dot)ch>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-06-20 14:59:33
Message-ID: 51C318D5.7010903@bluegap.ch
Lists: pgsql-hackers

On 06/20/2013 04:41 PM, Robert Haas wrote:
> The constant factor is also very small. Generally, I would expect
> num_worker_processes <~ # CPUs

That assumption might hold for parallel querying, yes. In case of
Postgres-R, it doesn't. In the worst case, i.e. with a 100% write load,
a cluster of n nodes, each with m backends performing transactions, all
of them replicated to all other (n-1) nodes, you end up with ((n-1) * m)
bgworkers. Which is pretty likely to be way above the # CPUs on any
single node.

I can imagine other extensions or integral features like autonomous
transactions that might possibly want many more bgworkers as well.

> and scanning a 32, 64, or even 128
> element array is not a terribly time-consuming operation.

I'd extend that to say scanning an array with a few thousand elements is
not terribly time-consuming, either. IMO the simplicity is worth it,
ATM. It's all relative to your definition of ... eh ... "terribly".

.oO( ... premature optimization ... all evil ... )

> We might
> need to re-think this when systems with 4096 processors become
> commonplace, but considering how many other things would also need to
> be fixed to work well in that universe, I'm not too concerned about it
> just yet.

Agreed.

> One thing I think we probably want to explore in the future, for both
> worker backends and regular backends, is pre-forking. We could avoid
> some of the latency associated with starting up a new backend or
> opening a new connection in that way. However, there are quite a few
> details to be thought through there, so I'm not eager to pursue that
> just yet. Once we have enough infrastructure to implement meaningful
> parallelism, we can benchmark it and find out where the bottlenecks
> are, and which solutions actually help most.

Do you mean pre-forking and connecting to a specific database? Or really
just the forking?

> I do think we need a mechanism to allow the backend that requested the
> bgworker to know whether or not the bgworker got started, and whether
> it unexpectedly died. Once we get to the point of calling user code
> within the bgworker process, it can use any number of existing
> mechanisms to make sure that it won't die without notifying the
> backend that started it (short of a PANIC, in which case it won't
> matter anyway). But we need a way to report failures that happen
> before that point. I have some ideas about that, but decided to leave
> them for future passes. The remit of this patch is just to make it
> possible to dynamically register bgworkers. Allowing a bgworker to be
> "tied" to the session that requested it via some sort of feedback loop
> is a separate project - which I intend to tackle before CF2, assuming
> this gets committed (and so far nobody is objecting to that).

Okay, sounds good. Given my background, I considered that a solved
problem. Thanks for pointing it out.

>> Sounds like the postmaster is writing to shared memory. Not sure why
>> I've been trying so hard to avoid that, though. After all, it can hardly
>> hurt itself *writing* to shared memory.
>
> I think there's ample room for paranoia about postmaster interaction
> with shared memory, but all it's doing is setting a flag, which is no
> different from what CheckPostmasterSignal() already does.

Sounds good to me.

Regards

Markus Wanner


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-06-20 15:29:27
Message-ID: CA+TgmoZ73afXXADphkvD0jgwDj7A8+RsyPxr4KNXLW--qonoxA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 20, 2013 at 10:59 AM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
> On 06/20/2013 04:41 PM, Robert Haas wrote:
>> The constant factor is also very small. Generally, I would expect
>> num_worker_processes <~ # CPUs
>
> That assumption might hold for parallel querying, yes. In case of
> Postgres-R, it doesn't. In the worst case, i.e. with a 100% write load,
> a cluster of n nodes, each with m backends performing transactions, all
> of them replicated to all other (n-1) nodes, you end up with ((n-1) * m)
> bgworkers. Which is pretty likely to be way above the # CPUs on any
> single node.
>
> I can imagine other extensions or integral features like autonomous
> transactions that might possibly want many more bgworkers as well.

Yeah, maybe. I think in general it's not going to work great to have
zillions of backends floating around, because eventually the OS
scheduler overhead - and the memory overhead - are going to become
pain points. And I'm hopeful that autonomous transactions can be
implemented without needing to start a new backend for each one,
because that sounds pretty expensive. Some users of other database
products will expect autonomous transactions to be cheap; aside from
that, cheap is better than expensive. But we will see. At any rate I
think your basic point is that people might end up creating a lot more
background workers than I'm imagining, which is certainly a fair
point.

>> and scanning a 32, 64, or even 128
>> element array is not a terribly time-consuming operation.
>
> I'd extend that to say scanning an array with a few thousand elements is
> not terribly time-consuming, either. IMO the simplicity is worth it,
> ATM. It's all relative to your definition of ... eh ... "terribly".
>
> .oO( ... premature optimization ... all evil ... )

Yeah, that thing.

>> One thing I think we probably want to explore in the future, for both
>> worker backends and regular backends, is pre-forking. We could avoid
>> some of the latency associated with starting up a new backend or
>> opening a new connection in that way. However, there are quite a few
>> details to be thought through there, so I'm not eager to pursue that
>> just yet. Once we have enough infrastructure to implement meaningful
>> parallelism, we can benchmark it and find out where the bottlenecks
>> are, and which solutions actually help most.
>
> Do you mean pre-forking and connecting to a specific database? Or really
> just the forking?

I've considered both at various times, although in this context I was
mostly thinking about just the forking. Pre-connecting to a specific
database would save an unknown but possibly significant amount of
additional latency. Against that, it's more complex (because we've
got to track which preforked workers are associated with which
databases) and there's some cost to guessing wrong (because then we're
keeping workers around that we can't use, or maybe even having to turn
around and kill them to make slots for the workers we actually need).
I suspect we'll want to pursue the idea at some point but it's not
near the top of my list.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Markus Wanner <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-06-20 15:45:26
Message-ID: 20130620154526.GE16659@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-06-20 11:29:27 -0400, Robert Haas wrote:
> > Do you mean pre-forking and connecting to a specific database? Or really
> > just the forking?
>
> I've considered both at various times, although in this context I was
> mostly thinking about just the forking. Pre-connecting to a specific
> database would save an unknown but possibly significant amount of
> additional latency. Against that, it's more complex (because we've
> got to track which preforked workers are associated with which
> databases) and there's some cost to guessing wrong (because then we're
> keeping workers around that we can't use, or maybe even having to turn
> around and kill them to make slots for the workers we actually need).
> I suspect we'll want to pursue the idea at some point but it's not
> near the top of my list.

Just as a datapoint, if you benchmark the number of forks that can be
performed by a single process (i.e. the postmaster), the number is easily
in the tens of thousands. Now forking that much has some scalability
implications inside the kernel, but still.
I'd be surprised if the actual fork is more than 5-10% of the current
cost of starting a new backend.
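
(The measurement is easy to reproduce; a trivial standalone sketch,
nothing postgres-specific, along these lines:)

/* fork-rate micro-benchmark: fork children that exit immediately */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int
main(void)
{
	const int	n = 100000;
	time_t		start = time(NULL);

	for (int i = 0; i < n; i++)
	{
		pid_t	pid = fork();

		if (pid == 0)
			_exit(0);			/* child exits at once */
		else if (pid < 0)
		{
			perror("fork");
			exit(1);
		}
		waitpid(pid, NULL, 0);	/* reap before the next fork */
	}
	printf("%d forks in %ld seconds\n", n, (long) (time(NULL) - start));
	return 0;
}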

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic background workers
Date: 2013-06-24 07:51:55
Message-ID: CAB7nPqRtXgNt26wV6na5vRULc4Mq9wnqnLDuMRntvW8jEpPMzw@mail.gmail.com
Lists: pgsql-hackers

Hi,

Please find some review for the 2nd patch, with the 1st patch applied
on top of it.

On Sat, Jun 15, 2013 at 6:00 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The second patch, dynamic-bgworkers-v1.patch, revises the background
> worker API to allow background workers to be started dynamically.
> This requires some communication channel from ordinary workers to the
> postmaster, because it is the postmaster that must ultimately start
> the newly-registered workers. However, that communication channel has
> to be designed pretty carefully, lest a shared memory corruption take
> out the postmaster and lead to inadvertent failure to restart after a
> crash. Here's how I implemented that: there's an array in shared
> memory of a size equal to max_worker_processes. This array is
> separate from the backend-private list of workers maintained by the
> postmaster, but the two are kept in sync. When a new background
> worker registration is added to the shared data structure, the backend
> adding it uses the existing pmsignal mechanism to kick the postmaster,
> which then scans the array for new registrations. I have attempted to
> make the code that transfers the shared memory state into the
> postmaster's private state as paranoid as humanly possible. The
> precautions taken are documented in the comments. Conversely, when a
> background worker flagged as BGW_NEVER_RESTART is considered for
> restart (and we decide against it), the corresponding slot in the
> shared memory array is marked as no longer in use, allowing it to be
> reused for a new registration.
>
> Since the postmaster cannot take locks, synchronization between the
> postmaster and other backends using the shared memory segment has to
> be lockless. This mechanism is also documented in the comments. An
> lwlock is used to prevent two backends that are both registering a new
> worker at about the same time from stomping on each other, but the
> postmaster need not care about that lwlock.
>
> This patch also extends worker_spi as a demonstration of the new
> interface. With this patch, you can CREATE EXTENSION worker_spi and
> then call worker_spi_launch(int4) to launch a new background worker,
> or combine it with generate_series() to launch a bunch at once. Then
> you can kill them off with pg_terminate_backend() and start some new
> ones. That, in my humble opinion, is pretty cool.

The patch applies cleanly, though I found a couple of whitespace issues:
/home/ioltas/download/dynamic-bgworkers-v1.patch:452: space before tab in indent.
	slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
/home/ioltas/download/dynamic-bgworkers-v1.patch:474: space before tab in indent.
	slot = &BackgroundWorkerData->slot[slotno];
/home/ioltas/download/dynamic-bgworkers-v1.patch:639: trailing whitespace.
	success = true;
warning: 3 lines add whitespace errors.

The code compiles, has no warnings, and make check passes.

Then, here are some impressions after reading the code. It is good to
see that all the bgworker APIs have been moved under the same banner,
bgworker.c.
1) Just for clarity, I think that this code in worker_spi.c deserves a
comment mentioning that this code path cannot be taken for a bgworker
not loaded via shared_preload_libraries:
+
+	if (!process_shared_preload_libraries_in_progress)
+		return;
+
2) s/NUL-terminated/NULL-terminated @ bgworker.c
3) Why not add another function in worker_spi.c that is the opposite of
worker_spi_launch, to stop a dynamic bgworker with a given index
number? This would only be a wrapper around pg_terminate_backend, OK,
but at least it would give the user all the information needed to start
and to stop a dynamic bgworker with a single extension, here
worker_spi.c. It can be painful to stop
4) Not completely related to this patch, but one sanity check in
SanityCheckBackgroundWorker:bgworker.c is not listed in the
documentation: when requesting a database connection, a bgworker needs
to have access to shmem. It looks like this should also be fixed in
REL9_3_STABLE.
5) Why not add some documentation? Both dynamic and static bgworkers
share the same infrastructure, so some lines in the existing chapter
might be fine?
6) Just wondering: it looks like the current code is not able to
identify the way used to start a given bgworker. Would it be
interesting to be able to identify whether a bgworker has been
registered through RegisterBackgroundWorker or
RegisterDynamicBackgroundWorker?

I have also done some tests, and the infrastructure is working nicely.
The workers started dynamically are able to receive SIGHUP and
SIGTERM, and workers are not started if the maximum number of
authorized bgworkers is reached. It is really a nice feature!

Also, I will try to do some more tests to check the robustness of the
slist and the protocol used.

Regards,
--
Michael


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic background workers
Date: 2013-07-03 14:19:49
Message-ID: CA+TgmoZSXohQHxRXSh3QbbD49UOe+foY2Fg4N9EkWiR092EMMA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 24, 2013 at 3:51 AM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> 3) Why not add another function in worker_spi.c that is the opposite of
> worker_spi_launch, to stop a dynamic bgworker with a given index
> number? This would only be a wrapper around pg_terminate_backend, OK,
> but at least it would give the user all the information needed to start
> and to stop a dynamic bgworker with a single extension, here
> worker_spi.c. It can be painful to stop

Well, there's currently no mechanism for the person who starts a new
backend to know the PID of the process that actually got started. I
plan to write a patch to address that problem, but it's not this
patch.

> 4) Not completely related to this patch, but one sanity check in
> SanityCheckBackgroundWorker:bgworker.c is not listed in the
> documentation: when requesting a database connection, a bgworker needs
> to have access to shmem. It looks like this should also be fixed in
> REL9_3_STABLE.

That's fine; I think it's separate from this patch. Please feel free
to propose something.

> 5) Why not add some documentation? Both dynamic and static bgworkers
> share the same infrastructure, so some lines in the existing chapter
> might be fine?

I'll take a look.

> 6) Just wondering: it looks like the current code is not able to
> identify the way used to start a given bgworker. Would it be
> interesting to be able to identify whether a bgworker has been
> registered through RegisterBackgroundWorker or
> RegisterDynamicBackgroundWorker?

I don't think that's a good thing to expose.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-07-03 15:15:44
Message-ID: 20130703151544.GB3592@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Andres Freund wrote:

> Just as a datapoint, if you benchmark the number of forks that can be
> performed by a single process (i.e. the postmaster), the number is easily
> in the tens of thousands. Now forking that much has some scalability
> implications inside the kernel, but still.
> I'd be surprised if the actual fork is more than 5-10% of the current
> cost of starting a new backend.

I played with having some thousands of registered bgworkers on my laptop,
and there wasn't even that much load. So yeah, you can have lots of
forks.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic background workers
Date: 2013-07-04 00:10:29
Message-ID: CAB7nPqSrJsgLa_gZjSkx9xw8sX2+T9yhB=hb1QhBfWeOBRr+6Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 3, 2013 at 11:19 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Jun 24, 2013 at 3:51 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> 3) Why not add another function in worker_spi.c that is the opposite of
>> worker_spi_launch, to stop a dynamic bgworker with a given index
>> number? This would only be a wrapper around pg_terminate_backend, OK,
>> but at least it would give the user all the information needed to start
>> and to stop a dynamic bgworker with a single extension, here
>> worker_spi.c. It can be painful to stop
>
> Well, there's currently no mechanism for the person who starts a new
> backend to know the PID of the process that actually got started. I
> plan to write a patch to address that problem, but it's not this
> patch.
OK. Understood, this functionality would be a good addition to have.

>> 4) Not completely related to this patch, but one sanity check in
>> SanityCheckBackgroundWorker:bgworker.c is not listed in the
>> documentation: when requesting a database connection, a bgworker needs
>> to have access to shmem. It looks like this should also be fixed in
>> REL9_3_STABLE.
>
> That's fine; I think it's separate from this patch. Please feel free
> to propose something.
I'll send a patch about that.

>> 6) Just wondering: it looks like the current code is not able to
>> identify the way used to start a given bgworker. Would it be
>> interesting to be able to identify whether a bgworker has been
>> registered through RegisterBackgroundWorker or
>> RegisterDynamicBackgroundWorker?
>
> I don't think that's a good thing to expose.
My concerns here are covered by the functionality you propose above: a
user who launches a custom bgworker would know its PID, which would
allow users to keep track of which bgworkers have been started
dynamically.

Regards,
--
Michael


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Markus Wanner <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: dynamic background workers
Date: 2013-07-16 17:09:40
Message-ID: CA+TgmoaPNPX6XXedOroj7CgB6FGxKNc0j4Pu5bOpU20WK90sZA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 3, 2013 at 11:15 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
Andres Freund wrote:
>> Just as a datapoint, if you benchmark the number of forks that can be
>> performed by a single process (i.e. the postmaster), the number is easily
>> in the tens of thousands. Now forking that much has some scalability
>> implications inside the kernel, but still.
>> I'd be surprised if the actual fork is more than 5-10% of the current
>> cost of starting a new backend.
>
> I played with having some thousands of registered bgworkers on my laptop,
> and there wasn't even that much load. So yeah, you can have lots of
> forks.

Since no one seems to be objecting to this patch beyond the lack of
documentation, I've added documentation and committed it, with
appropriate rebasing and a few minor cleanups. One loose end is
around the bgw_sighup and bgw_sigterm structure members. If you're
registering a background worker for a library that is not loaded in
the postmaster, you can't (safely) use these for anything, because
it's possible (though maybe not likely) for the worker process to map
the shared library at a different address than where they are mapped
in the backend that requests the new process to be started. However,
that doesn't really matter; AFAICS, you can just as well call pqsignal
to set the handlers to anything you want from the main entrypoint
before unblocking signals. So I'm inclined to say we should just
remove bgw_sighup and bgw_sigterm altogether and tell people to do it
that way.

Alternatively, we could give them the same treatment that I gave
bgw_main: let the user specify a function name and we'll search the
appropriate DSO for it. But that's probably less convenient for
anyone using this facility than just calling pqsignal() before
unblocking signals, so I don't see any real reason to go that route.
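
To spell out that workaround, a sketch (the flag and handler here are
examples, not part of the proposed API):

#include "postgres.h"
#include <signal.h>
#include "postmaster/bgworker.h"
#include "storage/ipc.h"		/* proc_exit */

static volatile sig_atomic_t got_sigterm = false;

static void
my_sigterm_handler(SIGNAL_ARGS)
{
	got_sigterm = true;
}

void
my_worker_main(Datum main_arg)
{
	/* no need for bgw_sighup/bgw_sigterm: set handlers ourselves */
	pqsignal(SIGTERM, my_sigterm_handler);
	pqsignal(SIGHUP, SIG_IGN);

	/* signals are blocked at entry; they arrive only after this call */
	BackgroundWorkerUnblockSignals();

	while (!got_sigterm)
	{
		/* ... do one unit of work, wait on a latch, etc. ... */
	}
	proc_exit(0);
}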

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company