Re: dynamic background workers

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: dynamic background workers
Date: 2013-06-24 07:51:55
Message-ID: CAB7nPqRtXgNt26wV6na5vRULc4Mq9wnqnLDuMRntvW8jEpPMzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Please find some review for the 2nd patch, with the 1st patch applied
on top of it.

On Sat, Jun 15, 2013 at 6:00 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The second patch, dynamic-bgworkers-v1.patch, revises the background
> worker API to allow background workers to be started dynamically.
> This requires some communication channel from ordinary workers to the
> postmaster, because it is the postmaster that must ultimately start
> the newly-registered workers. However, that communication channel has
> to be designed pretty carefully, lest a shared memory corruption take
> out the postmaster and lead to inadvertent failure to restart after a
> crash. Here's how I implemented that: there's an array in shared
> memory of a size equal to max_worker_processes. This array is
> separate from the backend-private list of workers maintained by the
> postmaster, but the two are kept in sync. When a new background
> worker registration is added to the shared data structure, the backend
> adding it uses the existing pmsignal mechanism to kick the postmaster,
> which then scans the array for new registrations. I have attempted to
> make the code that transfers the shared_memory state into the
> postmaster's private state as paranoid as humanly possible. The
> precautions taken are documented in the comments. Conversely, when a
> background worker flagged as BGW_NEVER_RESTART is considered for
> restart (and we decide against it), the corresponding slot in the
> shared memory array is marked as no longer in use, allowing it to be
> reused for a new registration.
>
> Since the postmaster cannot take locks, synchronization between the
> postmaster and other backends using the shared memory segment has to
> be lockless. This mechanism is also documented in the comments. An
> lwlock is used to prevent two backends that are both registering a new
> worker at about the same time from stomping on each other, but the
> postmaster need not care about that lwlock.
>
> This patch also extends worker_spi as a demonstration of the new
> interface. With this patch, you can CREATE EXTENSION worker_spi and
> then call worker_spi_launch(int4) to launch a new background worker,
> or combine it with generate_series() to launch a bunch at once. Then
> you can kill them off with pg_terminate_backend() and start some new
> ones. That, in my humble opinion, is pretty cool.

The patch applies cleanly, I found a couple of whitespaces though:
/home/ioltas/download/dynamic-bgworkers-v1.patch:452: space before tab
in indent.
slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
/home/ioltas/download/dynamic-bgworkers-v1.patch:474: space before tab
in indent.
slot = &BackgroundWorkerData->slot[slotno];
/home/ioltas/download/dynamic-bgworkers-v1.patch:639: trailing whitespace.
success = true;
warning: 3 lines add whitespace errors.

The code compiles, has no warnings, and make check passes.

Then, here are some impressions after reading the code. It is good to
see that all the bgworker APIs are moved under the same banner
bgworker.c.
1) Just for clarity, I think that this code in worker_spi.c deserves a
comment mentioning that this code path cannot cannot be taken for a
bgworker not loaded via shared_preload_libraries.
+
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
2) s/NUL-terminated/NULL-terminated @ bgworker.c
3) Why not adding an other function in worker_spi.c being the opposite
of worker_spi_launch to stop dynamic bgworkers for a given index
number? This would be only a wrapper of pg_terminate_backend, OK, but
at least it would give the user all the information needed to start
and to stop a dynamic bgworker with a single extension, here
worker_spi.c. It can be painful to stop
4) Not completely related to this patch, but one sanity check in
SanityCheckBackgroundWorker:bgworker.c is not listed in the
documentation: when requesting a database connection, bgworker needs
to have access to shmem. It looks that this should be also fixed in
REL9_3_STABLE.
5) Why not adding some documentation? Both dynamic and static
bgworkers share the same infrastructure, so some lines in the existing
chapter might be fine?
6) Just wondering something: it looks that the current code is not
able to identify what was the way used to start a given bgworker.
Would it be interesting to be able to identify if a bgworker has been
registered though RegisterBackgroundWorker or
RegisterDynamicBackgroundWorker?

I have also done some tests, and the infrastructure is working nicely.
The workers started dynamically are able to receive SIGHUP and
SIGTERM. Workers are also not started if the maximum number of
bgworkers authorized is reached. It is really a nice feature!

Also, I will try to do some more tests to test the robustness of the
slist and the protocol used.

Regards,
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Cédric Villemain 2013-06-24 08:20:03 Re: [Review] Re: minor patch submission: CREATE CAST ... AS EXPLICIT
Previous Message Andres Freund 2013-06-24 06:41:29 Re: Clean switchover