Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)
Date: 2012-11-16 11:01:20
Message-ID: CADyhKSUC=TZ-anRfH_SBw=FK=eUVxmycszxZ54J6w0h78m68pw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2012/10/22 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:
> Here's an updated version of this patch, which also works in
> an EXEC_BACKEND environment. (I haven't tested this at all on Windows,
> but I don't see anything that would create a portability problem there.)
>
I also tried to check the latest patch "briefly".
Let me comment on several points randomly.

Once bgworker process got crashed, postmaster tries to restart it about 60
seconds later. My preference is this interval being configurable with initial
registration parameter; that includes "never restart again" option.
Probably, some extensions don't want to restart again on unexpected crash.

Regarding to process restart, this interface allows to specify the timing to
start bgworker using BgWorkerStart_* label. On the other hand,
bgworker_should_start_now() checks correspondence between pmState
and the configured start-time using equal matching. It can make such a
scenarios, for instance, a bgworker launched at PM_INIT state, then it got
crashed. Author expected it shall be restarted, however, pmState was
already progressed to PM_RUN, so nobody can restart this bgworker again.
How about an idea to provide the start time with bitmap? It allows extensions
to specify multiple candidates to launch bgworker.

Stop is one other significant event for bgworker, not only start-up.
This interface has no way to specify when we want to stop bgworker.
Probably, we can offer two options. Currently, bgworkers are simultaneously
terminated with other regular backends. In addition, I want an option to make
sure bgworker being terminated after all the regular backends exit.
It is critical issue for me, because I try to implement parallel
calculation server
with bgworker, thus, it should not be terminated earlier than regular backend.

do_start_bgworker() initializes process state according to normal manner,
then it invokes worker->bgw_main(). Once ERROR is raised inside of the
main routine, the control is backed to the sigsetjmp() on do_start_bgworker(),
then the bgworker is terminated.
I'm not sure whether it is suitable for bgworker that tries to connect database,
because it may raise an error everywhere, such as zero division and so on...
I think it is helpful to provide a utility function in the core; that
handles to abort
transactions in progress, clean-up resources, and so on.
spi_worker invokes BackgroundWorkerInitializeConnection() at the head of
main routine. In a similar fashion, is it available to support a
utility function
that initialize the process as transaction-aware background-worker?
I don't think it is a good manner each extension has to implement its own
sigsetjmp() block and rollback logic. I'd also like to investigate the necessary
logic for transaction abort and resource cleanup here....

Some other misc comments below.

It ensures bgworker must be registered within shared_preload_libraries.
Why not raise an error if someone try to register bgworker later?

How about to move bgw_name field into tail of BackgroundWorker structure?
It makes simplifies the logic in RegisterBackgroundWorker(), if it is
located as:
typedef struct BackgroundWorker
{
int bgw_flags;
BgWorkerStartTime bgw_start_time;
bgworker_main_type bgw_main;
void *bgw_main_arg;
bgworker_sighdlr_type bgw_sighup;
bgworker_sighdlr_type bgw_sigterm;
char bgw_name[1]; <== (*)
} BackgroundWorker;

StartOneBackgroundWorker always scan the BackgroundWorkerList from
the head. Isn't it available to save the current position at static variable?
If someone tries to manage thousand of bgworkers, it makes a busy loop. :(

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit kapila 2012-11-16 11:22:14 Re: [PATCH] Patch to compute Max LSN of Data Pages
Previous Message Andres Freund 2012-11-16 10:58:04 Re: [PATCH 05/14] Add a new relmapper.c function RelationMapFilenodeToOid that acts as a reverse of RelationMapOidToFilenode