Re: Scheduler in Postgres

From: Marco Colombo <pgsql(at)esiway(dot)net>
To: Christopher Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Scheduler in Postgres
Date: 2004-12-17 09:50:58
Message-ID: Pine.LNX.4.61.0412170950390.26177@Megathlon.ESI
Lists: pgsql-general

On Thu, 16 Dec 2004, Christopher Browne wrote:

> A "cron implementation using PostgreSQL as data store" would have a
> wonderfully natural place to record log information in a usefully
> structured fashion.
>
> When a job runs, it would be a splendid idea to record such things as:
> - Job ID (perhaps an OID, or some other candidate primary key)
> - PID
> - Start time
> - End time
> - Exit code
>
> Given all of the above, a job might look at the logs and
> self-terminate if there's another instance still running from last
> hour.
>
> Jobs that are supposed to be mutually exclusive could detect as much.
>
> You could _attempt_ to run a job every hour, and have it decide "Oh,
> I've already run successfully in the last [interval], so I'll not
> bother."
>
> None of this means forcing it into the database implementation; it
> just means that it would be useful. "pgcron" sounds like an utterly
> splendid idea.
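
To make the quoted idea concrete, here is a minimal Python sketch of the log-driven decisions it describes: skip a run if a previous instance is still going, or if the job already succeeded within the last interval. The record layout follows the fields listed in the quote; the names JobRun and should_run are hypothetical, and a plain in-memory list stands in for the log table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class JobRun:
    """One row of a hypothetical job log (field names are assumptions)."""
    job_id: int
    pid: int
    start_time: datetime
    end_time: Optional[datetime] = None  # None while the job is still running
    exit_code: Optional[int] = None      # None while the job is still running


def should_run(log: List[JobRun], job_id: int, now: datetime,
               interval: timedelta) -> bool:
    """Return False if the job is still running or succeeded within `interval`."""
    for run in log:
        if run.job_id != job_id:
            continue
        if run.end_time is None:
            return False  # a previous instance has not finished yet
        if run.exit_code == 0 and now - run.end_time < interval:
            return False  # already ran successfully recently enough
    return True
```

A job wrapper would call should_run() before doing any work and append a JobRun row when it starts. Mutual exclusion between different jobs could be checked the same way, by looking for unfinished rows of the conflicting job IDs.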

Is the Oracle one _just_ that? A cron/at replacement? What about porting
every UNIX utility to the DB engine (that would be a cross-platform Unix
- wow)?
Why don't they put web and application server functionality (apache and
PHP) in the DB? No, wait... ehm... :-)

Seriously, such an application (scheduler) _will_ have to deal with OS
differences. Interesting things to log about the spawned jobs will be
different. The way you run them (I don't mean the actual system call,
think of nice level instead) may be different.

Now, the idea of different frontends to a DB-based backend may sound
cool, but either the backend supports only the minimum set of features
the various OSes offer when it comes to running jobs, or it implements
some OS-dependent features conditionally, which exposes to the applications
exactly what it is supposed to hide. Overall I'm not impressed.

I wonder if limiting the application domain to DB-related jobs only
would help. I mean, it is quite common to run time-based procedures
at DB level, like report generation or table summarization. Usually,
these activities are driven by _external_ schedulers (cron), via
scripts that need to connect and _authenticate_, which leads to
security nightmares.

An in-core scheduler, which runs in-core procedures only (if you want
to write procedures that invoke external programs go ahead, but it's
outside the scope of the scheduler), might be a good idea.
User X may schedule a query to be executed at a given time. I can see
some advantages:
- security: no need for authentication. The query was added to the
  scheduler by an authenticated client and will run under the same
  permissions (superusers will be able to alter the schedule of all
  users of course);
- portability: no need for external tools. Note that the details
  of running an external process don't matter here. Moreover, we need
  only a timer from the OS, no other features. So either the OS provides
  it (most do) or not, in which case the scheduler will be disabled
  (and requests for adding jobs will fail gracefully);
- simplicity: it will log the query, start time, end time, success/failure
  (maybe _what_ error if available). No PIDs, no OS-level details.
  If you need more extensive logging, implement your own in your function;
- availability: the scheduler will be available to any client, in a standard
  way. Right now, either you run a cron job on the server (if you have
  access), or on the client (though you may need 24/7 uptime), or
  you code some hack in the middleware (think of PHP-based apps - they need
  to piggyback procedures on web requests).
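
A rough sketch of what such a scheduler would record, written in Python with an in-memory SQLite database standing in for the DB engine (an assumption made only to keep the example self-contained; nothing here is PostgreSQL code). The table and function names (sched_job, sched_log, schedule, run_due_jobs) are hypothetical. The owner column only records who queued the job; running under that user's permissions is not modeled.

```python
import sqlite3
import time


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the hypothetical job queue and log tables."""
    conn.executescript("""
        CREATE TABLE sched_job (
            id INTEGER PRIMARY KEY, owner TEXT, query TEXT,
            run_at REAL, done INTEGER DEFAULT 0);
        CREATE TABLE sched_log (
            job_id INTEGER, query TEXT, start_time REAL, end_time REAL,
            success INTEGER, error TEXT);
    """)


def schedule(conn: sqlite3.Connection, owner: str, query: str,
             run_at: float) -> int:
    """Queue `query` to run at Unix time `run_at`; return the job id."""
    cur = conn.execute(
        "INSERT INTO sched_job (owner, query, run_at) VALUES (?, ?, ?)",
        (owner, query, run_at))
    conn.commit()
    return cur.lastrowid


def run_due_jobs(conn: sqlite3.Connection, now: float) -> int:
    """Run every pending job with run_at <= now and log the outcome."""
    due = conn.execute(
        "SELECT id, query FROM sched_job WHERE done = 0 AND run_at <= ? "
        "ORDER BY run_at, id", (now,)).fetchall()
    for job_id, query in due:
        start, error = time.time(), None
        try:
            conn.execute(query)
        except sqlite3.Error as exc:
            error = str(exc)  # record _what_ error, if available
        conn.execute(
            "INSERT INTO sched_log VALUES (?, ?, ?, ?, ?, ?)",
            (job_id, query, start, time.time(), int(error is None), error))
        conn.execute("UPDATE sched_job SET done = 1 WHERE id = ?", (job_id,))
    conn.commit()
    return len(due)
```

The only thing needed from the OS is a timer that calls run_due_jobs(conn, time.time()) periodically. The log holds the query, start time, end time and success/failure, with no PIDs or other OS-level details.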

Final note: most of us, I believe, do well with cron and friends, or
have already found a different solution. There's no _need_ for an in-core
scheduler. But, if such a feature existed, I think many of us would use it.
And be happier than now. :-)

.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo(at)ESI(dot)it
