Re: pg_background (and more parallelism infrastructure patches)

Lists: pgsql-hackers
From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_background (and more parallelism infrastructure patches)
Date: 2014-07-25 18:11:32
Message-ID: CA+Tgmoam66dTzCP8N2cRcS6S6dBMFX+JMba+mDf68H=KAkNjPQ@mail.gmail.com

Attached is a contrib module that lets you launch an arbitrary command
in a background worker, and supporting infrastructure patches for core.
You can launch queries and fetch the results back, much as you could
do with a dblink connection back to the local database but without the
hassles of dealing with authentication; and you can also run utility
commands, like VACUUM. For people who have always wanted to be able
to launch a vacuum (or an autonomous transaction, or a background
task) from a procedural language ... enjoy.

Here's an example of running vacuum and then fetching the results.
Notice that the notices from the original session are propagated to
our session; if an error had occurred, it would be re-thrown locally
when we try to read the results.

rhaas=# create table foo (a int);
CREATE TABLE
rhaas=# select pg_background_launch('vacuum verbose foo');
pg_background_launch
----------------------
51101
(1 row)

rhaas=# select * from pg_background_result(51101) as (x text);
INFO: vacuuming "public.foo"
INFO: "foo": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
x
--------
VACUUM
(1 row)

Here's an overview of the attached patches:

Patches 1 and 2 add a few new interfaces to the shm_mq and dsm APIs
that happen to be convenient for later patches in the series. I'm
pretty sure I could make all this work without these, but it would
take more code and be less efficient, so here they are.
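For readers who haven't looked at the shm_mq code, the point of shm_mq_sendv
is a gather-style send: a message assembled from several separate buffers goes
out as one logical message, with no intermediate copy. The following is a
self-contained illustration of that pattern using stand-in types (mq_iovec and
mock_sendv are mine for the sketch, not the real shm_mq API):

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for shm_mq_iovec: each element points at one chunk of a message. */
typedef struct
{
    const char *data;
    size_t      len;
} mq_iovec;

/*
 * Sketch of what a gather-style send ("sendv") buys you: the chunks live in
 * separate buffers but are written out as a single logical message, without
 * first copying them into one combined allocation.
 */
static size_t
mock_sendv(char *queue, size_t queue_size, const mq_iovec *iov, int iovcnt)
{
    size_t      written = 0;
    int         i;

    for (i = 0; i < iovcnt; i++)
    {
        if (written + iov[i].len > queue_size)
            return 0;           /* queue full: real code would block or fail */
        memcpy(queue + written, iov[i].data, iov[i].len);
        written += iov[i].len;
    }
    return written;
}
```

Patch 3 uses the real function in exactly this shape: iov[0] carries the
one-byte protocol message type and iov[1] the payload, so no scratch buffer
is needed to glue them together.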

Patch 3 adds the ability for a backend to request that the protocol
messages it would normally send to the frontend get redirected to a
shm_mq. I did this by adding a couple of hook functions. The best
design is definitely arguable here, so if you'd like to bikeshed, this
is probably the patch to look at. This patch also adds a function to
help you parse an ErrorResponse or NoticeResponse and re-throw the
error or notice in the originating backend. Obviously, parallelism is
going to need this kind of functionality, but I suspect a variety of
other applications people may develop using background workers may
want it too; and it's certainly important for pg_background itself.
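To make the redirection concrete, here is a minimal, hypothetical sketch of
the hook idea: the backend's single send path consults a function pointer,
and a background worker installs a queue-backed implementation at startup.
The names here (comm_send_hook, pq_putmessage_sketch, send_to_queue) are
illustrative, not the actual hook names in the patch:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signature for "send one protocol message". */
typedef int (*comm_send_hook) (char msgtype, const char *data, size_t len);

static comm_send_hook send_message = NULL;  /* NULL means "use the socket" */

/* Toy queue: capture the message instead of writing it to a socket. */
static char     captured[256];
static size_t   captured_len;

static int
send_to_queue(char msgtype, const char *data, size_t len)
{
    if (len + 1 > sizeof(captured))
        return -1;
    captured[0] = msgtype;
    memcpy(captured + 1, data, len);
    captured_len = len + 1;
    return 0;
}

/* The one shared send path the rest of the backend calls. */
static int
pq_putmessage_sketch(char msgtype, const char *data, size_t len)
{
    if (send_message != NULL)
        return send_message(msgtype, data, len);
    /* normally: write the message to the client socket here */
    return -1;
}
```

With the hook installed, every ErrorResponse ('E') or NoticeResponse ('N')
generated by ereport flows into the queue instead of a nonexistent client
socket, which is what lets the launching backend pick it up and re-throw it.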

Patch 4 adds infrastructure that allows one session to save all of its
non-default GUC values and another session to reload those values.
This was written by Amit Khandekar and Noah Misch. It allows
pg_background to start up the background worker with the same GUC
settings that the launching process is using. I intend this as a
demonstration of how to synchronize any given piece of state between
cooperating backends. For real parallelism, we'll need to synchronize
snapshots, combo CIDs, transaction state, and so on, in addition to
GUCs. But GUCs are ONE of the things that we'll need to synchronize
in that context, and this patch shows the kind of API we're thinking
about for these sorts of problems.

Patch 5 is a trivial patch to add a function to get the authenticated
user ID. Noah pointed out to me that it's important for the
authenticated user ID, session user ID, and current user ID to all
match between the original session and the background worker.
Otherwise, pg_background could be used to circumvent restrictions that
we normally impose when those values differ from each other. The
session and current user IDs are restored by the GUC save-and-restore
machinery ("session_authorization" and "role") but the authenticated
user ID requires special treatment. To make that happen, it has to be
exposed somehow.
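The required exposure is nothing more than an accessor for a value that is
already tracked at authentication time; sketched below with hypothetical
_sketch names (the real patch touches the existing miscinit.c bookkeeping
rather than introducing new state):

```c
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for PostgreSQL's Oid */

/* Set once during authentication and never changed afterwards. */
static Oid AuthenticatedUserId = 0;

static void
SetAuthenticatedUserId_sketch(Oid userid)
{
    AuthenticatedUserId = userid;
}

/* The new accessor: lets a worker adopt the launcher's authenticated ID. */
static Oid
GetAuthenticatedUserId_sketch(void)
{
    return AuthenticatedUserId;
}
```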

Patch 6 is pg_background itself. I'm quite pleased with how easily
this came together. The existing background worker, dsm, shm_toc, and
shm_mq infrastructure handles most of the heavy lifting here -
obviously with some exceptions addressed by the preceding patches.
Again, this is the kind of set-up that I'm expecting will happen in a
background worker used for actual parallelism - clearly, more state
will need to be restored there than here, but nonetheless the general
flow of the code here is about what I'm imagining, just with somewhat
more kinds of state. Most of the work of writing this patch was
actually figuring out how to execute the query itself; what I ended up
with is mostly copied from exec_simple_query, but with some
differences here and there. I'm not sure if it would be
possible/advisable to try to refactor to reduce duplication.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Extend-shm_mq-API-with-new-functions-shm_mq_sendv-an.patch text/x-patch 7.8 KB
0002-Extend-dsm-API-with-a-new-function-dsm_unkeep_mappin.patch text/x-patch 2.0 KB
0003-Support-frontend-backend-protocol-communication-usin.patch text/x-patch 13.0 KB
0004-Add-infrastructure-to-save-and-restore-GUC-values.patch text/x-patch 12.2 KB
0005-Add-a-function-to-get-the-authenticated-user-ID.patch text/x-patch 1.5 KB
0006-pg_background-Run-commands-in-a-background-worker-an.patch text/x-patch 31.1 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-07-25 20:16:25
Message-ID: 20140725201625.GA5476@eldon.alvh.no-ip.org
Lists: pgsql-hackers

On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:

> + pq_mq_busy = true;
> +
> + iov[0].data = &msgtype;
> + iov[0].len = 1;
> + iov[1].data = s;
> + iov[1].len = len;
> +
> + Assert(pq_mq_handle != NULL);
> + result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
> +
> + pq_mq_busy = false;

Don't you need a PG_TRY block here to reset pq_mq_busy?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-07-26 08:37:05
Message-ID: 20140726083705.GD17793@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2014-07-25 14:11:32 -0400, Robert Haas wrote:
> Attached is a contrib module that lets you launch arbitrary command in
> a background worker, and supporting infrastructure patches for core.

Cool.

I assume this 'fell out' of the work towards parallelism? Do you think
all of the patches (except the contrib one) are required for that or is
some, e.g. 3), only required to demonstrate the others?

> Patch 3 adds the ability for a backend to request that the protocol
> messages it would normally send to the frontend get redirected to a
> shm_mq. I did this by adding a couple of hook functions. The best
> design is definitely arguable here, so if you'd like to bikeshed, this
> is probably the patch to look at.

Uh. This doesn't sound particularly nice. Shouldn't this rather be
clearly layered by making reading/writing from the client a proper API
instead of adding hook functions here and there?

Also, you seem to have only touched receiving from the client, and not
sending back to the subprocess. Is that actually sufficient? I'd expect
that for this facility to be fully useful it'd have to be two way
communication. But perhaps I'm overestimating what it could be used for.

> This patch also adds a function to
> help you parse an ErrorResponse or NoticeResponse and re-throw the
> error or notice in the originating backend. Obviously, parallelism is
> going to need this kind of functionality, but I suspect a variety of
> other applications people may develop using background workers may
> want it too; and it's certainly important for pg_background itself.

I would have had use for it previously.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-07-26 16:02:26
Message-ID: CA+TgmoZCODi+bqGseu+Fxg56ABG6GFgUv8cPgzZJZceY3CMuOg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
>> + pq_mq_busy = true;
>> +
>> + iov[0].data = &msgtype;
>> + iov[0].len = 1;
>> + iov[1].data = s;
>> + iov[1].len = len;
>> +
>> + Assert(pq_mq_handle != NULL);
>> + result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
>> +
>> + pq_mq_busy = false;
>
> Don't you need a PG_TRY block here to reset pq_mq_busy?

No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
But since that should only happen if an interrupt arrives while the
queue is full, I think that's OK. (Think about the alternatives: if
the queue is full, we have no way of notifying the launching process
without waiting for it to retrieve the results, but it might not do
that right away, and if we've been killed we need to die *now* not
later.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-07-26 16:20:34
Message-ID: CA+TgmoZLx-dmeKa-Y+tGrcm+OWTZ4qZKjBYL2Prn=HVdtRQBMw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 26, 2014 at 4:37 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-07-25 14:11:32 -0400, Robert Haas wrote:
>> Attached is a contrib module that lets you launch arbitrary command in
>> a background worker, and supporting infrastructure patches for core.
>
> Cool.
>
> I assume this 'fell out' of the work towards parallelism? Do you think
> all of the patches (except the contrib one) are required for that or is
> some, e.g. 3), only required to demonstrate the others?

I'm fairly sure that patches 3, 4, and 5 are all required in some form
as building blocks for parallelism. Patch 1 contains two functions,
one of which (shm_mq_set_handle) I think is generally useful for
people using background workers, but not absolutely required; and one
of which is infrastructure for patch 3 which might not be necessary
with different design choices (shm_mq_sendv). Patch 2 is only
included because pg_background can benefit from it; we could instead
use an eoxact callback, at the expense of doing cleanup at
end-of-transaction rather than end-of-query. But it's a mighty small
patch and seems like a reasonable extension to the API, so I lean
toward including it.

>> Patch 3 adds the ability for a backend to request that the protocol
>> messages it would normally send to the frontend get redirected to a
>> shm_mq. I did this by adding a couple of hook functions. The best
>> design is definitely arguable here, so if you'd like to bikeshed, this
>> is probably the patch to look at.
>
> Uh. This doesn't sound particularly nice. Shouldn't this rather be
> clearly layered by making reading/writing from the client a proper API
> instead of adding hook functions here and there?

I don't know exactly what you have in mind here. There is an API for
writing to the client that is used throughout the backend, but right
now "the client" always has to be a socket. Hooking a couple of parts
of that API lets us write someplace else instead. If you've got
another idea how to do this, suggest away...

> Also, you seem to have only touched receiving from the client, and not
> sending back to the subprocess. Is that actually sufficient? I'd expect
> that for this facility to be fully useful it'd have to be two way
> communication. But perhaps I'm overestimating what it could be used for.

Well, the basic shm_mq infrastructure can be used to send any kind of
messages you want between any pair of processes that care to establish
them. But in general I expect that data is going to flow mostly in
one direction - the user backend will launch workers and give them an
initial set of instructions, and then results will stream back from
the workers to the user backend. Other messaging topologies are
certainly possible, and probably useful for something, but I don't
really know exactly what those things will be yet, and I'm not sure
the FEBE protocol will be the right tool for the job anyway. But
error propagation, which is the main thrust of this, seems like a need
that will likely be pretty well ubiquitous.

>> This patch also adds a function to
>> help you parse an ErrorResponse or NoticeResponse and re-throw the
>> error or notice in the originating backend. Obviously, parallelism is
>> going to need this kind of functionality, but I suspect a variety of
>> other applications people may develop using background workers may
>> want it too; and it's certainly important for pg_background itself.
>
> I would have had use for it previously.

Cool. I know Petr was interested as well (possibly for the same project?).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-07-28 17:50:51
Message-ID: 20140728175051.GQ17793@alap3.anarazel.de
Lists: pgsql-hackers

On 2014-07-26 12:20:34 -0400, Robert Haas wrote:
> On Sat, Jul 26, 2014 at 4:37 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > On 2014-07-25 14:11:32 -0400, Robert Haas wrote:
> >> Attached is a contrib module that lets you launch arbitrary command in
> >> a background worker, and supporting infrastructure patches for core.
> >
> > Cool.
> >
> > I assume this 'fell out' of the work towards parallelism? Do you think
> > all of the patches (except the contrib one) are required for that or is
> > some, e.g. 3), only required to demonstrate the others?
>
> I'm fairly sure that patches 3, 4, and 5 are all required in some form
> as building blocks for parallelism. Patch 1 contains two functions,
> one of which (shm_mq_set_handle) I think is generally useful for
> people using background workers, but not absolutely required; and one
> of which is infrastructure for patch 3 which might not be necessary
> with different design choices (shm_mq_sendv). Patch 2 is only
> included because pg_background can benefit from it; we could instead
> use an eoxact callback, at the expense of doing cleanup at
> end-of-transaction rather than end-of-query. But it's a mighty small
> patch and seems like a reasonable extension to the API, so I lean
> toward including it.

Don't get me wrong, I don't object to anything in here. It's just that
the bigger picture can help giving sensible feedback.

> >> Patch 3 adds the ability for a backend to request that the protocol
> >> messages it would normally send to the frontend get redirected to a
> >> shm_mq. I did this by adding a couple of hook functions. The best
> >> design is definitely arguable here, so if you'd like to bikeshed, this
> >> is probably the patch to look at.
> >
> > Uh. This doesn't sound particularly nice. Shouldn't this rather be
> > clearly layered by making reading/writing from the client a proper API
> > instead of adding hook functions here and there?
>
> I don't know exactly what you have in mind here. There is an API for
> writing to the client that is used throughout the backend, but right
> now "the client" always has to be a socket. Hooking a couple of parts
> of that API lets us write someplace else instead. If you've got
> another idea how to do this, suggest away...

What I'm thinking of is providing an actual API for the writes instead
of hooking into the socket API in a couple places. I.e. have something
like

typedef struct DestIO DestIO;

struct DestIO
{
    void    (*flush) (struct DestIO *io);
    int     (*putbytes) (struct DestIO *io, const char *s, size_t len);
    int     (*getbytes) (struct DestIO *io, const char *s, size_t len);
    ...
};

and do everything through it. I haven't thought much about the specific
API we want, but abstracting the communication properly instead of
adding hooks here and there is imo much more likely to succeed in the
long term.
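To flesh out what such an abstraction could look like in practice, here is a
toy queue-backed implementation of a DestIO-style interface; this is purely a
sketch of the suggested design (with getbytes taking a writable buffer, and an
in-memory array standing in for a shm_mq), not existing code:

```c
#include <stddef.h>
#include <string.h>

typedef struct DestIO DestIO;

struct DestIO
{
    void        (*flush) (DestIO *io);
    int         (*putbytes) (DestIO *io, const char *s, size_t len);
    int         (*getbytes) (DestIO *io, char *s, size_t len);
};

/* Toy queue-backed implementation: writes append to an in-memory buffer. */
typedef struct
{
    DestIO      io;             /* must be first so DestIO * casts back */
    char        buf[256];
    size_t      wpos;
    size_t      rpos;
} QueueIO;

static void
queue_flush(DestIO *io)
{
    (void) io;                  /* nothing buffered separately in this toy */
}

static int
queue_putbytes(DestIO *io, const char *s, size_t len)
{
    QueueIO    *q = (QueueIO *) io;

    if (q->wpos + len > sizeof(q->buf))
        return -1;              /* "queue full" */
    memcpy(q->buf + q->wpos, s, len);
    q->wpos += len;
    return 0;
}

static int
queue_getbytes(DestIO *io, char *s, size_t len)
{
    QueueIO    *q = (QueueIO *) io;

    if (q->rpos + len > q->wpos)
        return -1;              /* not enough data available */
    memcpy(s, q->buf + q->rpos, len);
    q->rpos += len;
    return 0;
}

static void
queue_io_init(QueueIO *q)
{
    q->io.flush = queue_flush;
    q->io.putbytes = queue_putbytes;
    q->io.getbytes = queue_getbytes;
    q->wpos = q->rpos = 0;
}
```

A socket-backed implementation would fill the same three slots with the
existing pqcomm routines, which is the layering being argued for.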

> > Also, you seem to have only touched receiving from the client, and not
> > sending back to the subprocess. Is that actually sufficient? I'd expect
> > that for this facility to be fully useful it'd have to be two way
> > communication. But perhaps I'm overestimating what it could be used for.
>
> Well, the basic shm_mq infrastructure can be used to send any kind of
> messages you want between any pair of processes that care to establish
> them. But in general I expect that data is going to flow mostly in
> one direction - the user backend will launch workers and give them an
> initial set of instructions, and then results will stream back from
> the workers to the user backend. Other messaging topologies are
> certainly possible, and probably useful for something, but I don't
> really know exactly what those things will be yet, and I'm not sure
> the FEBE protocol will be the right tool for the job anyway.

It's imo not particularly unreasonable to e.g. COPY to/from a bgworker. Which
would require the ability to both read/write from the other side.

> But
> error propagation, which is the main thrust of this, seems like a need
> that will likely be pretty well ubiquitous.

Agreed.

> >> This patch also adds a function to
> >> help you parse an ErrorResponse or NoticeResponse and re-throw the
> >> error or notice in the originating backend. Obviously, parallelism is
> >> going to need this kind of functionality, but I suspect a variety of
> >> other applications people may develop using background workers may
> >> want it too; and it's certainly important for pg_background itself.
> >
> > I would have had use for it previously.
>
> Cool. I know Petr was interested as well (possibly for the same project?).

Well, I was aware of Petr's project, but I also have my own pet project
I'd been playing with :). Nothing real.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-07-29 16:51:18
Message-ID: CA+TgmoZP2JtKdVfu0RXqoL+vo-wsjv2OxwTU83QmP_GqCyukVA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 28, 2014 at 1:50 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Don't get me wrong, I don't object to anything in here. It's just that
> the bigger picture can help giving sensible feedback.

Right. I did not get you wrong. :-)

The reason I'm making a point of it is that, if somebody wants to
object to the way those facilities are designed, it'd be good to get
that out of the way now rather than waiting until 2 or 3 patch sets
from now and then saying "Uh, could you guys go back and rework all
that stuff?". I'm not going to complain too loudly now if somebody
wants something in there done in a different way, but it's easier to
do that now while there's only pg_background sitting on top of it.

> What I'm thinking of is providing an actual API for the writes instead
> of hooking into the socket API in a couple places. I.e. have something
> like
>
> typedef struct DestIO DestIO;
>
> struct DestIO
> {
>     void    (*flush) (struct DestIO *io);
>     int     (*putbytes) (struct DestIO *io, const char *s, size_t len);
>     int     (*getbytes) (struct DestIO *io, const char *s, size_t len);
>     ...
> };
>
> and do everything through it. I haven't thought much about the specific
> API we want, but abstracting the communication properly instead of
> adding hooks here and there is imo much more likely to succeed in the
> long term.

This sounds suspiciously like the DestReceiver thing we've already
got, except that the DestReceiver only applies to tuple results, not
errors and notices and so on. I'm not totally unamenable to a bigger
refactoring here, but right now it looks to me like a solution in
search of a problem. The hooks are simple and seem to work fine; I
don't want to add notation for its own sake.

>> > Also, you seem to have only touched receiving from the client, and not
>> > sending back to the subprocess. Is that actually sufficient? I'd expect
>> > that for this facility to be fully useful it'd have to be two way
>> > communication. But perhaps I'm overestimating what it could be used for.
>>
>> Well, the basic shm_mq infrastructure can be used to send any kind of
>> messages you want between any pair of processes that care to establish
>> them. But in general I expect that data is going to flow mostly in
>> one direction - the user backend will launch workers and give them an
>> initial set of instructions, and then results will stream back from
>> the workers to the user backend. Other messaging topologies are
>> certainly possible, and probably useful for something, but I don't
>> really know exactly what those things will be yet, and I'm not sure
>> the FEBE protocol will be the right tool for the job anyway.
>
> It's imo not particularly unreasonable to e.g. COPY to/from a bgworker. Which
> would require the ability to both read/write from the other side.

Well, that should work fine if the background worker and user backend
generate the CopyData messages via some bespoke code rather than
expecting to be able to jump into copy.c and have everything work. If
you want that to work, why? It doesn't make much sense for
pg_background, because I don't think it would be sensible for SELECT
pg_background_result(...) to return CopyInResponse or CopyOutResponse,
and even if it were sensible it doesn't seem useful. And I can't
think of any other application off-hand, either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-08 05:09:52
Message-ID: CAA4eK1+pr05QZKspDRJ1s_eS6CAm4v2CSqvVv8edfBnWBG+r+Q@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 26, 2014 at 9:32 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
> <alvherre(at)2ndquadrant(dot)com> wrote:
> > On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
> >> + pq_mq_busy = true;
> >> +
> >> + iov[0].data = &msgtype;
> >> + iov[0].len = 1;
> >> + iov[1].data = s;
> >> + iov[1].len = len;
> >> +
> >> + Assert(pq_mq_handle != NULL);
> >> + result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
> >> +
> >> + pq_mq_busy = false;
> >
> > Don't you need a PG_TRY block here to reset pq_mq_busy?
>
> No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
> But since that should only happen if an interrupt arrives while the
> queue is full, I think that's OK.

I think that not only an interrupt, but any other error in the
shm_mq_sendv() path (one example is WaitLatch()) could lead to the
same behaviour.

> (Think about the alternatives: if
> the queue is full, we have no way of notifying the launching process
> without waiting for it to retrieve the results, but it might not do
> that right away, and if we've been killed we need to die *now* not
> later.)

So in such cases what is the advice to users? Currently they will
see the message below:
postgres=# select * from pg_background_result(5124) as (x int);
ERROR: lost connection to worker process with PID 5124

One way is to ask them to check the logs, but what if they want
to handle the error and take some action?

Another point about error handling: to execute the SQL,
pg_background_worker_main() starts a transaction, which I think
doesn't get aborted if an error occurs; similarly, timeout handling
seems to be missing in the error path.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-08 06:05:54
Message-ID: CAA4eK1JWr4RGCOASBJiY4bPaawxqBMwgjnBi2Hgh3JvUsVX5dg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 8, 2014 at 10:39 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
>
> On Sat, Jul 26, 2014 at 9:32 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
wrote:
> > On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
> > <alvherre(at)2ndquadrant(dot)com> wrote:
> > > On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
> > >> + pq_mq_busy = true;
> > >> +
> > >> + iov[0].data = &msgtype;
> > >> + iov[0].len = 1;
> > >> + iov[1].data = s;
> > >> + iov[1].len = len;
> > >> +
> > >> + Assert(pq_mq_handle != NULL);
> > >> + result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
> > >> +
> > >> + pq_mq_busy = false;
> > >
> > > Don't you need a PG_TRY block here to reset pq_mq_busy?
> >
> > No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
> > But since that should only happen if an interrupt arrives while the
> > queue is full, I think that's OK.
>
> I think here not only on interrupt, but any other error in this
> function shm_mq_sendv() path (one example is WaitLatch())
> could lead to same behaviour.
>
> > (Think about the alternatives: if
> > the queue is full, we have no way of notifying the launching process
> > without waiting for it to retrieve the results, but it might not do
> > that right away, and if we've been killed we need to die *now* not
> > later.)
>
> So in such cases what is the advise to users, currently they will
> see the below message:
> postgres=# select * from pg_background_result(5124) as (x int);
> ERROR: lost connection to worker process with PID 5124
>
> One way is to ask them to check logs, but what about if they want
> to handle error and take some action?
>
> Another point about error handling is that to execute the sql in
> function pg_background_worker_main(), it starts the transaction
> which I think doesn't get aborted if error occurs

For this I was just referring to the error handling code of
StartBackgroundWorker(); however, during shutdown of the process, it
will call AbortOutOfAnyTransaction(), which will take care of aborting
the transaction.

> and similarly handling
> for timeout seems to be missing in error path.

As we are anyway going to exit on error, I am not sure if this will be
required; however, having it for a clean exit seems better.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-08 21:01:26
Message-ID: CA+TgmoYYqzz8DuNziiy=h4PmEajWRgzo=s+xA3-gwBdUYboG_g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 8, 2014 at 1:09 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> > Don't you need a PG_TRY block here to reset pq_mq_busy?
>>
>> No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
>> But since that should only happen if an interrupt arrives while the
>> queue is full, I think that's OK.
>
> I think here not only on interrupt, but any other error in this
> function shm_mq_sendv() path (one example is WaitLatch())
> could lead to same behaviour.

Probably true. But I think we generally count on that to be no-fail,
or close to it, so I'm not really worried about it.

>> (Think about the alternatives: if
>> the queue is full, we have no way of notifying the launching process
>> without waiting for it to retrieve the results, but it might not do
>> that right away, and if we've been killed we need to die *now* not
>> later.)
>
> So in such cases what is the advise to users, currently they will
> see the below message:
> postgres=# select * from pg_background_result(5124) as (x int);
> ERROR: lost connection to worker process with PID 5124
>
> One way is to ask them to check logs, but what about if they want
> to handle error and take some action?

They have to check the logs. To reiterate what I said before, there
is no reasonable way to both have the background worker terminate
quickly and also guarantee that the full error message is received by
the process that started it. You have to pick one, and I stick by the
one I picked.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-08 21:02:17
Message-ID: CA+Tgmoat=Wrr9hUs9Y3ZauFSUUmbMXMOo+c1AZAx4JYV=B6u8g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 8, 2014 at 2:05 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Another point about error handling is that to execute the sql in
>> function pg_background_worker_main(), it starts the transaction
>> which I think doesn't get aborted if error occurs
>
> For this I was just referring error handling code of
> StartBackgroundWorker(), however during shutdown of process, it
> will call AbortOutOfAnyTransaction() which will take care of aborting
> transaction.

Yeah.

>> and similarly handling
>> for timeout seems to be missing in error path.
>
> As we are anyway going to exit on error, so not sure, if this will be
> required, however having it for clean exit seems to be better.

Can you be more specific?

I'm generally of the view that there's little point in spending time
cleaning things up that will go away anyway when the process exits.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-09 10:00:23
Message-ID: CAA4eK1JFk7E3MWkmOTVFoFoDZQX46PRcQryc0jcA_-sBz2kgNg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 9, 2014 at 2:32 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Sep 8, 2014 at 2:05 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >> Another point about error handling is that to execute the sql in
> >> function pg_background_worker_main(), it starts the transaction
> >> which I think doesn't get aborted if an error occurs
> >
> > For this I was just referring error handling code of
> > StartBackgroundWorker(), however during shutdown of process, it
> > will call AbortOutOfAnyTransaction() which will take care of aborting
> > transaction.
>
> Yeah.
>
> >> and similarly handling
> >> for timeout seems to be missing in error path.
> >
> > As we are anyway going to exit on error, I am not sure if this will be
> > required; however, having it for a clean exit seems better.
>
> Can you be more specific?

I was thinking of handling errors similarly to what PostgresMain does;
however, if it is going to exit anyway, then it might not be worth it.

> I'm generally of the view that there's little point in spending time
> cleaning things up that will go away anyway when the process exits.

Yeah, in that case it might not make much sense. The argument here
could be why it has to exit at all; why can't it wait till the launcher
asks it to exit? You have mentioned in a previous mail that you want to
stick to the approach taken by the patch, but it is not clear why you
think that is better. If I think about how the worker backend should
behave when the master backend assigns some task to it, then I think it
would be sensible for the worker not to exit after completing its task
(whether it has finished the assigned work or not), as the master
backend can assign further tasks to the same worker. Another advantage
is that the setup and teardown cost of the worker would be saved. Now,
if we think only in terms of pg_background, it might not be so important
to let the worker wait till the launcher asks it to quit; however, as
you have mentioned upthread that you are expecting this kind of set-up
for actual parallelism, it might not be unreasonable to expect the
worker to wait till it has been asked to quit.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-09 16:33:46
Message-ID: 540F2BEA.2080009@2ndquadrant.com
Lists: pgsql-hackers

On 29/07/14 18:51, Robert Haas wrote:
> On Mon, Jul 28, 2014 at 1:50 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> What I'm thinking of is providing an actual API for the writes instead
>> of hooking into the socket API in a couple places. I.e. have something
>> like
>>
>> typedef struct DestIO DestIO;
>>
>> struct DestIO
>> {
>> void (*flush)(struct DestIO *io);
>> int (*putbytes)(struct DestIO *io, const char *s, size_t len);
>> int (*getbytes)(struct DestIO *io, const char *s, size_t len);
>> ...
>> }
>>
>> and do everything through it. I haven't thought much about the specific
>> API we want, but abstracting the communication properly instead of
>> adding hooks here and there is imo much more likely to succeed in the
>> long term.
>
> This sounds suspiciously like the DestReceiver thing we've already
> got, except that the DestReceiver only applies to tuple results, not
> errors and notices and so on. I'm not totally unamenable to a bigger
> refactoring here, but right now it looks to me like a solution in
> search of a problem. The hooks are simple and seem to work fine; I
> don't want to add notation for its own sake.
>

I am not sure if what Andres proposed is the right solution, but I do
agree here that the hook definitely isn't.

I am also not sure that I like pq_redirect_to_shm_mq being directly
exposed for use in a bgworker; what I would like is an elog interface
to which you tell that you want errors sent to a shm_mq handle, and
which would then do the necessary calls (it's basically the same
principle, but for the sake of cleanliness/correctness I think it
should be elog and not pq from the bgworker's point of view).

So the interface needs work.

I do agree that we don't need two-way communication over this channel;
I think the dsm_mq can be used directly quite well for the generic use case.

Otherwise I like the patch, and I am glad you also made the GUC
save/restore infrastructure.

Please don't forget to add this to next commitfest.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-09 16:49:25
Message-ID: CA+TgmoYqC_UWjObd7fdHU7ToCtYQ1JsG5x4fguses-Ed34qf_g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 9, 2014 at 12:33 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
> I am not sure if what Andres proposed is the right solution, but I do agree
> here that the hook definitely isn't.

Well, that doesn't get me very far. Andres never responded to my
previous comments about why I did it that way, and you're not
proposing a clear alternative that I can either implement or comment
on.

> I am also not sure that I like pq_redirect_to_shm_mq being directly
> exposed for use in a bgworker; what I would like is an elog interface
> to which you tell that you want errors sent to a shm_mq handle, and
> which would then do the necessary calls (it's basically the same
> principle, but for the sake of cleanliness/correctness I think it
> should be elog and not pq from the bgworker's point of view).

I think that's completely wrong. As the patch series demonstrates,
it's not limited to propagating ErrorResponse and NoticeResponse. It
can also propagate NotifyResponse and RowDescription and DataRow and
anything else that comes along. We are not just propagating errors;
we are propagating all protocol messages of whatever type. So tying
it to elog specifically is not right.

> So the interface needs work.
>
> I do agree that we don't need two way communication over this channel, I
> think the dsm_mq can be used directly quite well for generic usecase.

I'm glad you agree, but that leaves me baffled as to what's wrong with
the hook approach. There's no crying need for extensibility in this
code; it's barely changed in a very long time, and I know of no other
need which might soon require changing it again. The hook is very
lightweight and shouldn't materially affect performance when it's not
used, as a more complex approach might.

> Otherwise I like the patch, and I am glad you also made the GUC save/restore
> infrastructure.

Cool.

> Please don't forget to add this to next commitfest.

OK, done. But it would be awfully nice if we could actually make some
progress on hammering out a design everyone can live with here.
Letting it sit for another 5 weeks is not going to help anything, and
it will hold up getting any more stuff after this done in time for
9.5.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-09 17:18:49
Message-ID: 540F3679.8090907@2ndquadrant.com
Lists: pgsql-hackers

On 09/09/14 18:49, Robert Haas wrote:
> On Tue, Sep 9, 2014 at 12:33 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>> I am also not sure that I like pq_redirect_to_shm_mq being directly
>> exposed for use in a bgworker; what I would like is an elog interface
>> to which you tell that you want errors sent to a shm_mq handle, and
>> which would then do the necessary calls (it's basically the same
>> principle, but for the sake of cleanliness/correctness I think it
>> should be elog and not pq from the bgworker's point of view).
>
> I think that's completely wrong. As the patch series demonstrates,
> it's not limited to propagating ErrorResponse and NoticeResponse. It
> can also propagate NotifyResponse and RowDescription and DataRow and
> anything else that comes along. We are not just propagating errors;
> we are propagating all protocol messages of whatever type. So tying
> it to elog specifically is not right.
>

Oh, in that case I think what Andres proposed is actually quite good. I
know the hook works fine; it just seems like a somewhat hackish solution
to save 20 lines of code.
The reason I am not proposing anything better is that my solution, when
I did prototyping in this area, was just to check whether my pq_dsm_mq
handle was set and call the appropriate function from the wrapper based
on that, which does not seem like a big improvement over the hook
approach... Anything better/more flexible that I could come up with is
basically the same idea that Andres already wrote.

>> Please don't forget to add this to next commitfest.
>
> OK, done. But it would be awfully nice if we could actually make some
> progress on hammering out a design everyone can live with here.
> Letting it sit for another 5 weeks is not going to help anything, and
> it will hold up getting any more stuff after this done in time for
> 9.5.
>

Yeah, I said that just as a formality; I wrote the response now, and
not in 5 weeks, exactly for this reason. I am all for discussing this
now, and I think it's important to hammer this infrastructure out sooner
rather than later.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-09 20:48:45
Message-ID: CA+TgmoaXQnzSPpJOWBNQbLMkDw2Os_eTBwHA4=44DubXz1p0sw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 9, 2014 at 1:18 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>> I think that's completely wrong. As the patch series demonstrates,
>> it's not limited to propagating ErrorResponse and NoticeResponse. It
>> can also propagate NotifyResponse and RowDescription and DataRow and
>> anything else that comes along. We are not just propagating errors;
>> we are propagating all protocol messages of whatever type. So tying
>> it to elog specifically is not right.
>
> Oh, in that case I think what Andres proposed is actually quite good. I know
> the hook works fine; it just seems like a somewhat hackish solution to
> save 20 lines of code.

If it's 20 lines of code, I'm probably fine to go that way. Let's see
if we can figure out what those 20 lines look like.

libpq.h exports 29 functions that do a variety of different things.
Of those, 20 are in pqcomm.c and the others are in be-secure.c. I
presume that pluggability for the latter group, if needed at all, is a
separate project. The remaining ones break down like this:

- StreamServerPort, StreamConnection, StreamClose, and
TouchSocketFiles are intended to be called only from the postmaster,
to set up and tear down the listening socket and individual
connections. Presumably this is not what we care about here.
- pq_getbytes(), pq_getstring(), pq_getmessage(), pq_getbyte(),
pq_peekbyte(), and pq_getbyte_if_available() handle *reading* data
from the socket. Since you previously agreed that we didn't need to
build two-way communication on top of this, I would think that would
mean that these don't need to be pluggable either. But maybe I'm
wrong.
- The remaining functions are pq_init(), pq_comm_reset(), pq_flush(),
pq_flush_if_writable(), pq_is_send_pending(), pq_putmessage(),
pq_putmessage_noblock(), pq_startcopyout(), and pq_endcopyout().
These are the ones that I think are potentially interesting.

I didn't choose to provide hooks for all of these in the submitted
patch because they're not all needed for what I want to do here:
pq_startcopyout() and pq_endcopyout() are only needed for V2 protocol
support, which did not interest me (nor did COPY, really);
pq_putmessage_noblock(), pq_flush_if_writable(), and
pq_is_send_pending() are only used for the walsender protocol, which
doesn't seem useful to redirect to a non-socket; and I just didn't
happen to have any use for pq_init() or pq_comm_reset(). Hence what I
ended up with.

But, I could revisit that. Suppose I provide a structure with 10
function pointers for all ten of those functions, or maybe 9 since
pq_init() is called so early that it's not likely we'll have control
to put the hooks in place before that point, and anyway whatever code
installs the hooks can do its own initialization then. Then make a
global variable like pqSendMethods and #define pq_comm_reset() to be
pqSendMethods->comm_reset(), pq_flush() to be pqSendMethods->flush(),
and so on for all 9 or 10 methods. Then the pqmq code could just
change pqSendMethods to point to its own method structure instead of
the default one. Would that address this concern? It's
more than 20 lines of code, but it's not TOO bad.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-09 22:03:03
Message-ID: 540F7917.9060508@2ndquadrant.com
Lists: pgsql-hackers

On 09/09/14 22:48, Robert Haas wrote:
> On Tue, Sep 9, 2014 at 1:18 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>>> I think that's completely wrong. As the patch series demonstrates,
>>> it's not limited to propagating ErrorResponse and NoticeResponse. It
>>> can also propagate NotifyResponse and RowDescription and DataRow and
>>> anything else that comes along. We are not just propagating errors;
>>> we are propagating all protocol messages of whatever type. So tying
>>> it to elog specifically is not right.
>>
>> Oh, in that case I think what Andres proposed is actually quite good. I know
>> the hook works fine; it just seems like a somewhat hackish solution to
>> save 20 lines of code.
>
> If it's 20 lines of code, I'm probably fine to go that way. Let's see
> if we can figure out what those 20 lines look like.
>
> libpq.h exports 29 functions that do a variety of different things.
> Of those, 20 are in pqcomm.c and the others are in be-secure.c. I
> presume that pluggability for the latter group, if needed at all, is a
> separate project. The remaining ones break down like this:
>

Ugh

> - The remaining functions are pq_init(), pq_comm_reset(), pq_flush(),
> pq_flush_if_writable(), pq_is_send_pending(), pq_putmessage(),
> pq_putmessage_noblock(), pq_startcopyout(), and pq_endcopyout().
> These are the ones that I think are potentially interesting.
>
> I didn't choose to provide hooks for all of these in the submitted
> patch because they're not all needed for what I want to do here:
> pq_startcopyout() and pq_endcopyout() are only needed for V2 protocol
> support, which did not interest me (nor did COPY, really);
> pq_putmessage_noblock(), pq_flush_if_writable(), and
> pq_is_send_pending() are only used for the walsender protocol, which
> doesn't seem useful to redirect to a non-socket; and I just didn't
> happen to have any use for pq_init() or pq_comm_reset(). Hence what I
> ended up with.
>
> But, I could revisit that. Suppose I provide a structure with 10
> function pointers for all ten of those functions, or maybe 9 since
> pq_init() is called so early that it's not likely we'll have control
> to put the hooks in place before that point, and anyway whatever code
> installs the hooks can do its own initialization then. Then make a
> global variable like pqSendMethods and #define pq_comm_reset() to be
> pqSendMethods->comm_reset(), pq_flush() to be pqSendMethods->flush(),
> and so on for all 9 or 10 methods. Then the pqmq code could just
> change pqSendMethods to point to its own method structure instead of
> the default one. Would that address this concern? It's
> more than 20 lines of code, but it's not TOO bad.
>

Yes, although my issue with the hooks was not that you only provided
them for 2 functions, but the fact that there was no structure and the
implementation was "if hook set, do this; else do that", which I don't
see as a good way of doing it.

All I personally want is structure and extensibility, so a struct with
just the needed functions is good enough for me, and I would be OK with
leaving the other 8 functions as an option for whoever needs to make
them overridable in the future.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-10 04:57:39
Message-ID: CAA4eK1LSv4s5D0f2tao9QHsvo+tcurZpB8JtUvmD7gKFKhL51w@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 10, 2014 at 2:18 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Tue, Sep 9, 2014 at 1:18 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
> >> I think that's completely wrong. As the patch series demonstrates,
> >> it's not limited to propagating ErrorResponse and NoticeResponse. It
> >> can also propagate NotifyResponse and RowDescription and DataRow and
> >> anything else that comes along. We are not just propagating errors;
> >> we are propagating all protocol messages of whatever type. So tying
> >> it to elog specifically is not right.
> >
> > Oh, in that case I think what Andres proposed is actually quite good. I know
> > the hook works fine; it just seems like a somewhat hackish solution to
> > save 20 lines of code.
>
> If it's 20 lines of code, I'm probably fine to go that way. Let's see
> if we can figure out what those 20 lines look like.
>
> libpq.h exports 29 functions that do a variety of different things.
> Of those, 20 are in pqcomm.c and the others are in be-secure.c. I
> presume that pluggability for the latter group, if needed at all, is a
> separate project. The remaining ones break down like this:
>
> - StreamServerPort, StreamConnection, StreamClose, and
> TouchSocketFiles are intended to be called only from the postmaster,
> to set up and tear down the listening socket and individual
> connections. Presumably this is not what we care about here.
> - pq_getbytes(), pq_getstring(), pq_getmessage(), pq_getbyte(),
> pq_peekbyte(), and pq_getbyte_if_available() handle *reading* data
> from the socket. Since you previously agreed that we didn't need to
> build two-way communication on top of this, I would think that would
> mean that these don't need to be pluggable either. But maybe I'm
> wrong.
> - The remaining functions are pq_init(), pq_comm_reset(), pq_flush(),
> pq_flush_if_writable(), pq_is_send_pending(), pq_putmessage(),
> pq_putmessage_noblock(), pq_startcopyout(), and pq_endcopyout().
> These are the ones that I think are potentially interesting.
>
> I didn't choose to provide hooks for all of these in the submitted
> patch because they're not all needed for what I want to do here:
> pq_startcopyout() and pq_endcopyout() are only needed for V2 protocol
> support, which did not interest me (nor did COPY, really);
> pq_putmessage_noblock(), pq_flush_if_writable(), and
> pq_is_send_pending() are only used for the walsender protocol, which
> doesn't seem useful to redirect to a non-socket; and I just didn't
> happen to have any use for pq_init() or pq_comm_reset(). Hence what I
> ended up with.
>
> But, I could revisit that. Suppose I provide a structure with 10
> function pointers for all ten of those functions, or maybe 9 since
> pq_init() is called so early that it's not likely we'll have control
> to put the hooks in place before that point, and anyway whatever code
> installs the hooks can do its own initialization then.

Can we use pq_init() to install the function pointers?
Let us say that it takes a COMM_METHOD (tcp, shm_mq) as
input and then installs function pointers based on the communication
method. We could call this from the main function of the bgworker (in
the case of this patch, from pg_background_worker_main()) with
COMM_METHOD as shm_mq, and BackendInitialize() would pass it as tcp.

> Then make a
> global variable like pqSendMethods and #define pq_comm_reset() to be
> pqSendMethods->comm_reset(), pq_flush() to be pqSendMethods->flush(),
> and so on for all 9 or 10 methods. Then the pqmq code could just
> change pqSendMethods to point to its own method structure instead of
> the default one. Would that address this concern? It's
> more than 20 lines of code, but it's not TOO bad.

This idea seems better than directly using hooks; however, I don't
see any harm in defining pqReceiveMethods for the get APIs as well,
because it can make the whole layer extensible. Having said that, as
there is currently no usage of it, we can leave it for another patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-10 20:01:07
Message-ID: CA+TgmoYc9_qUAFmze8fvg=MBPvZr5qf-si6KOpVseS==+B9F5A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 9, 2014 at 6:03 PM, Petr Jelinek <petr(at)2ndquadrant(dot)com> wrote:
>> - The remaining functions are pq_init(), pq_comm_reset(), pq_flush(),
>> pq_flush_if_writable(), pq_is_send_pending(), pq_putmessage(),
>> pq_putmessage_noblock(), pq_startcopyout(), and pq_endcopyout().
>> These are the ones that I think are potentially interesting.
>>
>> I didn't choose to provide hooks for all of these in the submitted
>> patch because they're not all needed for what I want to do here:
>> pq_startcopyout() and pq_endcopyout() are only needed for V2 protocol
>> support, which did not interest me (nor did COPY, really);
>> pq_putmessage_noblock(), pq_flush_if_writable(), and
>> pq_is_send_pending() are only used for the walsender protocol, which
>> doesn't seem useful to redirect to a non-socket; and I just didn't
>> happen to have any use for pq_init() or pq_comm_reset(). Hence what I
>> ended up with.
>>
>> But, I could revisit that. Suppose I provide a structure with 10
>> function pointers for all ten of those functions, or maybe 9 since
>> pq_init() is called so early that it's not likely we'll have control
>> to put the hooks in place before that point, and anyway whatever code
>> installs the hooks can do its own initialization then. Then make a
>> global variable like pqSendMethods and #define pq_comm_reset() to be
>> pqSendMethods->comm_reset(), pq_flush() to be pqSendMethods->flush(),
>> and so on for all 9 or 10 methods. Then the pqmq code could just
>> change pqSendMethods to point to its own method structure instead of
>> the default one. Would that address this concern? It's
>> more than 20 lines of code, but it's not TOO bad.
>>
>
> Yes, although my issue with the hooks was not that you only provided them
> for 2 functions, but the fact that there was no structure and the
> implementation was "if hook set, do this; else do that", which I don't see
> as a good way of doing it.

We've done it that way in a bunch of other places, like ExecutorRun().
An advantage of this approach (I think) is that jumping to a fixed
address is faster than jumping through a function pointer - so with
the approach I've taken here, the common case where we're talking to
the client incurs only the overhead of a null-test, and the larger
overhead of the function pointer jump is incurred only when the hook
is in use. Maybe that's not enough of a difference to matter to
anything, but I think the contention that I've invented some novel
kind of interface here doesn't hold up to scrutiny. We have lots of
hooks that work just like what I did here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-10 20:53:12
Message-ID: CA+TgmoaDy77LUqoYuUVDWK9ir865yqJBfuSyWiRJo=cnmLknWw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 10, 2014 at 4:01 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Yes, although my issue with the hooks was not that you only provided them
>> for 2 functions, but the fact that there was no structure and the
>> implementation was "if hook set, do this; else do that", which I don't see
>> as a good way of doing it.
>
> We've done it that way in a bunch of other places, like ExecutorRun().
> An advantage of this approach (I think) is that jumping to a fixed
> address is faster than jumping through a function pointer - so with
> the approach I've taken here, the common case where we're talking to
> the client incurs only the overhead of a null-test, and the larger
> overhead of the function pointer jump is incurred only when the hook
> is in use. Maybe that's not enough of a difference to matter to
> anything, but I think the contention that I've invented some novel
> kind of interface here doesn't hold up to scrutiny. We have lots of
> hooks that work just like what I did here.

Here's what the other approach looks like. I can't really see doing it
this way and then only providing hooks for those two functions, so
this is with hooks for all the send-side stuff.

Original version: 9 files changed, 295 insertions(+), 3 deletions(-)
This version: 9 files changed, 428 insertions(+), 47 deletions(-)

There is admittedly a certain elegance to providing a complete set of
hooks, so maybe this is the way to go. The remaining patches in the
patch series work with either the old version or this one; the changes
here don't affect anything else.

Anyone else have an opinion on which way is better here?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0003-Support-frontend-backend-protocol-communication-usin-v2.patch text/x-patch 22.3 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-11 11:34:31
Message-ID: CAA4eK1J=X_Kuhm8XQD37+yQiqnYF04OsZy4KCRxWu4Wz1rYy5Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 25, 2014 at 11:41 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Patch 4 adds infrastructure that allows one session to save all of its
> non-default GUC values and another session to reload those values.
> This was written by Amit Khandekar and Noah Misch. It allows
> pg_background to start up the background worker with the same GUC
> settings that the launching process is using. I intend this as a
> demonstration of how to synchronize any given piece of state between
> cooperating backends. For real parallelism, we'll need to synchronize
> snapshots, combo CIDs, transaction state, and so on, in addition to
> GUCs. But GUCs are ONE of the things that we'll need to synchronize
> in that context, and this patch shows the kind of API we're thinking
> about for these sorts of problems.

Don't we need some way to prohibit the launching process from changing
GUCs once it has shared the existing GUC values?

> Patch 6 is pg_background itself. I'm quite pleased with how easily
> this came together. The existing background worker, dsm, shm_toc, and
> shm_mq infrastructure handles most of the heavy lifting here -
> obviously with some exceptions addressed by the preceding patches.
> Again, this is the kind of set-up that I'm expecting will happen in a
> background worker used for actual parallelism - clearly, more state
> will need to be restored there than here, but nonetheless the general
> flow of the code here is about what I'm imagining, just with somewhat
> more different kinds of state. Most of the work of writing this patch
> was actually figuring out how to execute the query itself; what I
> ended up with is mostly copied from exec_simple_query, but with some
> differences here and there. I'm not sure if it would be
> possible/advisable to try to refactor to reduce duplication.

1. This patch generates a link error on Windows:
1>pg_background.obj : error LNK2001: unresolved external symbol
StatementTimeout

You need to add PGDLLIMPORT for StatementTimeout

2.
CREATE FUNCTION pg_background_launch(sql pg_catalog.text,
queue_size pg_catalog.int4 DEFAULT 65536)

Shouldn't queue_size be pg_catalog.int8 here, as some related
functions in test_shm_mq use int8?

CREATE FUNCTION test_shm_mq(queue_size pg_catalog.int8,
CREATE FUNCTION test_shm_mq_pipelined(queue_size pg_catalog.int8,

In any case, I think the corresponding C function doesn't use the
matching macro to extract the function args.

pg_background_launch(PG_FUNCTION_ARGS)
{
text *sql = PG_GETARG_TEXT_PP(0);
int32 queue_size = PG_GETARG_INT64(1);

It should use the _INT32 variant here to match the current SQL
definition; otherwise it leads to the error below.

postgres=# select pg_background_launch('vacuum verbose foo');
ERROR: queue size must be at least 64 bytes

3.
Comparing execute_sql_string() and exec_simple_query(), I could see the
following main differences:
a. Allocate a new memory context different from the message context
b. Start/commit control of the transaction is outside execute_sql_string
c. enable/disable of the statement timeout is done from outside in the
case of execute_sql_string()
d. execute_sql_string() prohibits Start/Commit/Abort transaction statements.
e. execute_sql_string() doesn't log statements
f. execute_sql_string() uses binary format for the result, whereas
exec_simple_query() uses TEXT as the default format
g. the "processed" stats info is handled by the caller in the case of
execute_sql_string()

Can't we handle all these (or other) changes inside exec_simple_query()
based on some parameter, or maybe use a hook similar to what we
do in ProcessUtility?

Basically it looks a bit odd to have a (mostly) duplicate copy of
exec_simple_query().

4.
Won't it be better if pg_background_worker_main() could look more
like PostgresMain() (in terms of handling different kinds of messages),
so that it can be extended in the future to handle parallel workers?

5.
pg_background_result()
{
..
/* Read and processes messages from the shared memory queue. */
}

Shouldn't the processing of messages be in a separate function, as
we do for pqParseInput3()?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-11 15:40:06
Message-ID: 5411C256.6050103@2ndquadrant.com
Lists: pgsql-hackers

On 10/09/14 22:53, Robert Haas wrote:
> Here's what the other approach looks like. I can't really see doing
> this way and then only providing hooks for those two functions, so
> this is with hooks for all the send-side stuff.
>
> Original version: 9 files changed, 295 insertions(+), 3 deletions(-)
> This version: 9 files changed, 428 insertions(+), 47 deletions(-)
>
> There is admittedly a certain elegance to providing a complete set of
> hooks, so maybe this is the way to go. The remaining patches in the
> patch series work with either the old version or this one; the changes
> here don't affect anything else.
>
> Anyone else have an opinion on which way is better here?

Ok so it is more than 20 lines :)

I do like this approach more though, it looks cleaner to me.

On 10/09/14 22:53, Robert Haas wrote:
> +extern int (*pq_putmessage_hook)(char msgtype, const char *s, size_tlen);
> +extern int (*pq_flush_hook)(void);

Btw you forgot to remove the above.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-11 18:37:47
Message-ID: CA+TgmoZAObrXf_0Ca28X6yTfV=bxeF_ZLnXB_iX-m=brQyBWqg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 11, 2014 at 7:34 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Don't we need some way to prohibit changing GUC by launching process,
> once it has shared the existing GUC?

Nope. I mean, eventually, for true parallelism ... we absolutely will
need that. But pg_background itself doesn't need that; it's perfectly
fine for configuration settings to get changed in the background
worker. So it's a different piece of infrastructure from this patch
set.

> 1. This patch generates a warning on Windows:
> 1>pg_background.obj : error LNK2001: unresolved external symbol
> StatementTimeout
>
> You need to add PGDLLIMPORT for StatementTimeout

OK. I still think we should go back and PGDLLIMPORT-ize all the GUC variables.

> 2.
> CREATE FUNCTION pg_background_launch(sql pg_catalog.text,
> queue_size pg_catalog.int4 DEFAULT 65536)
>
> Here, shouldn't queue_size be pg_catalog.int8? I could see that
> some related functions in test_shm_mq use int8?

I think if you think you want a queue that's more than 2GB in size,
you should re-think.

> pg_background_launch(PG_FUNCTION_ARGS)
> {
> text *sql = PG_GETARG_TEXT_PP(0);
> int32 queue_size = PG_GETARG_INT64(1);
>
> Here it should use the _INT32 variant to match the current SQL definition;
> otherwise it leads to the error below.
>
> postgres=# select pg_background_launch('vacuum verbose foo');
> ERROR: queue size must be at least 64 bytes

Oops.

> 3.
> Comparing execute_sql_string() and exec_simple_query(), I could see the
> following main differences:
> a. Allocate a new memory context different from message context
> b. Start/commit control of transaction is outside execute_sql_string
> c. enable/disable of the statement timeout is done from outside in case of
> execute_sql_string()
> d. execute_sql_string() prohibits Start/Commit/Abort transaction statements.
> e. execute_sql_string() doesn't log statements
> f. execute_sql_string() uses binary format for results, whereas
> exec_simple_query() uses TEXT as the default format
> g. 'processed' stats-related info is handled by the caller in case of
> execute_sql_string().
>
> Can't we handle all these and other changes inside exec_simple_query(),
> based on some parameter, or maybe by using a hook similar to what we
> do in ProcessUtility?
>
> Basically it looks a bit odd to have a (mostly) duplicate copy of
> exec_simple_query().

It is. But I didn't think hacking up exec_simple_query() was a better
option. We could do that if most people like that approach, but to me
it seemed there were enough differences to make it unappealing.

> 4.
> Won't it be better if pg_background_worker_main() could look more
> like PostgresMain() (in terms of handling different kinds of messages),
> so that it can be extended in the future to handle parallel workers?

I don't think that a parallel worker will look like pg_background in
much more than broad outline. Some of the same constructs will get
reused, but overall I think it's a different problem that I'd rather
not conflate with this patch.

> 5.
> pg_background_result()
> {
> ..
> /* Read and processes messages from the shared memory queue. */
> }
>
> Shouldn't the processing of messages be in a separate function, as
> we do for pqParseInput3()?

I guess we could. It's not all that much code, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-11 19:27:15
Message-ID: 5411F793.4000707@2ndquadrant.com
Lists: pgsql-hackers

On 11/09/14 20:37, Robert Haas wrote:
>> 1. This patch generates a warning on Windows:
>> 1>pg_background.obj : error LNK2001: unresolved external symbol
>> StatementTimeout
>>
>> You need to add PGDLLIMPORT for StatementTimeout
>
> OK. I still think we should go back and PGDLLIMPORT-ize all the GUC variables.

+1

>> 4.
>> Won't it be better if pg_background_worker_main() could look more
>> like PostgresMain() (in terms of handling different kinds of messages),
>> so that it can be extended in the future to handle parallel workers?
>
> I don't think that a parallel worker will look like pg_background in
> much more than broad outline. Some of the same constructs will get
> reused, but overall I think it's a different problem that I'd rather
> not conflate with this patch.

Yeah, I agree here. While the patch provides a lot of necessary plumbing
that any kind of parallel processing needs, pg_background itself
does not seem to be the base for that.
Actually, when I first saw pg_background, I thought it looked like a
good base for the kind of background job scheduling mechanism that has
been requested a few times over the years (the actual scheduling is the
only part missing now IMHO, but that's a separate discussion).

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-11 19:36:50
Message-ID: 20140911193650.GF17294@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-09-11 21:27:15 +0200, Petr Jelinek wrote:
> On 11/09/14 20:37, Robert Haas wrote:
> >OK. I still think we should go back and PGDLLIMPORT-ize all the GUC variables.
>
> +1

Same here. I think Tom was the only one against it, but pretty much
everyone else was for it. We should fix all the GUCs declared as externs
in multiple .c files at the same time imo.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-20 07:03:49
Message-ID: CAA4eK1JfQHjiw1eh=eEH8c7b-o0_FChF5x7pWZv8mQMCmBXPxA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 12, 2014 at 12:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Sep 11, 2014 at 7:34 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
> > Don't we need some way to prohibit changing GUC by launching process,
> > once it has shared the existing GUC?
>
> Nope. I mean, eventually, for true parallelism ... we absolutely will
> need that. But pg_background itself doesn't need that; it's perfectly
> fine for configuration settings to get changed in the background
> worker. So it's a different piece of infrastructure from this patch
> set.

Okay, but as there is no predictability about which GUC value will be
used by the background worker (it can either be the same as what the
launching process had when it launched the background worker, or some
different value if it got changed later due to SIGHUP), it might be good
to clarify this in the pg_background docs (which are not there currently,
but I assume they will eventually be part of this patch).

>
> > 3.
> > Can't we handle all these or other changes inside exec_simple_query()
> > based on some parameter or may be a use a hook similar to what we
> > do in ProcessUtility?
> >
> > Basically it looks bit odd to have a duplicate (mostly) copy of
> > exec_simple_query().
>
> It is. But I didn't think hacking up exec_simple_query() was a better
> option. We could do that if most people like that approach, but to me
> it seemed there were enough differences to make it unappealing.

Okay, but I think in that case we need to carefully evaluate the
differences, else it might lead to behaviour differences in statement
execution. A few observations related to the differences follow:

1.
Keeping transaction control (start/commit) outside the function
execute_sql_string() could lead to the EndCommand() message being
sent before transaction end, which could be a problem if the
transaction commit fails for any reason. exec_simple_query() takes
care of this by calling finish_xact_command() before reporting
command-complete for the last parse tree. It even has a comment
indicating that we should follow this protocol.

2.
+static void
+execute_sql_string(const char *sql)
{
..
+	/* Be sure to advance the command counter after the last script command */
+	CommandCounterIncrement();
}

Won't CommandCounterIncrement() be required after every command, like
we do in exec_simple_query()?

3.
+static void
+execute_sql_string(const char *sql)
{
..
+	/*
+	 * Send a CommandComplete message even if we suppressed the query
+	 * results.  The user backend will report these in the absence of
+	 * any true query results.
+	 */
+	EndCommand(completionTag, DestRemote);
+
+	/* Clean up the portal. */
+	PortalDrop(portal, false);
..
}

What's the need to reverse the order of execution of EndCommand()
and PortalDrop()? Any error in PortalDrop() will lead to the wrong
message being sent to the client.

4.
What is the reason for not logging statements executed via
pg_background, even though it allows reporting the SQL statement
to various monitoring tools by setting debug_query_string?

5.
Isn't it better to add a comment explaining why execute_sql_string()
uses binary format for results?

> > 4.
> > Won't it be better if pg_background_worker_main() could look more
> > like PostgresMain() (in terms of handling different kinds of messages),
> > so that it can be extended in the future to handle parallel workers?
>
> I don't think that a parallel worker will look like pg_background in
> much more than broad outline. Some of the same constructs will get
> reused, but overall I think it's a different problem that I'd rather
> not conflate with this patch.

No issues.

> > 5.
> > pg_background_result()
> > {
> > ..
> > /* Read and processes messages from the shared memory queue. */
> > }
> >
> > Shouldn't the processing of messages be in a separate function, as
> > we do for pqParseInput3()?
>
> I guess we could. It's not all that much code, though.

Sure, please take a call based on what you feel is right here; I
mentioned it because I felt it might be a little bit easier for other
people to understand that code.

Some other comments are as follows:

1.
+pg_background_result(PG_FUNCTION_ARGS)
{
..
..
+	/*
+	 * Whether we succeed or fail, a future invocation of this function
+	 * may not try to read from the DSM once we've begun to do so.
+	 * Accordingly, make arrangements to clean things up at end of query.
+	 */
+	dsm_unkeep_mapping(info->seg);

There can be certain scenarios where the user might like to invoke this
again. Assume the user calls
pg_background_launch('select count(*) from t1') and this statement's
execution via the background worker is going to take a very long time
before it returns anything. After some time the user tries to retrieve
data via pg_background_result(), but the call doesn't come back as it is
waiting for results, so the user presses cancel; on trying again after
some time, he won't get any data. I think this behaviour is a bit annoying.

I am able to reproduce this by halting the background worker via a
debugger.

postgres=# select pg_background_launch('select count(*) from t1');
pg_background_launch
----------------------
656
(1 row)

postgres=# select * from pg_background_result(656) as (cnt int);
Cancel request sent
ERROR: canceling statement due to user request
postgres=# select * from pg_background_result(656) as (cnt int);
ERROR: PID 656 is not attached to this session

2.
To avoid making the user wait longer, the function pg_background_result()
could take an additional parameter where the user can specify whether
to WAIT in case results are not available.

3.
+			case 'E':
+			case 'N':
+				{
+					ErrorData	edata;
+
+					/* Parse ErrorResponse or NoticeResponse. */
+					pq_parse_errornotice(&msg, &edata);
+
+					/*
+					 * Limit the maximum error level to ERROR.  We don't want
+					 * a FATAL inside the background worker to kill the user
+					 * session.
+					 */
+					if (edata.elevel > ERROR)
+						edata.elevel = ERROR;

Why is a FATAL inside the background worker not propagated at the same
level by the launcher process?
If a PANIC in the background worker can kill other backends and restart
the server, then ideally a FATAL in the background worker should also be
treated at the same level by the client backend.

4.
+void
+pg_background_worker_main(Datum main_arg)
+{
..
+	responseq = shm_mq_attach(mq, seg, NULL);
+
+	/* Redirect protocol messages to responseq. */
+	pq_redirect_to_shm_mq(mq, responseq);

Any error ("unable to map dynamic shared memory segment") before
pq_redirect_to_shm_mq() will not reach the launcher. The launcher's
client will get "ERROR: lost connection to worker process with PID 4020".

I think it is very difficult for the user to know why such an error has
occurred and what he can do to proceed. I am not sure if there is any
sensible way to report such an error, but OTOH it seems we should
provide some information to the client regarding what has happened.

5.
Commands not supported in pg_background should produce a proper
message.

postgres=# select pg_background_launch('copy t1 from stdin');
pg_background_launch
----------------------
4672
(1 row)

postgres=# select * from pg_background_result(4672) as (result TEXT);
WARNING: unknown message type: G (6 bytes)
ERROR: there is no client connection
CONTEXT: COPY t1, line 1: ""

Something similar to what is returned for transaction statements
("transaction control statements are not allowed in pg_background")
would be better.

6.
+	/*
+	 * Tuples returned by any command other than the last are simply
+	 * discarded; but those returned by the last (or only) command are
+	 * redirected to the shared memory queue we're using for communication
+	 * with the launching backend.  If the launching backend is gone or has
+	 * detached us, these messages will just get dropped on the floor.
+	 */
+	--commands_remaining;
+	if (commands_remaining > 0)
+		receiver = CreateDestReceiver(DestNone);
+	else
+	{
+		receiver = CreateDestReceiver(DestRemote);
+		SetRemoteDestReceiverParams(receiver, portal);
+	}

If you have to discard the results of statements other than the last,
then why in the first place do you want to allow multiple statements?

Like in the case below, how will the user identify whether the result is
for the first statement or the second statement?

postgres=# select pg_background_launch('select count(*) from t1;select
count(*)
from t1');
pg_background_launch
----------------------
3996
(1 row)

postgres=# select * from pg_background_result(3996) as (result bigint);
result
--------
11
(1 row)

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-26 18:48:03
Message-ID: CA+TgmoYz-PkawVnLLYES5MszPWthu=g8ynTDmVHUECh4Cns7Bg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 20, 2014 at 3:03 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Okay, but as there is no predictability about which GUC value will be
> used by the background worker (it can either be the same as what the
> launching process had when it launched the background worker, or some
> different value if it got changed later due to SIGHUP), it might be good
> to clarify this in the pg_background docs (which are not there currently,
> but I assume they will eventually be part of this patch).

OK, I will mention that in the documentation when I write it. I
didn't sweat that too much originally because I wasn't sure how much
churn there was going to be in the user-visible API, but so far
everybody seems happy with that, so maybe it's time to go document it.
It's a pretty simple API but, as you say, there are a few details
worth mentioning. I still need some review of the earlier patches in
the series before this really gets urgent, though: so far no one has
commented on #1, #2, #4, or #5, and I'm not entirely sure whether my
revised version of #3 passed muster.

> Keeping transaction control (start/commit) outside the function
> execute_sql_string() could lead to the EndCommand() message being
> sent before transaction end, which could be a problem if the
> transaction commit fails for any reason. exec_simple_query() takes
> care of this by calling finish_xact_command() before reporting
> command-complete for the last parse tree. It even has a comment
> indicating that we should follow this protocol.

Fixed in the attached version.

> Won't CommandCounterIncrement() be required after every command, like
> we do in exec_simple_query()?

Fixed in the attached version.

> What's the need to reverse the order of execution of EndCommand()
> and PortalDrop()? Any error in PortalDrop() will lead to the wrong
> message being sent to the client.

Fixed in the attached version.

> What is the reason for not logging statements executed via
> pg_background, even though it allows reporting the SQL statement
> to various monitoring tools by setting debug_query_string?

I wasn't really sure whether core GUCs should affect the behavior of a
contrib module, and I wasn't excited about duplicating the code.

> Isn't it better to add a comment explaining why execute_sql_string()
> uses binary format for results?

Done in the attached version.

> Sure, please take a call based on what you feel is right here; I
> mentioned it because I felt it might be a little bit easier for other
> people to understand that code.

I played around with this a bit but it didn't seem like it worked out
to a win. There were a bunch of things that had to be passed down
into that function and it didn't seem like it was really reducing
complexity. What I think makes sense is to keep an eye on the
complexity of the handling for each individual message type and move
any handlers that get complex to their own functions.

> There can be certain scenarios where the user might like to invoke this
> again. Assume the user calls
> pg_background_launch('select count(*) from t1') and this statement's
> execution via the background worker is going to take a very long time
> before it returns anything. After some time the user tries to retrieve
> data via pg_background_result(), but the call doesn't come back as it is
> waiting for results, so the user presses cancel; on trying again after
> some time, he won't get any data. I think this behaviour is a bit annoying.

Yep. I don't have a better solution at the moment, but there may be one.

> To avoid making the user wait longer, the function pg_background_result()
> could take an additional parameter where the user can specify whether
> to WAIT in case results are not available.

That gets complicated. Until any results are available? Until all
results are available? What if we try to read from the queue to find
out if results are available, and the first message in the queue is
long enough that it wraps the queue, so that we have to block and wait
for the background worker to send more data before we can continue?

> Why is a FATAL inside the background worker not propagated at the same
> level by the launcher process?
> If a PANIC in the background worker can kill other backends and restart
> the server, then ideally a FATAL in the background worker should also be
> treated at the same level by the client backend.

It was initially like that, but then I realized it was silly. If the
background worker hits some error that forces its session to
terminate, there is no need to terminate the user's session too - and
in fact doing so is really annoying, as I rapidly found out while
experimenting with this. Generally a FATAL is something we do because
backend-local state is corrupted to a degree that makes it impractical
to continue, but the fact that that other backend is messed up does
not mean our backend is messed up too.

> Any error ("unable to map dynamic shared memory segment") before
> pq_redirect_to_shm_mq() will not reach the launcher. The launcher's
> client will get "ERROR: lost connection to worker process with PID 4020".
>
> I think it is very difficult for the user to know why such an error has
> occurred and what he can do to proceed. I am not sure if there is any
> sensible way to report such an error, but OTOH it seems we should
> provide some information to the client regarding what has happened.

I don't think this is really a fixable problem. There's no way to
communicate an error that happens before you establish communications.
The same problem exists for user connections, but it's not serious in
practice because it's rare. I expect the same to be true here.

> postgres=# select * from pg_background_result(4672) as (result TEXT);
> WARNING: unknown message type: G (6 bytes)
> ERROR: there is no client connection
> CONTEXT: COPY t1, line 1: ""
>
> Something similar to what is returned for transaction statements
> ("transaction control statements are not allowed in pg_background")
> would be better.

Fixed in the attached version.

> If you have to discard the results of statements other than the last,
> then why in the first place do you want to allow multiple statements?

You can run statements with side effects, or you can run multiple
utility commands.

> Like in the case below, how will the user identify whether the result is
> for the first statement or the second statement?

By reading the documentation that I will write.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0006-pg_background-Run-commands-in-a-background-worker-an.patch text/x-patch 31.1 KB

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-29 16:05:20
Message-ID: 20140929160520.GX16422@tamriel.snowman.net
Lists: pgsql-hackers

Robert,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> Attached is a contrib module that lets you launch arbitrary command in
> a background worker, and supporting infrastructure patches for core.

Very cool! Started looking into this while waiting on a few
CLOBBER_CACHE_ALWAYS runs to finish (ugh...).

Perhaps I'm just being a bit over the top, but all this per-character
work feels a bit ridiculous. When we're using MAXIMUM_ALIGNOF, I
suppose it's not so bad, but is there no hope to increase that and make
this whole process more efficient? Just a thought.

After reading through the code for 0001, I decided to actually take it
out for a spin; see attached. I then passed a few megabytes of data
through it and it seemed to work just fine.

In general, I'm quite excited about this capability and will be looking
over the later patches also. I also prefer the function-pointer based
approach which was taken up in later versions to the hook-based approach
in the initial patches, so glad to see things going in that direction.
Lastly, I will say that I feel it'd be good to support bi-directional
communication as I think it'll be needed eventually, but I'm not sure
that has to happen now.

Thanks!

Stephen

Attachment Content-Type Size
test-shm_mq_sendv.c text/x-csrc 4.8 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-09-29 17:38:37
Message-ID: CA+TgmoZkDxSyxa4u5=PpfPsk8Wu=UZK9KsOJAqppapryexsq1g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 29, 2014 at 12:05 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Perhaps I'm just being a bit over the top, but all this per-character
> work feels a bit ridiculous. When we're using MAXIMUM_ALIGNOF, I
> suppose it's not so bad, but is there no hope to increase that and make
> this whole process more efficient? Just a thought.

I'm not sure I understand what you're getting at here.

> After reading through the code for 0001, I decided to actually take it
> out for a spin; see attached. I then passed a few megabytes of data
> through it and it seemed to work just fine.

That's good.

> In general, I'm quite excited about this capability and will be looking
> over the later patches also. I also prefer the function-pointer based
> approach which was taken up in later versions to the hook-based approach
> in the initial patches, so glad to see things going in that direction.

OK.

> Lastly, I will say that I feel it'd be good to support bi-directional
> communication as I think it'll be needed eventually, but I'm not sure
> that has to happen now.

I agree we need bidirectional communication; I just don't agree that
the other direction should use the libpq format. The data going from
the worker to the process that launched it is stuff like errors and
tuples, for which we already have a wire format. The data going in
the other direction is going to be things like plan trees to be
executed, for which we don't. But if we can defer the issue, so much
the better. Things will become clearer as we get closer to being
done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-01 20:56:46
Message-ID: 20141001205646.GH28859@tamriel.snowman.net
Lists: pgsql-hackers

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Mon, Sep 29, 2014 at 12:05 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Perhaps I'm just being a bit over the top, but all this per-character
> > work feels a bit ridiculous. When we're using MAXIMUM_ALIGNOF, I
> > suppose it's not so bad, but is there no hope to increase that and make
> > this whole process more efficient? Just a thought.
>
> I'm not sure I understand what you're getting at here.

Was just thinking that we might be able to work out what needs to be
done without having to actually do it on a per-character basis. That
said, I'm not sure it's really worth the effort given that we're talking
about at most 8 bytes currently. I had originally been thinking that we
might increase the minimum size as it might make things more efficient,
but it's not clear that'd really end up being the case either and,
regardless, it's probably not worth worrying about at this point.

> > After reading through the code for 0001, I decided to actually take it
> > out for a spin; see attached. I then passed a few megabytes of data
> > through it and it seemed to work just fine.
>
> That's good.

Just to note this: the testing I did used a random number of segments
and random amounts of data with each, hit specific edge cases, and all
looked good.

> > Lastly, I will say that I feel it'd be good to support bi-directional
> > communication as I think it'll be needed eventually, but I'm not sure
> > that has to happen now.
>
> I agree we need bidirectional communication; I just don't agree that
> the other direction should use the libpq format. The data going from
> the worker to the process that launched it is stuff like errors and
> tuples, for which we already have a wire format. The data going in
> the other direction is going to be things like plan trees to be
> executed, for which we don't. But if we can defer the issue, so much
> the better. Things will become clearer as we get closer to being
> done.

Sounds good to me.

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-02 12:42:52
Message-ID: CA+TgmoZJ20yi+YMxUtQ-vq2FVa0UJH2X6SzJbL4-f2UgvFKfpA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 1, 2014 at 4:56 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
>> On Mon, Sep 29, 2014 at 12:05 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>> > Perhaps I'm just being a bit over the top, but all this per-character
>> > work feels a bit ridiculous. When we're using MAXIMUM_ALIGNOF, I
>> > suppose it's not so bad, but is there no hope to increase that and make
>> > this whole process more efficient? Just a thought.
>>
>> I'm not sure I understand what you're getting at here.
>
> Was just thinking that we might be able to work out what needs to be
> done without having to actually do it on a per-character basis. That
> said, I'm not sure it's really worth the effort given that we're talking
> about at most 8 bytes currently. I had originally been thinking that we
> might increase the minimum size as it might make things more efficient,
> but it's not clear that'd really end up being the case either and,
> regardless, it's probably not worth worrying about at this point.

I'm still not entirely sure we're on the same page. Most of the data
movement for shm_mq is done via memcpy(), which I think is about as
efficient as it gets. The detailed character-by-character handling
only really comes up when shm_mq_send() is passed multiple chunks with
lengths that are not a multiple of MAXIMUM_ALIGNOF. Then we have to
fiddle a bit more. So making MAXIMUM_ALIGNOF bigger would actually
cause us to do more fiddling, not less.

When I originally designed this queue, I had the idea in mind that
life would be simpler if the queue head and tail pointers always
advanced in multiples of MAXIMUM_ALIGNOF. That didn't really work out
as well as I'd hoped; maybe I would have been better off having the
queue pack everything in tightly and ignore alignment. However, there
is some possible efficiency advantage of the present system: when a
message fits in the queue without wrapping, shm_mq_receive() returns a
pointer to the message, and the caller can assume that message is
properly aligned. If the queue didn't maintain alignment internally,
you'd need to do a copy there before accessing anything inside the
message that might be aligned. Granted, we don't have any code that
takes advantage of that yet, at least not in core, but the potential
is there. Still, I may have made the wrong call. But I don't think
it's the job of this patch to rework the whole framework; we can do that
in the future after this has a few more users and the pros and cons of
various approaches are (hopefully) more clear. It's not a complex
API, so substituting a different implementation later on probably
wouldn't be too hard.

Anyway, it sounds like you're on board with me committing the first
patch of the series, so I'll do that next week absent objections.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-02 13:11:19
Message-ID: 20141002131118.GM28859@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Wed, Oct 1, 2014 at 4:56 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Was just thinking that we might be able to work out what needs to be
> > done without having to actually do it on a per-character basis. That
> > said, I'm not sure it's really worth the effort given that we're talking
> > about at most 8 bytes currently. I had originally been thinking that we
> > might increase the minimum size as it might make things more efficient,
> > but it's not clear that'd really end up being the case either and,
> > regardless, it's probably not worth worrying about at this point.
>
> I'm still not entirely sure we're on the same page. Most of the data
> movement for shm_mq is done via memcpy(), which I think is about as
> efficient as it gets.

Right- agreed. I had originally thought we were doing things on a
per-MAXIMUM_ALIGNOF basis somewhere else, but that appears to have been
an incorrect assumption (which I'm glad of).

> The detailed character-by-character handling
> only really comes up when shm_mq_send() is passed multiple chunks with
> lengths that are not a multiple of MAXIMUM_ALIGNOF. Then we have to
> fiddle a bit more. So making MAXIMUM_ALIGNOF bigger would actually
> cause us to do more fiddling, not less.

Sorry- those were two independent items. Regarding the per-character
work, I was thinking we could work out the number of characters which
need to be moved and then use memcpy directly rather than doing it per
character. But as noted, we shouldn't be going through that loop very
often or for very many iterations anyway, and since we have to deal
with moving forward through the iovs, we'd still need a loop there
regardless.

> When I originally designed this queue, I had the idea in mind that
> life would be simpler if the queue head and tail pointers always
> advanced in multiples of MAXIMUM_ALIGNOF. That didn't really work out
> as well as I'd hoped; maybe I would have been better off having the
> queue pack everything in tightly and ignore alignment. However, there
> is some possible efficiency advantage of the present system: when a
> message fits in the queue without wrapping, shm_mq_receive() returns a
> pointer to the message, and the caller can assume that message is
> properly aligned. If the queue didn't maintain alignment internally,
> you'd need to do a copy there before accessing anything inside the
> message that might be aligned. Granted, we don't have any code that
> takes advantage of that yet, at least not in core, but the potential
> is there. Still, I may have made the wrong call. But I don't think
> it's the job of this patch to rework the whole framework; we can do that
> in the future after this has a few more users and the pros and cons of
> various approaches are (hopefully) more clear. It's not a complex
> API, so substituting a different implementation later on probably
> wouldn't be too hard.

Agreed.

> Anyway, it sounds like you're on board with me committing the first
> patch of the series, so I'll do that next week absent objections.

Works for me.

Thanks!

Stephen


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-08 14:05:46
Message-ID: 20141008140546.GB5053@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-09-29 13:38:37 -0400, Robert Haas wrote:
> On Mon, Sep 29, 2014 at 12:05 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Lastly, I will say that I feel it'd be good to support bi-directional
> > communication as I think it'll be needed eventually, but I'm not sure
> > that has to happen now.
>
> I agree we need bidirectional communication; I just don't agree that
> the other direction should use the libpq format. The data going from
> the worker to the process that launched it is stuff like errors and
> tuples, for which we already have a wire format. The data going in
> the other direction is going to be things like plan trees to be
> executed, for which we don't. But if we can defer the issue, so much
> the better. Things will become clearer as we get closer to being
> done.

I think that might be true for your use case, but not for others. It's
perfectly conceivable that one might want to ship tuples to a couple of
bgworkers using the COPY protocol or such.

I don't think it needs to be fully implemented, but I think we should
design it in a way that adding it later is unlikely to require larger
changes to the code being added here.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-08 14:09:43
Message-ID: 20141008140943.GC5053@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-09-10 16:53:12 -0400, Robert Haas wrote:
> diff --git a/src/include/libpq/libpq.h b/src/include/libpq/libpq.h
> index 5da9d8d..0b8db42 100644
> --- a/src/include/libpq/libpq.h
> +++ b/src/include/libpq/libpq.h
> @@ -37,6 +37,31 @@ typedef struct
> } u;
> } PQArgBlock;
>
> +typedef struct
> +{
> + void (*comm_reset)(void);
> + int (*flush)(void);
> + int (*flush_if_writable)(void);
> + bool (*is_send_pending)(void);
> + int (*putmessage)(char msgtype, const char *s, size_t len);
> + void (*putmessage_noblock)(char msgtype, const char *s, size_t len);
> + void (*startcopyout)(void);
> + void (*endcopyout)(bool errorAbort);
> +} PQsendMethods;
> +
> +PQsendMethods *PqSendMethods;

WRT my complaint in the other subthread, I think PQsendMethods should
just be renamed to PQcommMethods or similar. That, in combination with
a /* at some point we might want to add methods for receiving data here
*/ comment, would already satisfy my complaint ;)

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-08 19:03:01
Message-ID: CA+TgmoaJY35OPKWAq5TMeQ3H17+bTEQdxJbk6mo7FhP4J33-uA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 8, 2014 at 10:09 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> On 2014-09-10 16:53:12 -0400, Robert Haas wrote:
>> diff --git a/src/include/libpq/libpq.h b/src/include/libpq/libpq.h
>> index 5da9d8d..0b8db42 100644
>> --- a/src/include/libpq/libpq.h
>> +++ b/src/include/libpq/libpq.h
>> @@ -37,6 +37,31 @@ typedef struct
>> } u;
>> } PQArgBlock;
>>
>> +typedef struct
>> +{
>> + void (*comm_reset)(void);
>> + int (*flush)(void);
>> + int (*flush_if_writable)(void);
>> + bool (*is_send_pending)(void);
>> + int (*putmessage)(char msgtype, const char *s, size_t len);
>> + void (*putmessage_noblock)(char msgtype, const char *s, size_t len);
>> + void (*startcopyout)(void);
>> + void (*endcopyout)(bool errorAbort);
>> +} PQsendMethods;
>> +
>> +PQsendMethods *PqSendMethods;
>
> WRT my complaint in the other subthread, I think PQsendMethods should
> just be renamed to PQcommMethods or similar. That, in combination with
> a /* at some point we might want to add methods for receiving data here
> */ comment, would already satisfy my complaint ;)

Well, I'm happy enough to go ahead and change that, if that's all it
takes to make you happy. Is it?

So far the status of these patches AIUI is:

#1 - Just committed, per discussion last week.
#2 - No comments, possibly because it's pretty boring.
#3 - Most people seem happy with v2, modulo the above renaming request.
#4 - No comments.
#5 - No comments; trivial.
#6 - Updated per Amit's feedback. Not sure if anyone else wants to
review. Still needs docs.

I'm tempted to go ahead and do the renaming you've requested here and
then commit #2, #3, #4, and #5. But I'll wait if people want to
review more first. #6 probably needs at least one more thorough
review, and of course the documentation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-08 20:16:21
Message-ID: 54359B95.1070800@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/10/14 21:03, Robert Haas wrote:
> On Wed, Oct 8, 2014 at 10:09 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>>
>> WRT my complaint in the other subthread, I think PQsendMethods should
>> just be renamed to PQcommMethods or similar. That, in combination with
>> a /* at some point we might want to add methods for receiving data here
>> */ comment, would already satisfy my complaint ;)
>
> Well, I'm happy enough to go ahead and change that, if that's all it
> takes to make you happy. Is it?
>
> So far the status of these patches AIUI is:
>
> #1 - Just committed, per discussion last week.
> #2 - No comments, possibly because it's pretty boring.

Yeah, there's not much to say there; it can just go in.

> #4 - No comments.

I think I said I like it, by which I meant that it's good as it is; no
objections to committing that one either.

> #6 - Updated per Amit's feedback. Not sure if anyone else wants to
> review. Still needs docs.
>

I'd like to see this one when it's updated, since I've lost track a
little of what the changes look like exactly.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-08 22:32:21
Message-ID: 20141008223221.GG5053@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> /*
> + * Arrange to remove a dynamic shared memory mapping at cleanup time.
> + *
> + * dsm_keep_mapping() can be used to preserve a mapping for the entire
> + * lifetime of a process; this function reverses that decision, making
> + * the segment owned by the current resource owner. This may be useful
> + * just before performing some operation that will invalidate the segment
> + * for future use by this backend.
> + */
> +void
> +dsm_unkeep_mapping(dsm_segment *seg)
> +{
> + Assert(seg->resowner == NULL);
> + ResourceOwnerEnlargeDSMs(CurrentResourceOwner);
> + seg->resowner = CurrentResourceOwner;
> + ResourceOwnerRememberDSM(seg->resowner, seg);
> +}

Hm, I dislike the name unkeep. I guess you want to be symmetric to
dsm_keep_mapping? dsm_manage_mapping(), dsm_ensure_mapping_cleanup(),
dsm_remember_mapping()?

> From c835a06f20792556d35a0eee4c2fa21f5f23e8a3 Mon Sep 17 00:00:00 2001
> From: Robert Haas <rhaas(at)postgresql(dot)org>
> Date: Fri, 11 Jul 2014 09:53:40 -0400
> Subject: [PATCH 6/6] pg_background: Run commands in a background worker, and
> get the results.
>
> The currently-active GUC values from the user session will be copied
> to the background worker. If the command returns a result set, you
> can retrieve the result set; if not, you can retrieve the command
> tags. If the command fails with an error, the same error will be
> thrown in the launching process when the results are retrieved.
> Warnings and other messages generated by the background worker, and
> notifications received by it, are also propagated to the foreground
> process.

I've got to ask: why is it helpful that we have this in contrib? I have
a good share of the blame to bear for that, but I think we need to stop
diluting contrib ever more with test programs. These have a place, but I
don't think it should be contrib.

> +/* Table-of-contents constants for our dynamic shared memory segment. */
> +#define PG_BACKGROUND_MAGIC 0x50674267
> +#define PG_BACKGROUND_KEY_FIXED_DATA 0
> +#define PG_BACKGROUND_KEY_SQL 1
> +#define PG_BACKGROUND_KEY_GUC 2
> +#define PG_BACKGROUND_KEY_QUEUE 3
> +#define PG_BACKGROUND_NKEYS 4
> +
> +/* Fixed-size data passed via our dynamic shared memory segment. */
> +typedef struct pg_background_fixed_data
> +{
> + Oid database_id;
> + Oid authenticated_user_id;
> + Oid current_user_id;
> + int sec_context;
> + char database[NAMEDATALEN];
> + char authenticated_user[NAMEDATALEN];
> +} pg_background_fixed_data;

Why not NameData?

> +/* Private state maintained by the launching backend for IPC. */
> +typedef struct pg_background_worker_info
> +{
> + pid_t pid;
> + dsm_segment *seg;
> + BackgroundWorkerHandle *handle;
> + shm_mq_handle *responseq;
> + bool consumed;
> +} pg_background_worker_info;

whitespace damage.

> +static HTAB *worker_hash;

Hm. So the lifetime of this hash's contents is managed via
on_dsm_detach(), do I understand that correctly?

> +PG_FUNCTION_INFO_V1(pg_background_launch);
> +PG_FUNCTION_INFO_V1(pg_background_result);
> +PG_FUNCTION_INFO_V1(pg_background_detach);

> +void pg_background_worker_main(Datum);
> +
> +/*
> + * Start a dynamic background worker to run a user-specified SQL command.
> + */
> +Datum
> +pg_background_launch(PG_FUNCTION_ARGS)
> +{
> + text *sql = PG_GETARG_TEXT_PP(0);
> + int32 queue_size = PG_GETARG_INT64(1);
> + int32 sql_len = VARSIZE_ANY_EXHDR(sql);
> + Size guc_len;
> + Size segsize;
> + dsm_segment *seg;
> + shm_toc_estimator e;
> + shm_toc *toc;
> + char *sqlp;

wrong indentation.

> + char *gucstate;
> + shm_mq *mq;
> + BackgroundWorker worker;
> + BackgroundWorkerHandle *worker_handle;
> + pg_background_fixed_data *fdata;
> + pid_t pid;
> + shm_mq_handle *responseq;
> + MemoryContext oldcontext;
> +
> + /* Ensure a valid queue size. */
> + if (queue_size < 0 || ((uint64) queue_size) < shm_mq_minimum_size)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("queue size must be at least %zu bytes",
> + shm_mq_minimum_size)));

Hm. So every user can do this once the extension is created, as the
functions will most likely be executable by PUBLIC. Is that a good idea?

> + /* Wait for the worker to start. */
> + switch (WaitForBackgroundWorkerStartup(worker_handle, &pid))
> + {
> + case BGWH_STARTED:
> + /* Success. */
> + break;
> + case BGWH_STOPPED:
> + pfree(worker_handle);
> + ereport(ERROR,
> + (errcode(ERRCODE_INSUFFICIENT_RESOURCES),
> + errmsg("could not start background process"),
> + errhint("More details may be available in the server log.")));
> + break;
> + case BGWH_POSTMASTER_DIED:

space vs. tab.

> +/*
> + * Retrieve the results of a background query previously launched in this
> + * session.
> + */
> +Datum
> +pg_background_result(PG_FUNCTION_ARGS)
> +{
> + int32 pid = PG_GETARG_INT32(0);
> + shm_mq_result res;
> + FuncCallContext *funcctx;
> + TupleDesc tupdesc;
> + StringInfoData msg;
> + pg_background_result_state *state;
> +
> + /* First-time setup. */
> + if (SRF_IS_FIRSTCALL())
> + {
> + MemoryContext oldcontext;
> + pg_background_worker_info *info;
> +
> + funcctx = SRF_FIRSTCALL_INIT();
> + oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
> +
> + /* See if we have a connection to the specified PID. */
> + if ((info = find_worker_info(pid)) == NULL)
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("PID %d is not attached to this session", pid)));
> +
> + /* Can't read results twice. */
> + if (info->consumed)
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("results for PID %d have already been consumed", pid)));
> + info->consumed = true;

trailing whitespace.

> + /*
> + * Whether we succeed or fail, a future invocation of this function
> + * may not try to read from the DSM once we've begun to do so.
> + * Accordingly, make arrangements to clean things up at end of query.
> + */
> + dsm_unkeep_mapping(info->seg);
> +
> + /* Set up tuple-descriptor based on colum definition list. */
> + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("function returning record called in context "
> + "that cannot accept type record"),
> + errhint("Try calling the function in the FROM clause "
> + "using a column definition list.")));

Hm, normally we don't add linebreaks inside error messages.

> + funcctx->tuple_desc = BlessTupleDesc(tupdesc);
> +
> + /* Cache state that will be needed on every call. */
> + state = palloc0(sizeof(pg_background_result_state));
> + state->info = info;
> + if (funcctx->tuple_desc->natts > 0)
> + {
> + int natts = funcctx->tuple_desc->natts;
> + int i;
> +
> + state->receive_functions = palloc(sizeof(FmgrInfo) * natts);
> + state->typioparams = palloc(sizeof(Oid) * natts);
> +
> + for (i = 0; i < natts; ++i)
> + {
> + Oid receive_function_id;
> +
> + getTypeBinaryInputInfo(funcctx->tuple_desc->attrs[i]->atttypid,
> + &receive_function_id,
> + &state->typioparams[i]);
> + fmgr_info(receive_function_id, &state->receive_functions[i]);
> + }
> + }

I'm unsure right now about the rules surrounding this, but shouldn't we
check that the user is allowed to execute these? And shouldn't we fall
back to non-binary functions if no binary ones are available?

> + /* Read and processes messages from the shared memory queue. */
> + for (;;)
> + {
> + char msgtype;
> + Size nbytes;
> + void *data;
> +
> + /* Get next message. */
> + res = shm_mq_receive(state->info->responseq, &nbytes, &data, false);
> + if (res != SHM_MQ_SUCCESS)
> + break;
> +
> + /*
> + * Message-parsing routines operate on a null-terminated StringInfo,
> + * so we must construct one.
> + */
> + resetStringInfo(&msg);
> + enlargeStringInfo(&msg, nbytes);
> + msg.len = nbytes;
> + memcpy(msg.data, data, nbytes);
> + msg.data[nbytes] = '\0';
> + msgtype = pq_getmsgbyte(&msg);
> +
> + /* Dispatch on message type. */
> + switch (msgtype)
> + {
> + case 'E':
> + case 'N':
> + {
> + ErrorData edata;
> +
> + /* Parse ErrorResponse or NoticeResponse. */
> + pq_parse_errornotice(&msg, &edata);
> +
> + /*
> + * Limit the maximum error level to ERROR. We don't want
> + * a FATAL inside the background worker to kill the user
> + * session.
> + */
> + if (edata.elevel > ERROR)
> + edata.elevel = ERROR;

Hm. But we should still report that it FATALed? Maybe add an error
context notice about it? Not nice, but I don't have an immediately
better idea. I think it would generally be rather helpful to add the
information that this wasn't originally an error triggered by this
process. The user might otherwise be confused when looking for the
origin of the error in the log.

> + case 'A':
> + {
> + /* Propagate NotifyResponse. */
> + pq_putmessage(msg.data[0], &msg.data[1], nbytes - 1);
> + break;

Hm. Are we sure to be in a situation where the client expects these?
And are we sure their encoding is correct? The other data goes through
input/output methods that check for that, but here we don't. And the
other side AFAICS could have done a SET client_encoding.

> + /* If no data rows, return the command tags instead. */
> + if (!state->has_row_description)
> + {
> + if (tupdesc->natts != 1 || tupdesc->attrs[0]->atttypid != TEXTOID)
> + ereport(ERROR,
> + (errcode(ERRCODE_DATATYPE_MISMATCH),
> + errmsg("remote query did not return a result set, but "
> + "result rowtype is not a single text column")));
> + if (state->command_tags != NIL)
> + {
> + char *tag = linitial(state->command_tags);
> + Datum value;
> + bool isnull;
> + HeapTuple result;
> +
> + state->command_tags = list_delete_first(state->command_tags);
> + value = PointerGetDatum(cstring_to_text(tag));
> + isnull = false;
> + result = heap_form_tuple(tupdesc, &value, &isnull);
> + SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(result));
> + }

trailing whitespace.

> +/*
> + * Parse a DataRow message and form a result tuple.
> + */
> +static HeapTuple
> +form_result_tuple(pg_background_result_state *state, TupleDesc tupdesc,
> + StringInfo msg)
> +{
> + /* Handle DataRow message. */
> + int16 natts = pq_getmsgint(msg, 2);
> + int16 i;
> + Datum *values = NULL;
> + bool *isnull = NULL;
> + StringInfoData buf;
> +
> + if (!state->has_row_description)
> + elog(ERROR, "DataRow not preceded by RowDescription");
> + if (natts != tupdesc->natts)
> + elog(ERROR, "malformed DataRow");
> + if (natts > 0)
> + {
> + values = palloc(natts * sizeof(Datum));
> + isnull = palloc(natts * sizeof(bool));
> + }
> + initStringInfo(&buf);
> +
> + for (i = 0; i < natts; ++i)
> + {
> + int32 bytes = pq_getmsgint(msg, 4);
> +
> + if (bytes < 0)
> + {
> + values[i] = ReceiveFunctionCall(&state->receive_functions[i],
> + NULL,
> + state->typioparams[i],
> + tupdesc->attrs[i]->atttypmod);
> + isnull[i] = true;

> + }
> + else
> + {
> + resetStringInfo(&buf);
> + appendBinaryStringInfo(&buf, pq_getmsgbytes(msg, bytes), bytes);
> + values[i] = ReceiveFunctionCall(&state->receive_functions[i],
> + &buf,
> + state->typioparams[i],
> + tupdesc->attrs[i]->atttypmod);
> + isnull[i] = false;
> + }
> + }

Hm. I think you didn't check that the typemods are the same above.

> +/*
> + * Detach from the dynamic shared memory segment used for communication with
> + * a background worker. This prevents the worker from stalling waiting for
> + * us to read its results.
> + */
> +Datum
> +pg_background_detach(PG_FUNCTION_ARGS)
> +{
> + int32 pid = PG_GETARG_INT32(0);
> + pg_background_worker_info *info;
> +
> + info = find_worker_info(pid);
> + if (info == NULL)
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("PID %d is not attached to this session", pid)));
> + dsm_detach(info->seg);
> +
> + PG_RETURN_VOID();
> +}

So there's really no limit on who is allowed to do stuff like this;
i.e., any SECURITY DEFINER function and such may do the same.

> +/*
> + * Background worker entrypoint.
> + */
> +void
> +pg_background_worker_main(Datum main_arg)
> +{
> + dsm_segment *seg;
> + shm_toc *toc;
> + pg_background_fixed_data *fdata;
> + char *sql;
> + char *gucstate;
> + shm_mq *mq;
> + shm_mq_handle *responseq;
> +
> + /* Establish signal handlers. */
> + pqsignal(SIGTERM, handle_sigterm);

Hm. No SIGINT?

> + /* Find data structures in dynamic shared memory. */
> + fdata = shm_toc_lookup(toc, PG_BACKGROUND_KEY_FIXED_DATA);
> + sql = shm_toc_lookup(toc, PG_BACKGROUND_KEY_SQL);
> + gucstate = shm_toc_lookup(toc, PG_BACKGROUND_KEY_GUC);
> + mq = shm_toc_lookup(toc, PG_BACKGROUND_KEY_QUEUE);
> + shm_mq_set_sender(mq, MyProc);
> + responseq = shm_mq_attach(mq, seg, NULL);

Don't these need to ensure that the values have actually been found?
shm_toc_lookup returns NULL for unknown items.

> + /* Redirect protocol messages to responseq. */
> + pq_redirect_to_shm_mq(mq, responseq);
> +
> + /*
> + * Initialize our user and database ID based on the strings version of
> + * the data, and then go back and check that we actually got the database
> + * and user ID that we intended to get. We do this because it's not
> + * impossible for the process that started us to die before we get here,
> + * and the user or database could be renamed in the meantime. We don't
> + * want to latch on the wrong object by accident. There should probably
> + * be a variant of BackgroundWorkerInitializeConnection that accepts OIDs
> + * rather than strings.
> + */
> + BackgroundWorkerInitializeConnection(fdata->database,
> + fdata->authenticated_user);
> + if (fdata->database_id != MyDatabaseId ||
> + fdata->authenticated_user_id != GetAuthenticatedUserId())
> + ereport(ERROR,
> + (errmsg("user or database renamed during pg_background startup")));
> +
> + /* Restore GUC values from launching backend. */
> + StartTransactionCommand();
> + RestoreGUCState(gucstate);
> + CommitTransactionCommand();

I haven't read the GUC save patch, but is it a) required to do this in
a transaction? We normally reload the config even without one. b)
correct to do? What about SET LOCAL variables?

> + /* Restore user ID and security context. */
> + SetUserIdAndSecContext(fdata->current_user_id, fdata->sec_context);
> +
> + /* Prepare to execute the query. */
> + SetCurrentStatementStartTimestamp();
> + debug_query_string = sql;
> + pgstat_report_activity(STATE_RUNNING, sql);
> + StartTransactionCommand();
> + if (StatementTimeout > 0)
> + enable_timeout_after(STATEMENT_TIMEOUT, StatementTimeout);
> + else
> + disable_timeout(STATEMENT_TIMEOUT, false);

I doubt that actually works correctly without a SIGINT handler, as the
statement timeout just falls back to kill(SIGINT)? Or does it, because
it falls back to just doing a proc_exit()? If so, is that actually safe?

> + /* Execute the query. */
> + execute_sql_string(sql);
> +
> + /* Post-execution cleanup. */
> + disable_timeout(STATEMENT_TIMEOUT, false);
> + CommitTransactionCommand();

So, we're allowed to do nearly arbitrary nastiness here...

> +/*
> + * Execute given SQL string.
> + *
> + * Using SPI here would preclude backgrounding commands like VACUUM which one
> + * might very well wish to launch in the background. So we do this instead.
> + */
> +static void
> +execute_sql_string(const char *sql)
> +{
> + List *raw_parsetree_list;
> + ListCell *lc1;
> + bool isTopLevel;
> + int commands_remaining;
> + MemoryContext parsecontext;
> + MemoryContext oldcontext;
> +
> + /*
> + * Parse the SQL string into a list of raw parse trees.
> + *
> + * Because we allow statements that perform internal transaction control,
> + * we can't do this in TopTransactionContext; the parse trees might get
> + * blown away before we're done executing them.
> + */
> + parsecontext = AllocSetContextCreate(TopMemoryContext,
> + "pg_background parse/plan",
> + ALLOCSET_DEFAULT_MINSIZE,
> + ALLOCSET_DEFAULT_INITSIZE,
> + ALLOCSET_DEFAULT_MAXSIZE);

Not that it hugely matters, but shouldn't this rather be
TopTransactionContext?

> + oldcontext = MemoryContextSwitchTo(parsecontext);
> + raw_parsetree_list = pg_parse_query(sql);
> + commands_remaining = list_length(raw_parsetree_list);
> + isTopLevel = commands_remaining == 1;
> + MemoryContextSwitchTo(oldcontext);
> +
> + /*
> + * Do parse analysis, rule rewrite, planning, and execution for each raw
> + * parsetree. We must fully execute each query before beginning parse
> + * analysis on the next one, since there may be interdependencies.
> + */
> + foreach(lc1, raw_parsetree_list)
> + {
> + Node *parsetree = (Node *) lfirst(lc1);
> + const char *commandTag;
> + char completionTag[COMPLETION_TAG_BUFSIZE];
> + List *querytree_list,
> + *plantree_list;

borked indentation.

> + bool snapshot_set = false;
> + Portal portal;
> + DestReceiver *receiver;
> + int16 format = 1;
> +
> + /*
> + * We don't allow transaction-control commands like COMMIT and ABORT
> + * here. The entire SQL statement is executed as a single transaction
> + * which commits if no errors are encountered.
> + */
> + if (IsA(parsetree, TransactionStmt))
> + ereport(ERROR,
> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("transaction control statements are not allowed in pg_background")));

Hm. I don't think that goes far enough. This allows commands that
internally stop/start transactions, like CREATE INDEX CONCURRENTLY. Are
you sure that's working right now?

> + /*
> + * Get the command name for use in status display (it also becomes the
> + * default completion tag, down inside PortalRun). Set ps_status and
> + * do any special start-of-SQL-command processing needed by the
> + * destination.
> + */
> + commandTag = CreateCommandTag(parsetree);
> + set_ps_display(commandTag, false);
> + BeginCommand(commandTag, DestNone);

indentation.

> + /* Set up a snapshot if parse analysis/planning will need one. */
> + if (analyze_requires_snapshot(parsetree))
> + {
> + PushActiveSnapshot(GetTransactionSnapshot());
> + snapshot_set = true;
> + }
> +
> + /*
> + * OK to analyze, rewrite, and plan this query.
> + *
> + * As with parsing, we need to make sure this data outlives the
> + * transaction, because of the possibility that the statement might
> + * perform internal transaction control.
> + */
> + oldcontext = MemoryContextSwitchTo(parsecontext);
> + querytree_list = pg_analyze_and_rewrite(parsetree, sql, NULL, 0);
> + plantree_list = pg_plan_queries(querytree_list, 0, NULL);
> +
> + /* Done with the snapshot used for parsing/planning */
> + if (snapshot_set)
> + PopActiveSnapshot();

> + /* If we got a cancel signal in analysis or planning, quit */
> + CHECK_FOR_INTERRUPTS();
> +
> + /*
> + * Execute the query using the unnamed portal.
> + */
> + portal = CreatePortal("", true, true);
> + /* Don't display the portal in pg_cursors */
> + portal->visible = false;

indentation.

> + PortalDefineQuery(portal, NULL, sql, commandTag, plantree_list, NULL);
> + PortalStart(portal, NULL, 0, InvalidSnapshot);
> + PortalSetResultFormat(portal, 1, &format); /* binary format */
> +
> + /*
> + * Tuples returned by any command other than the last are simply
> + * discarded; but those returned by the last (or only) command are
> + * redirected to the shared memory queue we're using for communication
> + * with the launching backend. If the launching backend is gone or has
> + * detached us, these messages will just get dropped on the floor.
> + */
> + --commands_remaining;
> + if (commands_remaining > 0)
> + receiver = CreateDestReceiver(DestNone);
> + else
> + {
> + receiver = CreateDestReceiver(DestRemote);
> + SetRemoteDestReceiverParams(receiver, portal);
> + }
> +
> + /*
> + * Only once the portal and destreceiver have been established can
> + * we return to the transaction context. All that stuff needs to
> + * survive an internal commit inside PortalRun!
> + */
> + MemoryContextSwitchTo(oldcontext);
> +
> + /* Here's where we actually execute the command. */
> + (void) PortalRun(portal, FETCH_ALL, isTopLevel, receiver, receiver,
> + completionTag);
> +
> + /* Clean up the receiver. */
> + (*receiver->rDestroy) (receiver);
> +
> + /*
> + * Send a CommandComplete message even if we suppressed the query
> + * results. The user backend will report these in the absence of
> + * any true query results.
> + */
> + EndCommand(completionTag, DestRemote);

Hm. This is a fair amount of code copied from postgres.c.

I think this is interesting work, but I doubt it's ready yet. I need to
read the preceding patches, to really understand where breakage lies
hidden.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-08 22:41:15
Message-ID: 20141008224115.GA7829@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-10-08 15:03:01 -0400, Robert Haas wrote:
> #1 - Just committed, per discussion last week.
> #2 - No comments, possibly because it's pretty boring.

fine with me.

> #3 - Most people seem happy with v2, modulo the above renaming request.

If you do it without the error handling stuff, I'm good with committing
it. I'm not yet convinced that rethrowing errors verbatim from
background workers without any indication that the process changed is a
good idea.

> #4 - No comments.

This needs review imo.

> #5 - No comments; trivial.

fine.

> #6 - Updated per Amit's feedback. Not sure if anyone else wants to
> review. Still needs docs.

This definitely needs more review.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-09 03:49:05
Message-ID: CAA4eK1KaCDBC3_yRLh521JYvLn6kGLWQbpc_jRERFLQUe5SnfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Oct 9, 2014 at 4:02 AM, Andres Freund <andres(at)2ndquadrant(dot)com>
wrote:
>
> > /*
> > + * Arrange to remove a dynamic shared memory mapping at cleanup time.
> > + *
> > + * dsm_keep_mapping() can be used to preserve a mapping for the entire
> > + * lifetime of a process; this function reverses that decision, making
> > + * the segment owned by the current resource owner. This may be useful
> > + * just before performing some operation that will invalidate the
segment
> > + * for future use by this backend.
> > + */
> > +void
> > +dsm_unkeep_mapping(dsm_segment *seg)
> > +{
> > + Assert(seg->resowner == NULL);
> > + ResourceOwnerEnlargeDSMs(CurrentResourceOwner);
> > + seg->resowner = CurrentResourceOwner;
> > + ResourceOwnerRememberDSM(seg->resowner, seg);
> > +}
>
> Hm, I dislike the name unkeep.

I also think function name is not appropriate as per functionality.

> I guess you want to be symmetric to
> dsm_keep_mapping? dsm_manage_mapping(), dsm_ensure_mapping_cleanup()
> dsm_remember_mapping()?

Another could be dsm_change_mapping(). Yet another idea could
be that we use single function (dsm_manage_mapping() with an
additional parameter to indicate the scope of segment) instead of
two different functions dsm_keep_mapping() and
dsm_unkeep_mapping().

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-14 17:24:34
Message-ID: CA+Tgmoaz88HjgSi0v-kQCM8nHCahYOzCH1At04q4Wb0s8szYYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 8, 2014 at 6:32 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> /*
>> + * Arrange to remove a dynamic shared memory mapping at cleanup time.
>> + *
>> + * dsm_keep_mapping() can be used to preserve a mapping for the entire
>> + * lifetime of a process; this function reverses that decision, making
>> + * the segment owned by the current resource owner. This may be useful
>> + * just before performing some operation that will invalidate the segment
>> + * for future use by this backend.
>> + */
>> +void
>> +dsm_unkeep_mapping(dsm_segment *seg)
>> +{
>> + Assert(seg->resowner == NULL);
>> + ResourceOwnerEnlargeDSMs(CurrentResourceOwner);
>> + seg->resowner = CurrentResourceOwner;
>> + ResourceOwnerRememberDSM(seg->resowner, seg);
>> +}
>
> Hm, I dislike the name unkeep. I guess you want to be symmetric to
> dsm_keep_mapping? dsm_manage_mapping(), dsm_ensure_mapping_cleanup()
> dsm_remember_mapping()?

Yes, I want to be symmetrical with dsm_keep_mapping, which is itself
symmetrical with dsm_keep_segment(). Your proposed names don't sound
good to me because it's not obvious that "manage" or "remember" is the
opposite of "keep". It's pretty easy to remember that "unkeep" is the
opposite of "keep" though.

If I had to pick one of those, I'd probably pick
dsm_ensure_mapping_cleanup(), but I don't like that much. It
describes what the function does fairly well, but it's totally unlike
the function it pairs with.

Another idea is to add another argument to the existing function
(possibly renaming it along the way). For example, we could have
dsm_keep_mapping(seg, true) mean keep it, and dsm_keep_mapping(seg,
false) meaning don't keep it. That would break existing callers, but
there probably aren't many of those. We could even - and much more
generally - replace it with dsm_set_resowner(seg, res), but that would
require #including some stuff into dsm.h that we might rather not
have there.
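
To make the two-argument idea concrete, here's a toy model of the
proposed semantics with stand-in types (purely a sketch of the interface
shape; the real dsm_segment and ResourceOwner machinery obviously live
in dsm.c and resowner.c, and these names are mine):

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the real PostgreSQL types; not the actual code. */
typedef struct ResourceOwner { int id; } ResourceOwner;
typedef struct dsm_segment  { ResourceOwner *resowner; } dsm_segment;

static ResourceOwner TopOwner = { 1 };
static ResourceOwner *CurrentResourceOwner = &TopOwner;

/*
 * keep = true: detach the mapping from any resource owner, so it survives
 * for the lifetime of the process.  keep = false: hand it back to the
 * current resource owner, so it is cleaned up at end of query.
 */
static void
dsm_keep_mapping_v2(dsm_segment *seg, int keep)
{
    seg->resowner = keep ? NULL : CurrentResourceOwner;
}
```

Whether that reads better than two symmetric functions is exactly the
judgement call in question.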

I'll respond to the rest of this later.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-14 20:56:13
Message-ID: 543D8DED.5060702@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 09/10/14 00:32, Andres Freund wrote:
>> From c835a06f20792556d35a0eee4c2fa21f5f23e8a3 Mon Sep 17 00:00:00 2001
>> From: Robert Haas <rhaas(at)postgresql(dot)org>
>> Date: Fri, 11 Jul 2014 09:53:40 -0400
>> Subject: [PATCH 6/6] pg_background: Run commands in a background worker, and
>> get the results.
>>
>> The currently-active GUC values from the user session will be copied
>> to the background worker. If the command returns a result set, you
>> can retrieve the result set; if not, you can retrieve the command
>> tags. If the command fails with an error, the same error will be
>> thrown in the launching process when the results are retrieved.
>> Warnings and other messages generated by the background worker, and
>> notifications received by it, are also propagated to the foreground
>> process.
>
> I got to ask: Why is it helpful that we have this in contrib? I have a
> good share of blame to bear for that, but I think we need to stop
> diluting contrib ever more with test programs. These have a place, but I
> don't think it should be contrib.
>

I don't see this as just a mere test module; it seems to provide actually
useful functionality (I still dream of it being extended with a scheduler
eventually, and even if it's not, you still get "asynchronous transactions").

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-22 23:03:28
Message-ID: CA+TgmoY9B0mL7Ci1fG1cZ4-zwDHOjjQaKyLowB3F-jsbuvVw2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 8, 2014 at 6:32 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I got to ask: Why is it helpful that we have this in contrib? I have a
> good share of blame to bear for that, but I think we need to stop
> diluting contrib ever more with test programs. These have a place, but I
> don't think it should be contrib.

I don't think pg_background is merely a test program: I think it's a
quite useful piece of functionality. It can be used for running
VACUUM from a procedure, which is something people have asked for more
than once, or for simulating an autonomous transaction. Granted,
it'll be a lot slower than a real autonomous transaction, but it's
still better than doing it via dblink, because you don't have to futz
with authentication.

I would be all in favor of moving things like test_decoding,
test_parser, and test_shm_mq to src/test or contrib/test or wherever
else we want to put them, but I think pg_background belongs in
contrib.

>> +/* Fixed-size data passed via our dynamic shared memory segment. */
>> +typedef struct pg_background_fixed_data
>> +{
>> + Oid database_id;
>> + Oid authenticated_user_id;
>> + Oid current_user_id;
>> + int sec_context;
>> + char database[NAMEDATALEN];
>> + char authenticated_user[NAMEDATALEN];
>> +} pg_background_fixed_data;
>
> Why not NameData?

No particular reason. Changed.

> whitespace damage.

I went through and fixed everything that git diff --check complained
about. Let me know if you still see other problems.

>> +static HTAB *worker_hash;
>
> Hm. So the lifetime of this hash's contents is managed via
> on_dsm_detach(), do I understand that correctly?

More or less, yes.

> Hm. So every user can do this once the extension is created as the
> functions are most likely to be PUBLIC. Is this a good idea?

Why not? If they can log in, they could start separate sessions with
similar effect.

>> + /*
>> + * Whether we succeed or fail, a future invocation of this function
>> + * may not try to read from the DSM once we've begun to do so.
>> + * Accordingly, make arrangements to clean things up at end of query.
>> + */
>> + dsm_unkeep_mapping(info->seg);
>> +
>> + /* Set up tuple-descriptor based on colum definition list. */
>> + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
>> + ereport(ERROR,
>> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>> + errmsg("function returning record called in context "
>> + "that cannot accept type record"),
>> + errhint("Try calling the function in the FROM clause "
>> + "using a column definition list.")));
>
> Hm, normally we don't add linebreaks inside error messages.

I copied it from dblink.

> I'm unsure right now about the rules surrounding this, but shouldn't we
> check that the user is allowed to execute these? And shouldn't we fall
> back to non-binary functions if no binary ones are available?

I can't see any reason to do either of those things. I'm not aware
that returning data in binary format is in any way intended to be a
security-restricted operation, or that we have any data types that
actually matter without send and receive functions. If we do, I think
the solution is to add them, not make this more complicated.

>> + /*
>> + * Limit the maximum error level to ERROR. We don't want
>> + * a FATAL inside the background worker to kill the user
>> + * session.
>> + */
>> + if (edata.elevel > ERROR)
>> + edata.elevel = ERROR;
>
> Hm. But we still should report that it FATALed? Maybe add error context
> notice about it? Not nice, but I don't have an immediately better idea. I
> think it generally would be rather helpful to add the information that
> this wasn't originally an error triggered by this process. The user
> might otherwise be confused when looking for the origin of the error in
> the log.

Yeah, I was wondering if we needed some kind of annotation here. What
I'm wondering about is appending something to the errcontext, perhaps
"background worker, PID %d".

>> + case 'A':
>> + {
>> + /* Propagate NotifyResponse. */
>> + pq_putmessage(msg.data[0], &msg.data[1], nbytes - 1);
>> + break;
>
> Hm. Are we sure to be in a situation where the client expects these? And
> are we sure their encoding is correct? The other data goes through
> input/output methods checking for that, but here we don't. And the other
> side AFAICS could have done a SET client_encoding.

I think there's no problem at the protocol level; I think the server
can send NotifyResponse pretty much whenever. It could be argued that
this is a POLA violation, but dropping the notify messages on the
floor (which seems to be the other option) doesn't seem superior.
So I think this is mostly a matter of documentation.

>> +/*
>> + * Parse a DataRow message and form a result tuple.
>> + */
>> +static HeapTuple
>> +form_result_tuple(pg_background_result_state *state, TupleDesc tupdesc,
>> + StringInfo msg)
>> +{
>> + /* Handle DataRow message. */
>> + int16 natts = pq_getmsgint(msg, 2);
>> + int16 i;
>> + Datum *values = NULL;
>> + bool *isnull = NULL;
>> + StringInfoData buf;
>> +
>> + if (!state->has_row_description)
>> + elog(ERROR, "DataRow not preceded by RowDescription");
>> + if (natts != tupdesc->natts)
>> + elog(ERROR, "malformed DataRow");
>> + if (natts > 0)
>> + {
>> + values = palloc(natts * sizeof(Datum));
>> + isnull = palloc(natts * sizeof(bool));
>> + }
>> + initStringInfo(&buf);
>> +
>> + for (i = 0; i < natts; ++i)
>> + {
>> + int32 bytes = pq_getmsgint(msg, 4);
>> +
>> + if (bytes < 0)
>> + {
>> + values[i] = ReceiveFunctionCall(&state->receive_functions[i],
>> + NULL,
>> + state->typioparams[i],
>> + tupdesc->attrs[i]->atttypmod);
>> + isnull[i] = true;
>
>> + }
>> + else
>> + {
>> + resetStringInfo(&buf);
>> + appendBinaryStringInfo(&buf, pq_getmsgbytes(msg, bytes), bytes);
>> + values[i] = ReceiveFunctionCall(&state->receive_functions[i],
>> + &buf,
>> + state->typioparams[i],
>> + tupdesc->attrs[i]->atttypmod);
>> + isnull[i] = false;
>> + }
>> + }
>
> Hm. I think you didn't check that the typemods are the same above.

The same as what?
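
For anyone skimming the thread: the DataRow body form_result_tuple() is
consuming is, per the v3 wire protocol, an int16 column count followed
by, for each column, an int32 value length (-1 meaning SQL NULL) and
that many bytes of data, all big-endian. A self-contained toy walker of
just the framing (illustrative only, not the patch's code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big-endian readers, since the wire format is network byte order. */
static uint16_t
read_u16(const unsigned char *p)
{
    return (uint16_t) ((p[0] << 8) | p[1]);
}

static int32_t
read_i32(const unsigned char *p)
{
    return (int32_t) (((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
                      ((uint32_t) p[2] << 8) | (uint32_t) p[3]);
}

/*
 * Walk one DataRow body, recording per-column NULLness.  Returns the
 * column count, or -1 if the message is malformed or truncated.
 */
static int
parse_datarow(const unsigned char *buf, size_t len, int *isnull, int maxcols)
{
    size_t off;
    int natts, i;

    if (len < 2)
        return -1;
    natts = read_u16(buf);
    off = 2;
    if (natts > maxcols)
        return -1;
    for (i = 0; i < natts; i++)
    {
        int32_t nbytes;

        if (off + 4 > len)
            return -1;
        nbytes = read_i32(buf + off);
        off += 4;
        if (nbytes < 0)
            isnull[i] = 1;          /* length -1 on the wire means NULL */
        else
        {
            isnull[i] = 0;
            if (off + (size_t) nbytes > len)
                return -1;
            off += (size_t) nbytes; /* value bytes would be decoded here */
        }
    }
    return natts;
}
```

In the real code the per-column bytes are then fed through each type's
receive function, which is where the typmod question arises.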

>> +Datum
>> +pg_background_detach(PG_FUNCTION_ARGS)
>> +{
>> + int32 pid = PG_GETARG_INT32(0);
>> + pg_background_worker_info *info;
>> +
>> + info = find_worker_info(pid);
>> + if (info == NULL)
>> + ereport(ERROR,
>> + (errcode(ERRCODE_UNDEFINED_OBJECT),
>> + errmsg("PID %d is not attached to this session", pid)));
>> + dsm_detach(info->seg);
>> +
>> + PG_RETURN_VOID();
>> +}
>
> So there 's really no limit of who is allowed to do stuff like
> this. I.e. any SECURITY DEFINER and such may do the same.

Do you think we need a restriction? It's not obvious to me that there
are any security-relevant consequences to this, but it's an important
question, and I might be missing something.

>> + /* Establish signal handlers. */
>> + pqsignal(SIGTERM, handle_sigterm);
>
> Hm. No SIGINT?

Nope; bgworker.c sets it to StatementCancelHandler, which should be
fine. Ideally I wouldn't have to do anything with SIGTERM either, but
bgworker.c sets it to bgworker_die(), which is pretty much complete
junk. It's not safe to just ereport() from within whatever the heck
the caller is doing. We should probably drop a small thermonuclear
weapon on bgworker_die(), but that's a problem for another patch.

>> + /* Find data structures in dynamic shared memory. */
>> + fdata = shm_toc_lookup(toc, PG_BACKGROUND_KEY_FIXED_DATA);
>> + sql = shm_toc_lookup(toc, PG_BACKGROUND_KEY_SQL);
>> + gucstate = shm_toc_lookup(toc, PG_BACKGROUND_KEY_GUC);
>> + mq = shm_toc_lookup(toc, PG_BACKGROUND_KEY_QUEUE);
>> + shm_mq_set_sender(mq, MyProc);
>> + responseq = shm_mq_attach(mq, seg, NULL);
>
> Don't these need to ensure that values have been found? shm_toc_lookup
> returns NULL for unknown items and such?

Meh. The launching process would have errored out if it hadn't been
able to set up the segment correctly. We could add some Assert()
statements if you really feel strongly about it, but it seems fairly
pointless to me. Any situation where those pointers come back NULL is
presumably going to be some sort of really stupid bug that will be
found even by trivial testing.

>> + /* Restore GUC values from launching backend. */
>> + StartTransactionCommand();
>> + RestoreGUCState(gucstate);
>> + CommitTransactionCommand();
>
> I haven't read the guc save patch, but is it a) required to this in a
> transaction? We normally reload the config even without. b) correct to
> do? What's with SET LOCAL variables?

(a) Yeah, it doesn't work without that. I forget what breaks, but if
you take those out, it will blow up.

(b) Do those need special handling for some reason?

> I doubt that actually works correctly without a SIGINT handler as
> statement timeout just falls back to kill(SIGINT)? Or does it, because
> it falls back to just doing a proc_exit()? If so, is that actually safe?

See above kvetching.

>> + /* Post-execution cleanup. */
>> + disable_timeout(STATEMENT_TIMEOUT, false);
>> + CommitTransactionCommand();
>
> So, we're allowed to do nearly arbitrary nastyness here...

Can you be more specific about the nature of your concern? This is no
different than finish_xact_command().

>> + /*
>> + * Parse the SQL string into a list of raw parse trees.
>> + *
>> + * Because we allow statements that perform internal transaction control,
>> + * we can't do this in TopTransactionContext; the parse trees might get
>> + * blown away before we're done executing them.
>> + */
>> + parsecontext = AllocSetContextCreate(TopMemoryContext,
>> + "pg_background parse/plan",
>> + ALLOCSET_DEFAULT_MINSIZE,
>> + ALLOCSET_DEFAULT_INITSIZE,
>> + ALLOCSET_DEFAULT_MAXSIZE);
>
> Not that it hugely matters, but shouldn't this rather be
> TopTransactionContext?

No, for the reasons explained in the comment that you quoted right, uh, there.

>> + bool snapshot_set = false;
>> + Portal portal;
>> + DestReceiver *receiver;
>> + int16 format = 1;
>> +
>> + /*
>> + * We don't allow transaction-control commands like COMMIT and ABORT
>> + * here. The entire SQL statement is executed as a single transaction
>> + * which commits if no errors are encountered.
>> + */
>> + if (IsA(parsetree, TransactionStmt))
>> + ereport(ERROR,
>> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>> + errmsg("transaction control statements are not allowed in pg_background")));
>
> Hm. I don't think that goes far enough. This allows commands that
> internally stop/start transactions like CREATE INDEX CONCURRENTLY. Are
> you sure that's working right now?

I tested VACUUM without naming a specific table, and that seems to work
correctly. I could try CIC, but I think the issues are the same. If
we only get one parse tree, then isTopLevel will be true and it's safe
for commands to do their own transaction control. If we get multiple
parse trees, then PreventTransactionChain will do its thing. This is
not novel territory.

> Hm. This is a fair amount of code copied from postgres.c.

Yes. I'm open to suggestions, but I don't immediately see a better way.

> I think this is interesting work, but I doubt it's ready yet. I need to
> read the preceding patches, to really understand where breakage lies
> hidden.

Breakage??? In my code??? Surely not. :-)

I'm reattaching all the uncommitted patches here. #3 and #6 have been
updated; #2 and #4 are unchanged.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0002-Extend-dsm-API-with-a-new-function-dsm_unkeep_mappin.patch text/x-patch 2.0 KB
0003-Support-frontend-backend-protocol-communication-usin-v3.patch text/x-patch 22.4 KB
0004-Add-infrastructure-to-save-and-restore-GUC-values.patch text/x-patch 12.2 KB
0006-pg_background-Run-commands-in-a-background-worker-an-v3.patch text/x-patch 32.9 KB

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-22 23:12:04
Message-ID: 544839C4.10801@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 10/22/2014 04:03 PM, Robert Haas wrote:
> On Wed, Oct 8, 2014 at 6:32 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> I got to ask: Why is it helpful that we have this in contrib? I have a
>> good share of blame to bear for that, but I think we need to stop
>> diluting contrib ever more with test programs. These have a place, but I
>> don't think it should be contrib.
>
> I don't think pg_background is merely a test program: I think it's a
> quite useful piece of functionality. It can be used for running
> VACUUM from a procedure, which is something people have asked for more
> than once, or for simulating an autonomous transaction. Granted,
> it'll be a lot slower than a real autonomous transaction, but it's
> still better than doing it via dblink, because you don't have to futz
> with authentication.

I think this is a very useful program, but I wonder if it makes more
sense to push it out to PGXN. Why do we need an ever-growing contrib?
Further, if it is good enough to go into contrib, why isn't it just in core?

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
not be surprised when they come back as Romans."


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-23 00:30:12
Message-ID: CA+TgmoaUnXWqTwNcF5eKzESgHLte5Hvp1sDZE-ELAeBqSaepCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Oct 22, 2014 at 7:12 PM, Joshua D. Drake <jd(at)commandprompt(dot)com> wrote:
>> I don't think pg_background is merely a test program: I think it's a
>> quite useful piece of functionality. It can be used for running
>> VACUUM from a procedure, which is something people have asked for more
>> than once, or for simulating an autonomous transaction. Granted,
>> it'll be a lot slower than a real autonomous transaction, but it's
>> still better than doing it via dblink, because you don't have to futz
>> with authentication.
>
> I think this is a very useful program but I wonder if it makes more sense to
> push it out to PGXN? Why do we need an ever-growing contrib? Further, if it
> is good enough to go into contrib, why isn't it just in core?

Sure, there's always this debate. Certainly, there's no reason it has
to be in contrib rather than core or PGXN. From my point of view,
putting it on PGXN would be a less-than-awesome development because
then the infrastructure that the patch introduces (patches 1-5 of the
series) would have no users in core or contrib, which means that (A)
if anyone is thinking of modifying that code in the future, they'll
have more difficulty knowing whether their changes break anything and
(B) if someone breaks something unwittingly, the buildfarm will not
tell us.

Now, admittedly, I expect that eventually all of this stuff will be
used in core - for parallelism. But in the meantime I think having at
least one example in the PostgreSQL source tree of how each piece of
extensibility infrastructure that we have can be used is handy and
useful and generally something that we ought to encourage. And this
one has the advantage - unlike some stuff in contrib - of actually
being somewhat useful for real work, which a decent amount of the
stuff in contrib isn't.

What about moving it in the opposite direction, from contrib all the way
into core? The main advantage of that in my book is that it might
make it easier to reduce the code duplication between pg_background
and exec_simple_query; the main disadvantage is that it reduces the
super-user's control, because now the functionality is always
available instead of being something that the super-user can choose to
install, or not.

But these are all judgement calls, of course.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-23 00:49:22
Message-ID: 54485092.204@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> What about moving it in the opposite direction, from contrib all the way
> into core? The main advantage of that in my book is that it might
> make it easier to reduce the code duplication between pg_background
> and exec_simple_query; the main disadvantage is that it reduces the
> super-user's control, because now the functionality is always
> available instead of being something that the super-user can choose to
> install, or not.
>
> But these are all judgement calls, of course.

I lean toward a push into core because:

1. "I expect that eventually all of this stuff will be
used in core - for parallelism."

2. Contrib is already bloated out. (which is a whole other discussion)

3. I am not sure I buy into the super-user argument. Just because the
functionality is there doesn't mean it has to be used.

4. "The main advantage of that in my book is that it might
make it easier to reduce the code duplication between pg_background
and exec_simple_query;"

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/ 503-667-4564
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
"If we send our children to Caesar for their education, we should
not be surprised when they come back as Romans."


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 17:13:09
Message-ID: 544A88A5.10206@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/22/14, 7:49 PM, Joshua D. Drake wrote:

> I lean toward a push into core because:

+1

> 3. I am not sure I buy into the super-user argument. Just because the functionality is there, doesn't mean it has to be used.

It's a valid concern, but I think the way to handle it if needed is to limit the number of connections a user can open. Or perhaps another option would be to change the permissions on the related functions (do we check ACLs for internal functions?)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 17:21:51
Message-ID: CA+TgmoYOihJj4TM=E6=w5B1U5Ew=tJ2wBMPLUsWHXEQoT1h6YA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 24, 2014 at 1:13 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> It's a valid concern, but I think the way to handle it if needed is to limit
> the number of connections a user can open. Or perhaps another option would
> be to change the permissions on the related functions (do we check ACLs for
> internal functions?)

I'm not sure dump-and-restore would preserve any properties of
anything in pg_catalog.

Anyway, I think we're getting a bit ahead of ourselves here. The
questions I need answers to right now are:

- What should we call dsm_unkeep_mapping, if not that?
- Are there remaining complaints about patch #3?
- How can I get somebody to review patch #4?
- Does anyone have a tangible suggestion for how to reduce the code
duplication in patch #6?

The question of where pg_background should ultimately live does
matter, but the answer will be "the -hackers mailing list archives"
unless we can get agreement on the prerequisite patches.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 19:23:12
Message-ID: 544AA720.9030305@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/24/14, 12:21 PM, Robert Haas wrote:
> On Fri, Oct 24, 2014 at 1:13 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>> It's a valid concern, but I think the way to handle it if needed is to limit
>> the number of connections a user can open. Or perhaps another option would
>> be to change the permissions on the related functions (do we check ACLs for
>> internal functions?)
>
> I'm not sure dump-and-restore would preserve any properties of
> anything in pg_catalog.
>
> Anyway, I think we're getting a bit ahead of ourselves here. The
> questions I need answers to right now are:
>
> - What should we call dsm_unkeep_mapping, if not that?
> - Are there remaining complaints about patch #3?
> - How can I get somebody to review patch #4?

Here's a review of patch 4.

Perhaps it would be good to document the serialization format. I at least would have found it helpful, especially when reading estimate_variable_size(). I can take a stab at that if you'd rather not be bothered.

estimate_variable_size():
+ valsize = 5; /* max(strlen('true'), strlen('false')) */
...
+ if (abs(*conf->variable) < 1000)
+ valsize = 3 + 1;

If we're worried about size, perhaps we should use 1/0 for bool instead of true/false?

On the serialization structure itself, should we be worried about a mismatch between available GUCs on the sender vs the receiver? Presumably if the sender outputs a GUC that the receiver doesn't know about we'll get an error, but what if the sender didn't include something? Maybe not an issue today, but could this cause problems down the road if we wanted to use the serialized data some other way? But maybe I'm just being paranoid. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 19:30:52
Message-ID: 544AA8EC.5030105@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/24/14, 2:23 PM, Jim Nasby wrote:
> On the serialization structure itself, should we be worried about a mismatch between available GUCs on the sender vs the receiver? Presumably if the sender outputs a GUC that the receiver doesn't know about we'll get an error, but what if the sender didn't include something? Maybe not an issue today, but could this cause problems down the road if we wanted to use the serialized data some other way? But maybe I'm just being paranoid. :)

I just realized there's a bigger problem there; this isn't portable against any changes to any of the binary elements.

So I guess it's really a question of would we ever want this to function (as-is) cross-version.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 20:25:24
Message-ID: CA+TgmoZVFHQMO2MxOLt8RphnJC6ido_iFhLv3M25exiA8tN5XQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 24, 2014 at 3:23 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> Here's a review of patch 4.

Thanks!

> Perhaps it would be good to document the serialization format. I at least
> would have found it helpful, especially when reading
> estimate_variable_size(). I can take a stab at that if you'd rather not be
> bothered.

I find the existing code pretty clear: to my eyes,
serialize_variable() really couldn't be much more transparent. But if
you want to suggest something I'm all ears.

> estimate_variable_size():
> + valsize = 5; /* max(strlen('true'),
> strlen('false')) */
> ...
> + if (abs(*conf->variable) < 1000)
> + valsize = 3 + 1;
>
> If we're worried about size, perhaps we should use 1/0 for bool instead of
> true/false?

I guess we could. I'm not particularly worried about size, myself.
We need all the rigamarole about estimating the amount of space needed
because we can't allocate more space once we begin serializing. If
the DSM where this data is getting stored turns out not to be big
enough, we can't resize it. So we need to *know* how big the data is
going to be, but I don't think we need to *care* very much.

> On the serialization structure itself, should we be worried about a mismatch
> between available GUCs on the sender vs the receiver? Presumably if the
> sender outputs a GUC that the receiver doesn't know about we'll get an
> error, but what if the sender didn't include something? Maybe not an issue
> today, but could this cause problems down the road if we wanted to use the
> serialized data some other way? But maybe I'm just being paranoid. :)

I think the principal danger here is that there could be some loadable
module which is present in one backend but not the other, or which (in
its inimitable wisdom) chose to register a different set of GUCs or
valid values for those GUCs in one backend than the other. That would
indeed be sad. Given that our rules for registering GUCs resemble
nothing so much as the wild west, I don't believe there's any ironclad
way to defend against it, either.

One thing we could do - not in this patch - is provide a mechanism
that would cause a background worker to load every loadable module
that's present in the launching worker. I'm a little reluctant to
spend time on that now. If pg_background doesn't work with certain
combinations of loadable modules and GUC settings... it's not a
catastrophe. It might be more of an issue for parallel query proper,
because expectations may be higher there, but if so it won't hurt to
have some experience seeing whether things go wrong with this simple
system, and in what manner. At least IMHO, it doesn't seem like a
good idea to put fiddling with the edges of this ahead of other
parallelism-related projects.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 20:26:55
Message-ID: CA+TgmoaC9xKcEaCYzoq7Zi6VAe1X8zuoyw_Jue1e63dXqSJ_XQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 24, 2014 at 3:30 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> On 10/24/14, 2:23 PM, Jim Nasby wrote:
>> On the serialization structure itself, should we be worried about a
>> mismatch between available GUCs on the sender vs the receiver? Presumably if
>> the sender outputs a GUC that the receiver doesn't know about we'll get an
>> error, but what if the sender didn't include something? Maybe not an issue
>> today, but could this cause problems down the road if we wanted to use the
>> serialized data some other way? But maybe I'm just being paranoid. :)
>
> I just realized there's a bigger problem there; this isn't portable against
> any changes to any of the binary elements.
>
> So I guess it's really a question of would we ever want this to function
> (as-is) cross-version.

I think that would be pretty hard to make work, but I don't mind if
someone else wants to try for some use case that they want to meet.
My goal is to make parallel query work, so the data will just be
getting transferred between two simultaneously-running children of the
same postmaster.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 20:46:47
Message-ID: 544ABAB7.2080408@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/24/14, 12:21 PM, Robert Haas wrote:
> - What should we call dsm_unkeep_mapping, if not that?

Only option I can think of beyond unkeep would be dsm_(un)register_keep_mapping. Dunno that it's worth it.

> - Does anyone have a tangible suggestion for how to reduce the code
> duplication in patch #6?
Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in exec_simple that's not safe for a bgworker? I'm not seeing why we can't use exec_simple. :/

BTW, it does occur to me that we could do something to combine AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo().

pg_background_result()
+ dsm_unkeep_mapping(info->seg);
+
+ /* Set up tuple-descriptor based on column definition list. */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
Is there a reason we can't check the result type before unkeeping? Seems silly to throw the results away just because someone flubbed a function call...

+ default:
+ elog(WARNING, "unknown message type: %c (%zu bytes)",
+ msg.data[0], nbytes);
It'd be useful to also provide DEBUG output with the message itself...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 20:53:47
Message-ID: 544ABC5B.80300@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/24/14, 3:26 PM, Robert Haas wrote:
> On Fri, Oct 24, 2014 at 3:30 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>> On 10/24/14, 2:23 PM, Jim Nasby wrote:
>>> On the serialization structure itself, should we be worried about a
>>> mismatch between available GUCs on the sender vs the receiver? Presumably if
>>> the sender outputs a GUC that the receiver doesn't know about we'll get an
>>> error, but what if the sender didn't include something? Maybe not an issue
>>> today, but could this cause problems down the road if we wanted to use the
>>> serialized data some other way? But maybe I'm just being paranoid. :)
>>
>> I just realized there's a bigger problem there; this isn't portable against
>> any changes to any of the binary elements.
>>
>> So I guess it's really a question of would we ever want this to function
>> (as-is) cross-version.
>
> I think that would be pretty hard to make work, but I don't mind if
> someone else wants to try for some use case that they want to meet.
> My goal is to make parallel query work, so the data will just be
> getting transferred between two simultaneously-running children of the
> same postmaster.

The only case I can think of would be actually connecting to a remote database; in that case would we even want something as raw as this? I suspect not, in which case I don't see an issue. On the other hand, if we ever think we might want to do that, we should probably at least stick a version number field in there...

But my suspicion is if we ever wanted to do something more with this then we'd want some kind of text-based format that could be passed into a SQL command (ie: SET ENVIRONMENT TO blah;)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 21:03:50
Message-ID: CA+TgmobDvZTaDfyvio8Us_qajapX0m9ykN-BeWLwWyL4hM_Nkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> On 10/24/14, 12:21 PM, Robert Haas wrote:
>> - What should we call dsm_unkeep_mapping, if not that?
>
> Only option I can think of beyond unkeep would be
> dsm_(un)register_keep_mapping. Dunno that it's worth it.

Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(),
since it's arranging to keep it by unregistering it from the resource
owner. And then we could call the new function
dsm_register_mapping(). That has the appeal that "unregister" is a
word, whereas "unkeep" isn't, but it's a little confusing otherwise,
because the sense is reversed vs. the current naming. Or we could
just leave dsm_keep_mapping() alone and call the new function
dsm_register_mapping(). A little non-orthogonal, but I think it'd be
OK.

>> - Does anyone have a tangible suggestion for how to reduce the code
>> duplication in patch #6?
>
> Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in
> exec_simple that's not safe for a bgworker? I'm not seeing why we can't use
> exec_simple. :/

There are some differences if you compare them closely.

> BTW, it does occur to me that we could do something to combine
> AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo().

Meh.

> pg_background_result()
> + dsm_unkeep_mapping(info->seg);
> +
> + /* Set up tuple-descriptor based on column definition list.
> */
> + if (get_call_result_type(fcinfo, NULL, &tupdesc) !=
> TYPEFUNC_COMPOSITE)
> + ereport(ERROR,
> Is there a reason we can't check the result type before unkeeping? Seems
> silly to throw the results away just because someone flubbed a function
> call...

Hmm, yeah, I see no reason why we couldn't move that up higher in the
function. It's probably a pretty common failure mode, too, so I can
see the convenience factor there.

> + default:
> + elog(WARNING, "unknown message type: %c (%zu
> bytes)",
> + msg.data[0], nbytes);
> It'd be useful to also provide DEBUG output with the message itself...

I think that's more work than is justified. We'd have to hexdump it
or something because it might not be valid in the client encoding, and
it's a case that shouldn't happen anyway.

(Hmm, that reminds me that I haven't thought of a solution to the
problem that the background worker might try to SET client_encoding,
which would be bad. Not sure what the best fix for that is, off-hand.
I'd rather not prohibit ALL GUC changes in the background worker, as
there's no good reason for such a draconian restriction.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 21:07:58
Message-ID: CA+TgmoaQ7TXBr0T5dB=88p8pMCPseH+DoSfW-3O5_vbaSxzVkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Oct 24, 2014 at 4:53 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> The only case I can think of would be actually connecting to a remote
> database; in that case would we even want something as raw as this? I
> suspect not, in which case I don't see an issue. On the other hand, if we
> ever think we might want to do that, we should probably at least stick a
> version number field in there...
>
> But my suspicion is if we ever wanted to do something more with this then
> we'd want some kind of text-based format that could be passed into a SQL
> command (ie: SET ENVIRONMENT TO blah;)

I mean, I don't think this is much different than what we're already
doing to transfer variables from the postmaster to other backends in
EXEC_BACKEND builds; see write_nondefault_variables(). It doesn't
have a version number or anything either. I don't see a reason why
this code needs to be held to a different standard; the purpose is
fundamentally quite similar.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 23:13:37
Message-ID: 544ADD21.2080204@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24/10/14 23:03, Robert Haas wrote:
> On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>> On 10/24/14, 12:21 PM, Robert Haas wrote:
>>> - What should we call dsm_unkeep_mapping, if not that?
>>
>> Only option I can think of beyond unkeep would be
>> dsm_(un)register_keep_mapping. Dunno that it's worth it.
>
> Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(),
> since it's arranging to keep it by unregistering it from the resource
> owner. And then we could call the new function
> dsm_register_mapping(). That has the appeal that "unregister" is a
> word, whereas "unkeep" isn't, but it's a little confusing otherwise,
> because the sense is reversed vs. the current naming. Or we could
> just leave dsm_keep_mapping() alone and call the new function
> dsm_register_mapping(). A little non-orthogonal, but I think it'd be
> OK.
>

I don't like that too much, but I don't have a better suggestion. If we
went with one of these, I would prefer taking the route of renaming
dsm_keep_mapping to dsm_unregister_mapping.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 23:17:03
Message-ID: 544ADDEF.9030403@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 10/24/14, 4:03 PM, Robert Haas wrote:
> On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>> On 10/24/14, 12:21 PM, Robert Haas wrote:
>>> - What should we call dsm_unkeep_mapping, if not that?
>>
>> Only option I can think of beyond unkeep would be
>> dsm_(un)register_keep_mapping. Dunno that it's worth it.
>
> Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(),
> since it's arranging to keep it by unregistering it from the resource
> owner. And then we could call the new function
> dsm_register_mapping(). That has the appeal that "unregister" is a
> word, whereas "unkeep" isn't, but it's a little confusing otherwise,
> because the sense is reversed vs. the current naming. Or we could
> just leave dsm_keep_mapping() alone and call the new function
> dsm_register_mapping(). A little non-orthogonal, but I think it'd be
> OK.

I think it's actually important to keep the names matching, even if it means "unkeep".

dsm_unregister_mapping seems like the opposite of what you'd want to do to have the mapping be remembered. I get that it works that way because of how it's implemented, but that seems like a lot of leakage of implementation into API...

FWIW, I don't see "unkeep" as being that horrible.

>>> - Does anyone have a tangible suggestion for how to reduce the code
>>> duplication in patch #6?
>>
>> Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in
>> exec_simple that's not safe for a bgworker? I'm not seeing why we can't use
>> exec_simple. :/
>
> There are some differences if you compare them closely.

Without doing a deep dive, I'm guessing that most of the extra stuff would be safe to re-use; it just wouldn't affect execute_sql_string. Obviously we could add a boolean to exec_simple_query for the case when it's being used by a bgworker. Though, it seems like the biggest differences have to do with logging.

Here's the differences I see:

- Disallowing transaction commands
- Logging
- What memory context is used (could we just use that differently in a pg_background backend?)
- Portal output format
- What to do with the output of intermediate commands (surely there's other places we need to handle that? plpgsql maybe?)

I think all of those except logging could be handled by a function serving both exec_simple_query and execute_sql_command that accepts a few booleans (or maybe just one to indicate the caller) and some if's. At least I don't think it'd be too bad, without actually writing it.

I'm not sure what to do about logging. If the original backend has logging turned on, is it that horrible to do the same logging as exec_simple_query would do?

Another option would be factoring out parts of exec_simple_query; the for loop over the parse tree might be a good candidate. But I suspect that would be uglier than a separate support function.

I do feel uncomfortable with the amount of duplication there is right now though...

>> BTW, it does occur to me that we could do something to combine
>> AllocSetContextCreate() followed by oldcontext=MemoryContextSwitchTo().
>
> Meh.

Well, it does seem to be a fairly common pattern throughout the codebase. And you were pretty vague when you asked for ideas to reduce duplication. ;P

>> + default:
>> + elog(WARNING, "unknown message type: %c (%zu
>> bytes)",
>> + msg.data[0], nbytes);
>> It'd be useful to also provide DEBUG output with the message itself...
>
> I think that's more work than is justified. We'd have to hexdump it
> or something because it might not be valid in the client encoding, and
> it's a case that shouldn't happen anyway.

I'm worried that a user wouldn't have any good way to see what the unexpected data is... but I guess if a user ever saw this we've got bigger problems anyway...

> (Hmm, that reminds me that I haven't thought of a solution to the
> problem that the background worker might try to SET client_encoding,
> which would be bad. Not sure what the best fix for that is, off-hand.
> I'd rather not prohibit ALL GUC changes in the background worker, as
> there's no good reason for such a draconian restriction.)

Perhaps a new GucContext for background workers? Or maybe a new GucSource, though I'm not sure if that'd work and it seems pretty wrong anyway.

BTW, perhaps the amount of discussion on internal parts is an indication this shouldn't be in contrib. I think it's a damn strong indication it shouldn't be in PGXN. :)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-24 23:20:42
Message-ID: 544ADECA.5090507@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24/10/14 23:07, Robert Haas wrote:
> On Fri, Oct 24, 2014 at 4:53 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
>> The only case I can think of would be actually connecting to a remote
>> database; in that case would we even want something as raw as this? I
>> suspect not, in which case I don't see an issue. On the other hand, if we
>> ever think we might want to do that, we should probably at least stick a
>> version number field in there...
>>
>> But my suspicion is if we ever wanted to do something more with this then
>> we'd want some kind of text-based format that could be passed into a SQL
>> command (ie: SET ENVIRONMENT TO blah;)
>
> I mean, I don't think this is much different than what we're already
> doing to transfer variables from the postmaster to other backends in
> EXEC_BACKEND builds; see write_nondefault_variables(). It doesn't
> have a version number or anything either. I don't see a reason why
> this code needs to be held to a different standard; the purpose is
> fundamentally quite similar.
>

I really do think that the GUC patch is OK as it is; we might want to do
1/0 for bools, but I don't really see that as an important thing. And it
works for the use-cases it's intended for. I don't see why this should
be required to work cross-version, really.
Loading of modules by the bgworker is probably a bigger issue than just
the GUCs, and I think it's out of scope for the current patch(set).

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-25 11:01:21
Message-ID: 20141025110121.GO1791@alvin.alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas wrote:
> On Fri, Oct 24, 2014 at 4:46 PM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:
> > On 10/24/14, 12:21 PM, Robert Haas wrote:
> >> - What should we call dsm_unkeep_mapping, if not that?
> >
> > Only option I can think of beyond unkeep would be
> > dsm_(un)register_keep_mapping. Dunno that it's worth it.
>
> Hmm, we could rename dsm_keep_mapping() to dsm_unregister_mapping(),
> since it's arranging to keep it by unregistering it from the resource
> owner. And then we could call the new function
> dsm_register_mapping(). That has the appeal that "unregister" is a
> word, whereas "unkeep" isn't, but it's a little confusing otherwise,
> because the sense is reversed vs. the current naming. Or we could
> just leave dsm_keep_mapping() alone and call the new function
> dsm_register_mapping(). A little non-orthogonal, but I think it'd be
> OK.

I do think that dsm_keep_mapping is a strange name for what it does. If
it were a single function maybe it'd be okay because you read the
comments and know what it does, but if we have to invent the "unkeep"
stuff then it gets weird enough that a better solution seems worthwhile
to me. So +1 for renaming it to something else. I like the "pin"
analogy we use for buffers; so if you pin a mapping (dsm_pin_mapping)
then it doesn't go away automatically with the resowner, and then you
unpin it (dsm_unpin_mapping) to make the system collect it.

(Not sure what this means for the per-segment equivalent function. It
might be worth keeping both namings consistent.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-29 19:00:36
Message-ID: CA+TgmobE7yyxwW2Ux8F00M7AySjTc0F5pV06ztUG67S7vq-PVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Oct 25, 2014 at 7:01 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> I do think that dsm_keep_mapping is a strange name for what it does.

OK, so let me see if I can summarize the votes so far on this (highly
important?) naming issue:

- Andres doesn't like "unkeep". He suggests dsm_manage_mapping(),
dsm_ensure_mapping_cleanup(), and dsm_remember_mapping() as possible
alternatives.
- Amit also doesn't like "unkeep". He suggests dsm_change_mapping().
Alternatively, he suggests having a function called
dsm_manage_mapping() with an additional boolean parameter to indicate
whether we are keeping or unkeeping.
- Jim, without taking a position on whether the current name is
problematic, suggested dsm_(un)register_keep_mapping.
- I am unbothered by the name "unkeep". But I suggested renaming
"dsm_keep_mapping" to "dsm_unregister_mapping" and adding
"dsm_register_mapping" as an alternative.
- Petr liked that proposal better than the others, although it wasn't
clear that he was unhappy with my original proposal.
- Alvaro proposes dsm_pin_mapping/dsm_unpin_mapping.
- Nobody has commented on any proposal made subsequent to the one they
made themselves.

After reviewing all of those possibilities with the sort of laser-like
focus the situation demands, I'm inclined to endorse Alvaro's proposal
to rename the existing dsm_keep_mapping() function to
dsm_pin_mapping() and the existing dsm_keep_segment() function to
dsm_pin_segment(). Then, I will add the new function as
dsm_unpin_mapping(). So:

1. Does anyone strongly object to that course of action?

2. Does anyone wish to argue for or against back-patching the name
changes to 9.4? My feeling is that we may as well, because either
nobody's using this yet, in which case back-patching it won't break
anything, or somebody is, in which case we'll cause less pain by
breaking it before release than a year on. But I don't care that much
either way, so I'll defer if others disagree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-29 19:07:06
Message-ID: 20141029190706.GE17790@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-10-29 15:00:36 -0400, Robert Haas wrote:
> 1. Does anyone strongly object to that course of action?

I don't.

> 2. Does anyone wish to argue for or against back-patching the name
> changes to 9.4?

+1.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-29 19:28:39
Message-ID: 20141029192839.GG17724@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-10-22 19:03:28 -0400, Robert Haas wrote:
> On Wed, Oct 8, 2014 at 6:32 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > I got to ask: Why is it helpful that we have this in contrib? I have a
> > good share of blame to bear for that, but I think we need to stop
> > diluting contrib evermore with test programs. These have a place, but I
> > don't think it should be contrib.
>
> I don't think pg_background is merely a test program: I think it's a
> quite useful piece of functionality. It can be used for running
> VACUUM from a procedure, which is something people have asked for more
> than once, or for simulating an autonomous transaction. Granted,
> it'll be a lot slower than a real autonomous transaction, but it's
> still better than doing it via dblink, because you don't have to futz
> with authentication.

Fair enough.

> > Hm. So every user can do this once the extension is created as the
> > functions are most likely to be PUBLIC. Is this a good idea?
>
> Why not? If they can log in, they could start separate sessions with
> similar effect.

Connection limits and revoked connection permissions are possible
reasons.

> > I'm unsure right now about the rules surrounding this, but shouldn't we
> > check that the user is allowed to execute these? And shouldn't we fall
> > back to non binary functions if no binary ones are available?
>
> I can't see any reason to do either of those things. I'm not aware
> that returning data in binary format is in any way intended to be a
> security-restricted operation, or that we have any data types that
> actually matter without send and receive functions. If we do, I think
> the solution is to add them, not make this more complicated.

We do have a couple of types that don't have binary send/recv
functions. And there are many more out there. I don't think we can make
that a hard prerequisite.

> >> + /*
> >> + * Limit the maximum error level to ERROR. We don't want
> >> + * a FATAL inside the background worker to kill the user
> >> + * session.
> >> + */
> >> + if (edata.elevel > ERROR)
> >> + edata.elevel = ERROR;
> >
> > Hm. But we still should report that it FATALed? Maybe add error context
> > notice about it? Not nice, but I don't have an immediately better idea. I
> > think it generally would be rather helpful to add the information that
> > this wasn't originally an error triggered by this process. The user
> > might otherwise be confused when looking for the origin of the error in
> > the log.
>
> Yeah, I was wondering if we needed some kind of annotation here. What
> I'm wondering about is appending something to the errcontext, perhaps
> "background worker, PID %d".

Can't we just add another error context? That seems cleaner to me. It
should be sufficient to push something onto the relevant stack.
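
To make the combined idea concrete: clamp the level, then annotate before rethrowing. Here is a toy Python model of that logic (ErrorData, the level constants, and prepare_for_rethrow are illustrative stand-ins, not the actual C structures or names in the patch):

```python
from dataclasses import dataclass, field

# Relative ordering is what matters; real values live in elog.h.
ERROR, FATAL, PANIC = 20, 21, 22

@dataclass
class ErrorData:
    elevel: int
    message: str
    context: list = field(default_factory=list)

def prepare_for_rethrow(edata, worker_pid):
    # Clamp anything above ERROR so a FATAL inside the background
    # worker cannot take down the user session reading the result.
    if edata.elevel > ERROR:
        edata.elevel = ERROR
    # Annotate so the log makes clear the error originated elsewhere.
    edata.context.append("background worker, PID %d" % worker_pid)
    return edata

e = prepare_for_rethrow(ErrorData(FATAL, "out of memory"), 64311)
assert e.elevel == ERROR
assert e.context == ["background worker, PID 64311"]
```

The context line mirrors what the user would otherwise have to dig out of the server log to locate the error's true origin.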

> >> + case 'A':
> >> + {
> >> + /* Propagate NotifyResponse. */
> >> + pq_putmessage(msg.data[0], &msg.data[1], nbytes - 1);
> >> + break;
> >
> > Hm. Are we sure to be in a situation where the client expects these? And
> > are we sure their encoding is correct? The other data goes through
> > input/output methods checking for that, but here we don't. And the other
> > side AFAICS could have done a SET client_encoding.
>
> I think there's no problem at the protocol level; I think the server
> can send NotifyResponse pretty much whenever.

I think the server not having done so for pretty much forever will still
confuse clients.

> It could be argued that this is a POLA violation, but dropping the
> notify messages on the floor (which seems to be the other option)
> doesn't seem superior. So I think this is mostly a matter of
> documentation.

Maybe.

> >> +Datum
> >> +pg_background_detach(PG_FUNCTION_ARGS)
> >> +{
> >> + int32 pid = PG_GETARG_INT32(0);
> >> + pg_background_worker_info *info;
> >> +
> >> + info = find_worker_info(pid);
> >> + if (info == NULL)
> >> + ereport(ERROR,
> >> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> >> + errmsg("PID %d is not attached to this session", pid)));
> >> + dsm_detach(info->seg);
> >> +
> >> + PG_RETURN_VOID();
> >> +}
> >
> > So there's really no limit on who is allowed to do stuff like
> > this. I.e. any SECURITY DEFINER and such may do the same.
>
> Do you think we need a restriction? It's not obvious to me that there
> are any security-relevant consequences to this, but it's an important
> question, and I might be missing something.

Well, a security definer function might have started the worker. And
then the plain user kills it. Or the other way round.

> >> + /* Establish signal handlers. */
> >> + pqsignal(SIGTERM, handle_sigterm);
> >
> > Hm. No SIGINT?
>
> Nope; bgworker.c sets it to StatementCancelHandler, which should be
> fine.

Ah, right.

> >> + /* Find data structures in dynamic shared memory. */
> >> + fdata = shm_toc_lookup(toc, PG_BACKGROUND_KEY_FIXED_DATA);
> >> + sql = shm_toc_lookup(toc, PG_BACKGROUND_KEY_SQL);
> >> + gucstate = shm_toc_lookup(toc, PG_BACKGROUND_KEY_GUC);
> >> + mq = shm_toc_lookup(toc, PG_BACKGROUND_KEY_QUEUE);
> >> + shm_mq_set_sender(mq, MyProc);
> >> + responseq = shm_mq_attach(mq, seg, NULL);
> >
> > Don't these need to ensure that values have been found? shm_toc_lookup
> > returns NULL for unknown items and such, no?
>
> Meh. The launching process would have errored out if it hadn't been
> able to set up the segment correctly. We could add some Assert()
> statements if you really feel strongly about it, but it seems fairly
> pointless to me. Any situation where those pointers come back NULL is
> presumably going to be some sort of really stupid bug that will be
> found even by trivial testing.

If we write code like this we really should have made it an error to
look something up that doesn't exist in the toc. But whatever.
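
The API shape Andres is asking for amounts to erroring out by default, with an explicit opt-out for genuinely optional entries. A Python sketch of that alternative (SharedTOC, the key numbers, and the no_error flag are all made up for illustration; they are not the patch's API):

```python
class SharedTOC:
    """Toy model of a shm_toc: fixed integer keys mapped to addresses."""
    def __init__(self, entries):
        self._entries = dict(entries)

    def lookup(self, key, no_error=False):
        # Raising by default spares every caller a NULL check;
        # pass no_error=True only for entries that may be absent.
        addr = self._entries.get(key)
        if addr is None and not no_error:
            raise LookupError("key %d not found in shared memory TOC" % key)
        return addr

toc = SharedTOC({0: "fixed", 1: "sql", 2: "guc", 3: "queue"})
assert toc.lookup(1) == "sql"
assert toc.lookup(7, no_error=True) is None
try:
    toc.lookup(7)  # mandatory entry missing: hard error
except LookupError as err:
    assert "not found" in str(err)
```

This keeps Robert's "optional entry" case possible while making the common mandatory-entry case fail loudly instead of returning NULL.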

> >> + /* Restore GUC values from launching backend. */
> >> + StartTransactionCommand();
> >> + RestoreGUCState(gucstate);
> >> + CommitTransactionCommand();
> >
> > I haven't read the guc save patch, but is it a) required to do this in a
> > transaction? We normally reload the config even without. b) correct to
> > do? What's with SET LOCAL variables?
>
> (a) Yeah, it doesn't work without that. I forget what breaks, but if
> you take those out, it will blow up.

That seems bad. Because we actually parse gucs without a surrounding
transaction in some cases. But maybe it's just NOT_IN_FILE (or however
they're called) variables that are problematic?

Btw, how are we dealing with connection gucs?

> (b) Do those need special handling for some reason?

Well, the CommitTransactionCommand() will reset them (via AtEOXact_GUC),
no? Or am I missing something?

> >> + /* Post-execution cleanup. */
> >> + disable_timeout(STATEMENT_TIMEOUT, false);
> >> + CommitTransactionCommand();
> >
> > So, we're allowed to do nearly arbitrary nastyness here...
>
> Can you be more specific about the nature of your concern? This is no
> different than finish_xact_command().

What I was thinking about was that the background process can do stuff
that writes.

> >> + /*
> >> + * Parse the SQL string into a list of raw parse trees.
> >> + *
> >> + * Because we allow statements that perform internal transaction control,
> >> + * we can't do this in TopTransactionContext; the parse trees might get
> >> + * blown away before we're done executing them.
> >> + */
> >> + parsecontext = AllocSetContextCreate(TopMemoryContext,
> >> + "pg_background parse/plan",
> >> + ALLOCSET_DEFAULT_MINSIZE,
> >> + ALLOCSET_DEFAULT_INITSIZE,
> >> + ALLOCSET_DEFAULT_MAXSIZE);
> >
> > Not that it hugely matters, but shouldn't this rather be
> > TopTransactionContext?
>
> No, for the reasons explained in the comment that you quoted right, uh, there.

Man. You expect me to read comments? .oO(hide)

> >> + bool snapshot_set = false;
> >> + Portal portal;
> >> + DestReceiver *receiver;
> >> + int16 format = 1;
> >> +
> >> + /*
> >> + * We don't allow transaction-control commands like COMMIT and ABORT
> >> + * here. The entire SQL statement is executed as a single transaction
> >> + * which commits if no errors are encountered.
> >> + */
> >> + if (IsA(parsetree, TransactionStmt))
> >> + ereport(ERROR,
> >> + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> >> + errmsg("transaction control statements are not allowed in pg_background")));
> >
> > Hm. I don't think that goes far enough. This allows commands that
> > internally stop/start transactions like CREATE INDEX CONCURRENTLY. Are
> > you sure that's working right now?
>
> I tested VACUUM not of a specific table and that seems to work
> correctly. I could try CIC, but I think the issues are the same. If
> we only get one parse tree, then isTopLevel will be true and it's safe
> for commands to do their own transaction control. If we get multiple
> parse trees, then PreventTransactionChain will do its thing. This is
> not novel territory.

It's not novel, but damned easy to get wrong.

> > Hm. This is a fair amount of code copied from postgres.c.
>
> Yes. I'm open to suggestions, but I don't immediately see a better way.

I don't immediately see something better either. But I definitely
dislike the amount of code it duplicates.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-29 19:32:35
Message-ID: 545140D3.4030108@2ndquadrant.com
Lists: pgsql-hackers

On 29/10/14 20:00, Robert Haas wrote:
> After reviewing all of those possibilities with the sort of laser-like
> focus the situation demands, I'm inclined to endorse Alvaro's proposal
> to rename the existing dsm_keep_mapping() function to
> dsm_pin_mapping() and the existing dsm_keep_segment() function to
> dsm_pin_segment(). Then, I will add the new function as
> dsm_unpin_mapping(). So:
>
> 1. Does anyone strongly object to that course of action?
>

Nah, it sounds reasonable.

> 2. Does anyone wish to argue for or against back-patching the name
> changes to 9.4? My feeling is that we may as well, because either
> nobody's using this yet, in which case back-patching it won't break
> anything, or somebody is, in which case we'll cause less pain by
> breaking it before release than a year on. But I don't care that much
> either way, so I'll defer if others disagree.
>

+1

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-29 20:38:59
Message-ID: CA+TgmoZjN6sC2XWDG2pr0c4qZcRQU2YSnOaWjakgGaT1VvKZ6Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 29, 2014 at 3:28 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> > Hm. So every user can do this once the extension is created as the
>> > functions are most likely to be PUBLIC. Is this a good idea?
>>
>> Why not? If they can log in, they could start separate sessions with
>> similar effect.
>
> Connection limits and revoked connection permissions are possible
> reasons.

That's true. I don't know what to do about it. I'm somewhat inclined
to think that, if this remains in contrib, it's OK to ignore those
issues until such time as people complain about them, because anybody
who dislikes the things that can be done with this extension doesn't
have to install it. Also, the people complaining might have useful
ideas about what a good fix would look like, which I currently don't.
There's some push to move this into core, which I think is overkill,
but if we do it then we'd better have a good solution to this problem.

We could try to make connection limits apply to pg_background, and we
could also check CONNECT permission when starting a background worker.
Both of those things feel slightly odd because there's no actual
server connection. There *might* be a connection to the user backend
that started it, but it's sort of a "virtual" connection through
shared memory, and the background process continues running unimpeded
if it goes away, so there might be no actual connection at all. At
the other extreme, we could invent a BACKGROUND permission and
background_limit as analogues of what we have now and enforce those
instead. But I'm not keen on getting sucked into a long process of
perfecting pg_background. It's here; I think some people will like
it, but it exists mostly to show that, dude, we can use dynamic shared
memory to talk to a background worker and do real stuff.

>> > I'm unsure right now about the rules surrounding this, but shouldn't we
>> > check that the user is allowed to execute these? And shouldn't we fall
>> > back to non binary functions if no binary ones are available?
>>
>> I can't see any reason to do either of those things. I'm not aware
>> that returning data in binary format is in any way intended to be a
>> security-restricted operation, or that we have any data types that
>> actually matter without send and receive functions. If we do, I think
>> the solution is to add them, not make this more complicated.
>
> We do have a couple of types that don't have binary send/recv
> functions. And there are many more out there. I don't think we can make
> that a hard prerequisite.

Well, again, I think this comes down partly to the question of whether
this is a toy, a first-class facility, or something in between. But I
also think there's absolutely no rule that all types need to work with
every feature. Some types, for example, don't provide a hash opclass.
You can't build hash indexes on those data types; more importantly,
you can't do hash joins. Anyone who doesn't like that can write a
patch to provide a hash opclass. Similarly, if you try to retrieve
results in binary from the server and there are no send/recv
functions, it will fail. MUCH less significantly, the type also can't
be returned via pg_background. I don't see why that should be viewed
as a problem pg_background has to fix rather than a problem with the
type.

>> Yeah, I was wondering if we needed some kind of annotation here. What
>> I'm wondering about is appending something to the errcontext, perhaps
>> "background worker, PID %d".
>
> Can't we just add another error context? That seems cleaner to me. It
> should be sufficient to push something onto the relevant stack.

Right now, pq_parse_errornotice() just reconstructs the exact ErrorData
that was present in the other backend, and ThrowErrorData() just
rethrows it as-is. I think that's the right design. It's for the
application code (patch 6) to figure out what to pass to
ThrowErrorData(). The error propagation machinery itself (patch 3)
should content itself with serializing and deserializing the data it's
given, and should not attempt to second-guess which parts of the error
data the user may want to change.

>> >> + case 'A':
>> >> + {
>> >> + /* Propagate NotifyResponse. */
>> >> + pq_putmessage(msg.data[0], &msg.data[1], nbytes - 1);
>> >> + break;
>> >
>> > Hm. Are we sure to be in a situation where the client expects these? And
>> > are we sure their encoding is correct? The other data goes through
>> > input/output methods checking for that, but here we don't. And the other
>> > side AFAICS could have done a SET client_encoding.
>>
>> I think there's no problem at the protocol level; I think the server
>> can send NotifyResponse pretty much whenever.
>
> I think the server not having done so for pretty much forever will still
> confuse clients.
>
>> It could be argued that this is a POLA violation, but dropping the
>> notify messages on the floor (which seems to be the other option)
>> doesn't seem superior. So I think this is mostly a matter of
>> documentation.
>
> Maybe.

Yeah, I'm not totally sure about this either. Perhaps some other
people will weigh in.

>> Do you think we need a restriction? It's not obvious to me that there
>> are any security-relevant consequences to this, but it's an important
>> question, and I might be missing something.
>
> Well, a security definer function might have started the worker. And
> then the plain user kills it. Or the other way round.

Detaching the worker doesn't kill it; it just causes us not to wait
for the results. It does mean that if a security definer function
starts a worker, and returns without detaching it or cleaning it up,
the unprivileged user could then read the data back from that worker.
That's more insidious than it may at first appear, because the
security definer function could have been intending to read back the
data before returning, and then a transaction abort happens. We could
add a guard requiring that the data be read by the same effective user
ID that started the worker, although that might also foreclose useful
things people would otherwise be able to do with this.
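
The guard being floated here is simple to state: remember the effective user ID at launch, and refuse reads (or detaches) from anyone else. A toy Python model (WorkerInfo and check_rights are illustrative names, and the UIDs are made up; this is not the patch's code):

```python
class WorkerInfo:
    def __init__(self, pid, launcher_uid):
        self.pid = pid
        self.launcher_uid = launcher_uid  # effective user ID at launch

def check_rights(info, current_uid):
    # Only the effective user that launched the worker may read its
    # results; this closes the SECURITY DEFINER leak described above.
    if info.launcher_uid != current_uid:
        raise PermissionError(
            "permission denied for background worker with PID %d" % info.pid)

info = WorkerInfo(pid=64311, launcher_uid=10)
check_rights(info, 10)           # launcher: allowed
try:
    check_rights(info, 16384)    # some other effective user: rejected
    leaked = True
except PermissionError:
    leaked = False
assert not leaked
```

The trade-off Robert notes still applies: such a check also forbids legitimate hand-offs of a worker between users.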

>> Meh. The launching process would have errored out if it hadn't been
>> able to set up the segment correctly. We could add some Assert()
>> statements if you really feel strongly about it, but it seems fairly
>> pointless to me. Any situation where those pointers come back NULL is
>> presumably going to be some sort of really stupid bug that will be
>> found even by trivial testing.
>
> If we write code like this we really should have made it an error to
> look something up that doesn't exist in the toc. But whatever.

You might be right. I think there could be other cases where the data
could exist or be omitted and the worker would have to know to cope
with it, but having the shm_toc stuff error out directly wouldn't have
been stupid either.

>> >> + /* Restore GUC values from launching backend. */
>> >> + StartTransactionCommand();
>> >> + RestoreGUCState(gucstate);
>> >> + CommitTransactionCommand();
>> >
>> > I haven't read the guc save patch, but is it a) required to do this in a
>> > transaction? We normally reload the config even without. b) correct to
>> > do? What's with SET LOCAL variables?
>>
>> (a) Yeah, it doesn't work without that. I forget what breaks, but if
>> you take those out, it will blow up.
>
> That seems bad. Because we actually parse gucs without a surrounding
> transaction in some cases. But maybe it's just NOT_IN_FILE (or however
> they're called) variables that are problematic?

I just tried this and got:

ERROR: invalid value for parameter "session_authorization": "rhaas"

> Btw, how are we dealing with connection gucs?

What do you mean by that?

>> (b) Do those need special handling for some reason?
>
> Well, the CommitTransactionCommand() will reset them (via AtEOXact_GUC),
> no? Or am I missing something?

We don't serialize and restore the GucStackState.

>> >> + /* Post-execution cleanup. */
>> >> + disable_timeout(STATEMENT_TIMEOUT, false);
>> >> + CommitTransactionCommand();
>> >
>> > So, we're allowed to do nearly arbitrary nastyness here...
>>
>> Can you be more specific about the nature of your concern? This is no
>> different than finish_xact_command().
>
> What I was thinking about was that the background process can do stuff
> that writes.

So what?

>> > Hm. This is a fair amount of code copied from postgres.c.
>>
>> Yes. I'm open to suggestions, but I don't immediately see a better way.
>
> I don't immediately see something better either. But I definitely
> dislike the amount of code it duplicates.

I'm not crazy about it myself. But I don't think polluting
exec_simple_query() with pg_background's concerns is a good plan
either. That'd really be getting our priorities mixed up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-29 20:58:11
Message-ID: 20141029205811.GI17724@awork2.anarazel.de
Lists: pgsql-hackers

On 2014-10-29 16:38:59 -0400, Robert Haas wrote:
> On Wed, Oct 29, 2014 at 3:28 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> > Hm. So every user can do this once the extension is created as the
> >> > functions are most likely to be PUBLIC. Is this a good idea?
> >>
> >> Why not? If they can log in, they could start separate sessions with
> >> similar effect.
> >
> > Connection limits and revoked connection permissions are possible
> > reasons.
>
> That's true. I don't know what to do about it. I'm somewhat inclined
> to think that, if this remains in contrib, it's OK to ignore those
> issues until such time as people complain about them, because anybody
> who dislikes the things that can be done with this extension doesn't
> have to install it. Also, the people complaining might have useful
> ideas about what a good fix would look like, which I currently don't.
> There's some push to move this into core, which I think is overkill,
> but if we do it then we'd better have a good solution to this problem.

At the very least it needs to be clearly documented. Another solution
would be to simply not give out PUBLIC rights, restricting it to the
owner/superusers unless somebody makes explicit grants. I favor
combining those two.

> We could try to make connection limits apply to pg_background, and we
> could also check CONNECT permission when starting a background worker.
> Both of those things feel slightly odd because there's no actual
> server connection. There *might* be a connection to the user backend
> that started it, but it's sort of a "virtual" connection through
> shared memory, and the background process continues running unimpeded
> if it goes away, so there might be no actual connection at all.

I think that'd not be bad.

> >> > I'm unsure right now about the rules surrounding this, but shouldn't we
> >> > check that the user is allowed to execute these? And shouldn't we fall
> >> > back to non binary functions if no binary ones are available?
> >>
> >> I can't see any reason to do either of those things. I'm not aware
> >> that returning data in binary format is in any way intended to be a
> >> security-restricted operation, or that we have any data types that
> >> actually matter without send and receive functions. If we do, I think
> >> the solution is to add them, not make this more complicated.
> >
> > We do have a couple of types that don't have binary send/recv
> > functions. And there are many more out there. I don't think we can make
> > that a hard prerequisite.
>
> Well, again, I think this comes down partly to the question of whether
> this is a toy, a first-class facility, or something in between. But I
> also think there's absolutely no rule that all types need to work with
> every feature. Some types, for example, don't provide a hash opclass.
> You can't build hash indexes on those data types; more importantly,
> you can't do hash joins. Anyone who doesn't like that can write a
> patch to provide a hash opclass. Similarly, if you try to retrieve
> results in binary from the server and there are no send/recv
> functions, it will fail. MUCH less significantly, the type also can't
> be returned via pg_background. I don't see why that should be viewed
> as a problem pg_background has to fix rather than a problem with the
> type.

Hm. I'm unconvinced. It looks almost trivial to fall back to the
text-based protocol.

> >> Yeah, I was wondering if we needed some kind of annotation here. What
> >> I'm wondering about is appending something to the errcontext, perhaps
> >> "background worker, PID %d".
> >
> > Can't we just add another error context? That seems cleaner to me. It
> > should be sufficient to push something onto the relevant stack.
>
> Right now, pq_parse_errornotice() just reconstructs the exact ErrorData
> that was present in the other backend, and ThrowErrorData() just
> rethrows it as-is. I think that's the right design. It's for the
> application code (patch 6) to figure out what to pass to
> ThrowErrorData(). The error propagation machinery itself (patch 3)
> should content itself with serializing and deserializing the data it's
> given, and should not attempt to second-guess which parts of the error
> data the user may want to change.

I don't see how that follows. The error context logic is there to make
it clearer where an error originated from. It'll be really confusing if
ERRORs jump out of a block of code that has an error context set
without emitting that context.

> >> Do you think we need a restriction? It's not obvious to me that there
> >> are any security-relevant consequences to this, but it's an important
> >> question, and I might be missing something.
> >
> > Well, a security definer function might have started the worker. And
> > then the plain user kills it. Or the other way round.
>
> Detaching the worker doesn't kill it; it just causes us not to wait
> for the results.

I think in many cases that's an academic distinction.

> It does mean that if a security definer function
> starts a worker, and returns without detaching it or cleaning it up,
> the unprivileged user could then read the data back from that worker.
> That's more insidious than it may at first appear, because the
> security definer function could have been intending to read back the
> data before returning, and then a transaction abort happens. We could
> add a guard requiring that the data be read by the same effective user
> ID that started the worker, although that might also foreclose useful
> things people would otherwise be able to do with this.

I think such a restriction would be a good idea for now.

> >> Meh. The launching process would have errored out if it hadn't been
> >> able to set up the segment correctly. We could add some Assert()
> >> statements if you really feel strongly about it, but it seems fairly
> >> pointless to me. Any situation where those pointers come back NULL is
> >> presumably going to be some sort of really stupid bug that will be
> >> found even by trivial testing.
> >
> > If we write code like this we really should have made it an error to
> > look something up that doesn't exist in the toc. But whatever.
>
> You might be right. I think there could be other cases where the data
> could exist or be omitted and the worker would have to know to cope
> with it, but having the dsm_toc stuff error out directly wouldn't have
> been stupid either.

There could be an explicit boolean function to check for the presence...

> >> >> + /* Restore GUC values from launching backend. */
> >> >> + StartTransactionCommand();
> >> >> + RestoreGUCState(gucstate);
> >> >> + CommitTransactionCommand();
> >> >
>> > I haven't read the guc save patch, but is it a) required to do this in a
> >> > transaction? We normally reload the config even without. b) correct to
> >> > do? What's with SET LOCAL variables?
> >>
> >> (a) Yeah, it doesn't work without that. I forget what breaks, but if
>> you take those out, it will blow up.
> >
> > That seems bad. Because we actually parse gucs without a surrounding
> > transaction in some cases. But maybe it's just NOT_IN_FILE (or however
> > they're called) variables that are problematic?
>
> I just tried this and got:
>
> ERROR: invalid value for parameter "session_authorization": "rhaas"

Hm. Ok.

> > Btw, how are we dealing with connection gucs?
>
> What do you mean by that?

There are some PGC_BACKEND GUCs that are normally only allowed to be set
at connection start. I'm not sure whether you're just circumventing that
check or whether you didn't hit a problematic case.

What does e.g. happen if you set PGOPTIONS='-c
local_preload_libraries=auto_explain' after linking auto_explain into
the plugin directory?

> >> (b) Do those need special handling for some reason?
> >
> > Well, the CommitTransactionCommand() will reset them (via AtEOXact_GUC),
> > no? Or am I missing something?
>
> We don't serialize and restore the GucStackState.

Ah, ok.

> >> > Hm. This is a fair amount of code copied from postgres.c.
> >>
> >> Yes. I'm open to suggestions, but I don't immediately see a better way.
> >
> > I don't immediately see something better either. But I definitely
> > dislike the amount of code it duplicates.
>
> I'm not crazy about it myself. But I don't think polluting
> exec_simple_query() with pg_background's concerns is a good plan
> either. That'd really be getting our priorities mixed up.

Agreed that that'd be the wrong way round.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-30 22:03:27
Message-ID: CA+Tgmob+vvn++HS=8m_Tn114g=rH+WqDMt6=ZSOf6GNQ10FXoA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Oct 29, 2014 at 4:58 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> That's true. I don't know what to do about it. I'm somewhat inclined
>> to think that, if this remains in contrib, it's OK to ignore those
>> issues until such time as people complain about them, because anybody
>> who dislikes the things that can be done with this extension doesn't
>> have to install it. Also, the people complaining might have useful
>> ideas about what a good fix would look like, which I currently don't.
>> There's some push to move this into core, which I think is overkill,
>> but if we do it then we'd better have a good solution to this problem.
>
> At the very least it needs to be clearly documented. Another solution
> would be to simply not give out PUBLIC rights, restricting it to the
> owner/superusers unless somebody makes explicit grants. I favor
> combining those two.

I don't think it's appropriate to put superuser() checks in the code,
if that's what you are proposing. Forcing this to be super-user only
is hitting a mouse with a battleship. If you're saying we should put
REVOKE commands into the install script as we do for some other
modules, like dblink, that makes sense to me.

>> We could try to make connection limits apply to pg_background, and we
>> could also check CONNECT permission when starting a background worker.
>> Both of those things feel slightly odd because there's no actual
>> server connection. There *might* be a connection to the user backend
>> that started it, but it's sort of a "virtual" connection through
>> shared memory, and the background process continues running unimpeded
>> if it goes away, so there might be no actual connection at all.
>
> I think that'd not be bad.

Looks like those checks happen in InitializeSessionUserId(), which is
called from InitPostgres(), which is called from
BackgroundWorkerInitializeConnection(). That makes me think we're
already applying these checks.

rhaas=# select * from
pg_background_result(pg_background_launch('vacuum pg_enum')) as (x
text);
x
--------
VACUUM
(1 row)

rhaas=# alter role rhaas nologin;
ALTER ROLE
rhaas=# select * from
pg_background_result(pg_background_launch('vacuum pg_enum')) as (x
text);
ERROR: role "rhaas" is not permitted to log in
CONTEXT: background worker, pid 64311
rhaas=# alter role rhaas login;
ALTER ROLE
rhaas=# select * from
pg_background_result(pg_background_launch('vacuum pg_enum')) as (x
text);
x
--------
VACUUM
(1 row)

> Hm. I'm unconvinced. It looks almost trivial to fall back to the
> text-based protocol.

It's hard to argue with "I'm unconvinced". What specifically about
that argument do you think isn't valid?

While I am sure the problem can be worked around, it isn't trivial.
Right now, execute_sql_string() just requests binary format
unconditionally. To do what you're talking about, we'd need to
iterate through all of the types and figure out which ones have
typsend/typreceive functions. If there's a convenience function that
will do that for us, I don't see it, probably because I believe there
are exactly zero situations where we do that kind of inference today.
Then, the user backend has to save the format codes from the
RowDescription message and decide whether to use text or binary. That
just seems like a silly waste of code and cycles.

I think this actually matters, too, because the question is what we're
going to do with full-blown parallelism. Best would be to actually
shuttle entire raw tuples between backends; second best, binary
format; third best, text format or mixture of text and binary. I'm
not sure what it's reasonable to try to get away with there, but I
certainly think minimizing IPC costs is going to be an important goal.
I didn't try to get away with shipping raw tuples here because we
don't lock types while they're in use, and I'm worried that Bad Things
Could Happen. But I'm sure somebody's going to care about the
overhead of converting back and forth at some point.

> I don't see how that follows. The error context logic is there to make
> it clearer where an error originated from. It'll be really confusing if
> there are ERRORs jumping out of a block of code that has set an error
> context without emitting that context.

I don't think I was proposing that, but I think I may have gotten a
little off-track here. See what you think of the attached, which
seems to work.

>> It does mean that if a security definer function
>> starts a worker, and returns without detaching it or cleaning it up,
>> the unprivileged user could then read the data back from that worker.
>> That's more insidious than it may at first appear, because the
>> security definer function could have been intending to read back the
>> data before returning, and then a transaction abort happens. We could
>> add a guard requiring that the data be read by the same effective user
>> ID that started the worker, although that might also foreclose useful
>> things people would otherwise be able to do with this.
>
> I think such a restriction would be a good idea for now.

Done in the attached version.

>> > Btw, how are we dealing with connection gucs?
>>
>> What do you mean by that?
>
> There's some PGC_BACKEND guc's that normally are only allowed to be set
> at connection start. I'm not sure whether you're just circumventing that
> check or whether you didn't hit a problematic case.
>
> What does e.g. happen if you set PGOPTIONS='-c
> local_preload_libraries=auto_explain' after linking auto_explain into
> the plugin directory?

Eh, I'm embarrassed to admit I don't know exactly how to do this. My
install directory doesn't seem to have a subdirectory called "plugin".

BTW, are we in agreement, more or less, on patch #3 now, the support
for error propagation? I'd like to go ahead and commit that if there
are no further review comments there.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0006-pg_background-Run-commands-in-a-background-worker-an-v4.patch text/x-patch 34.8 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-30 23:28:19
Message-ID: 20141030232819.GL17724@awork2.anarazel.de
Lists: pgsql-hackers

Hi,

On 2014-10-30 18:03:27 -0400, Robert Haas wrote:
> On Wed, Oct 29, 2014 at 4:58 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> That's true. I don't know what to do about it. I'm somewhat inclined
> >> to think that, if this remains in contrib, it's OK to ignore those
> >> issues until such time as people complain about them, because anybody
> >> who dislikes the things that can be done with this extension doesn't
> >> have to install it. Also, the people complaining might have useful
> >> ideas about what a good fix would look like, which I currently don't.
> >> There's some push to move this into core, which I think is overkill,
> >> but if we do it then we'd better have a good solution to this problem.
> >
> > At the very least it need to be clearly documented. Another solution
> > would be to simply not give out PUBLIC rights, and restrict it to the
> > owner/superusers unless somebody makes explicit grants. I favor
> > combining those two.
>
> I don't think it's appropriate to put superuser() checks in the code,
> if that's what you are proposing.

That's not at all what I'm thinking of... The superuser reference was
only that explicit EXECUTE rights aren't required for superusers.

> Forcing this to be super-user only is hitting a mouse with a
> battleship. If you're saying we should put REVOKE commands into the
> install script as we do for some other modules, like dblink, that
> makes sense to me.

Right, that's what I'm proposing instead.

> >> We could try to make connection limits apply to pg_background, and we
> >> could also check CONNECT permission when starting a background worker.
> >> Both of those things feel slightly odd because there's no actual
> >> server connection. There *might* be a connection to the user backend
> >> that started it, but it's sort of a "virtual" connection through
> >> shared memory, and the background process continues running unimpeded
> >> if it goes away, so there might be no actual connection at all.
> >
> > I think that'd not be bad.
>
> Looks like those checks happen in InitializeSessionUserId(), which is
> called from InitPostgres(), which is called from
> BackgroundWorkerInitializeConnection(). That makes me think we're
> already applying these checks.

Oh, neat.

> > Hm. I'm unconvinced. It looks almost trivial to fall back to the text
> > based protocol.
>
> It's hard to argue with "I'm unconvinced". What specifically about
> that argument do you think isn't valid?

We don't normally just say "this is unsupported" for things that are
perfectly valid. And types without binary send/recv functions imo are
perfectly valid. You've made that kind of argument yourself more than
once, no?
It's really not an academic thing. Popular extensions like postgis don't
provide send/recv for all their types. Same with a couple of our own
contrib types (intarray, chkpass, seg, cube).

I guess we need input from others on this point.

> While I am sure the problem can be worked around, it isn't trivial.
> Right now, execute_sql_string() just requests binary format
> unconditionally. To do what you're talking about, we'd need to
> iterate through all of the types and figure out which ones have
> typsend/typreceive functions. If there's a convenience function that
> will do that for us, I don't see it, probably because I believe there
> are exactly zero situations where we do that kind of inference today.
> Then, the user backend has to save the format codes from the
> RowDescription message and decide whether to use text or binary. That
> just seems like a silly waste of code and cycles.

The current client/server protocol definitely can do it. You can specify
per column whether you want binary/textual data. I'm pretty sure that at
least the jdbc driver does that.

It's also what I've decided to do for BDR's output plugin. Some builtin
types will be transferred 'as is' if the architecture/version
matches. Otherwise, if available and the version matches, send/recv will
be used (excluding some corner cases, but ...). Only if neither is the case
does it fall back to the textual protocol.

> I think this actually matters, too, because the question is what we're
> going to do with full-blown parallelism.

I think for parallelism it'll most definitely not be acceptable to
leave types without send/recv unsupported. So building some convenience
infrastructure here imo is going to pay off. It's not like it's that
hard to detect - just check OidIsValid(pg_type->typsend). The pg_type
tuple has to be looked up anyway to find the send/recv functions...

> Best would be to actually
> shuttle entire raw tuples between backends; second best, binary
> format; third best, text format or mixture of text and binary. I'm
> not sure what it's reasonable to try to get away with there, but I
> certainly think minimizing IPC costs is going to be an important goal.
> I didn't try to get away with shipping raw tuples here because we
> don't lock types while they're in use, and I'm worried that Bad Things
> Could Happen.

Yea, because of type changes and such I went with using the 'as-is'
format only for builtin types (via att->atttypid < FirstNormalObjectId)
- we know those aren't changed. And the majority of the data usually
uses such types...

> > There's some PGC_BACKEND guc's that normally are only allowed to be set
> > at connection start. I'm not sure whether you're just circumventing that
> > check or whether you didn't hit a problematic case.
> >
> > What does e.g. happen if you set PGOPTIONS='-c
> > local_preload_libraries=auto_explain' after linking auto_explain into
> > the plugin directory?
>
> Eh, I'm embarrassed to admit I don't know exactly how to do this. My
> install directory doesn't seem to have a subdirectory called "plugin".

Sadly you need to create it. Inside the normal $libdir. I.e. $(pg_config
--libdir)/plugins. And then (sym-)link/copy libraries that you want
every user to use in there.

> BTW, are we in agreement, more or less, on patch #3 now, the support
> for error propagation? I'd like to go ahead and commit that if there
> are no further review comments there.

You're talking about
0003-Support-frontend-backend-protocol-communication-usin-v3.patch from
http://archives.postgresql.org/message-id/CA%2BTgmoY9B0mL7Ci1fG1cZ4-zwDHOjjQaKyLowB3F-jsbuvVw2g%40mail.gmail.com?

If so, go ahead from my POV. It seems likely that we'll whack it around
some more, but I don't see an advantage in waiting longer.

Andres

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-10-31 23:12:57
Message-ID: 54541779.1010906@BlueTreble.com
Lists: pgsql-hackers

On 10/24/14, 6:17 PM, Jim Nasby wrote:
>>>> - Does anyone have a tangible suggestion for how to reduce the code
>>>> duplication in patch #6?
>>>
>>> Between execute_sql_string() and tcop/exec_simple_query()? Is there stuff in
>>> exec_simple that's not safe for a bgworker? I'm not seeing why we can't use
>>> exec_simple. :/
>>
>> There are some differences if you compare them closely.
>
> Without doing a deep dive, I'm guessing that most of the extra stuff would be safe to re-use; it just wouldn't affect execute_sql_string. Obviously we could add a boolean to exec_simple_query for the case when it's being used by a bgworker. Though, it seems like the biggest differences have to do with logging.
>
> Here's the differences I see:
>
> - Disallowing transaction commands
> - Logging
> - What memory context is used (could we just use that differently in a pg_background worker?)
> - Portal output format
> - What to do with the output of intermediate commands (surely there's other places we need to handle that? plpgsql maybe?)
>
> I think all of those except logging could be handled by a function serving both exec_simple_query and execute_sql_command that accepts a few booleans (or maybe just one to indicate the caller) and some if's. At least I don't think it'd be too bad, without actually writing it.
>
> I'm not sure what to do about logging. If the original backend has logging turned on, is it that horrible to do the same logging as exec_simple_query would do?
>
> Another option would be factoring out parts of exec_simple_query; the for loop over the parse tree might be a good candidate. But I suspect that would be uglier than a separate support function.
>
> I do feel uncomfortable with the amount of duplication there is right now though...

I took a stab at this by refactoring the guts of exec_simple_query (patch attached) into a new function called exec_query_string (also attached in raw form). As indicated it needs a bit more work. In particular, how it's dealing with excluding transactional commands is rather ugly. Why do we need to do that in pg_background?

Andres was concerned about the performance impact of doing this. I tested this by doing

for i in {1..999999}; do echo 'SELECT 1;' >> test.sql; done

and then

time psql -f test.sql > /dev/null

It appears there may be a ~1% performance hit, but my laptop isn't repeatable enough to be sure. I did try manually in-lining exec_query_string to see if it was the function call causing the issue; it didn't seem to make a difference.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

Attachment Content-Type Size
0001-Refactor-the-guts-of-exec_simple_query.patch text/plain 11.0 KB
exec_query_string.c text/plain 9.1 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-10 17:10:49
Message-ID: CA+TgmobPiT_3Qgjeh3_v+8Cq2nMczkPyAYernF_7_W9a-6T1PA@mail.gmail.com
Lists: pgsql-hackers

So, summarizing the state of this patch set:

- Patches 1, 2, 3, and 5 are committed.

- Patch 4 has had some review and, as far as I can tell, the only
major issue that has cropped up is the possibility that
the GUC settings that are legal in backend A aren't legal in backend B
for some reason; in that case, restoring the GUC settings there will
probably fail. While this is a legitimate concern, I think what it
really boils down to is that there's a possibility that there are
libraries loaded in the user backend that are not loaded in the
postmaster and therefore won't get loaded into the background worker.
That's an issue to which we may need to engineer some solution, but
not in this patch. Overall, the patch's architecture is modeled
closely on the way we synchronize GUCs to new backends when using
EXEC_BACKEND, and I don't think we're going to do any better than
starting with that and working to file down the rough edges as we go. So I'd
like to go ahead and commit this.

- Patch 6, pg_background itself, has a number of outstanding issues.
Some people aren't sure it's useful enough to really be worth
bothering with; other people seem quite excited about it, even to the
point of wanting to push it all the way into core. Whatever we
ultimately decide to do there, the patch as it stands today is clearly
not ready for commit. The issues I know of are:

-- I still haven't written documentation for it.
-- There's some code duplication with exec_simple_query() that doesn't
feel great, but it's not clear that there's a superior alternative. I
think we certainly don't want to complicate exec_simple_query() to
make pg_background happy.
-- We should add REVOKE to the SQL script that installs it so that
non-superusers can't use it unless the superuser decides to grant them
rights.
-- It doesn't handle types without send/receive functions. I thought
that was tolerable since the types without such functions seem like
mostly edge cases, but Andres doesn't like it. Apparently, BDR
makes provisions to either ship the tuple as one big blob - if all
types are built-in - or else use binary format where supported and text
format otherwise. Since this issue is common to BDR, parallelism, and
pg_background, we might want to introduce some common infrastructure
for it, but it's not too clear to me right now what that should look
like.
-- Evil things can happen if the background process changes client_encoding.
-- Even if I fix all that stuff, it's not entirely clear that there's
a consensus in favor of the patch at all.

I certainly think that as far as this CommitFest is concerned, patch 6
ought to be considered Returned with Feedback. Many thanks to Andres
and all for reviewing. What I am less certain about is whether it
makes sense to spend the energy fixing the above issues in the
short-term. I wrote pg_background mostly as a demonstration that the
rest of these patches work and can be used to accomplish useful stuff,
whether as part of parallelism or otherwise; and there's a lot of work
left to be done on parallelism proper that won't necessarily benefit
from polishing pg_background first.

So, questions:

1. Any other opinions for or against pg_background as a concept? I
thought the ability to kick off vacuums (or other stuff with
PreventTransactionChain) asynchronously was pretty cool, as well as the
ability to use it as an authentication-free loopback dblink. But
opinions obviously vary.

2. Is anyone sufficiently interested in pg_background as a concept
that they'd be willing to take over the patch and work on the TODO
list mentioned above?

Thanks,

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-10 18:29:13
Message-ID: 20141110182913.GH28007@alap3.anarazel.de
Lists: pgsql-hackers

On 2014-11-10 12:10:49 -0500, Robert Haas wrote:
> - Patch 4 has had some review and, as far as I can tell, the only
> major issue that has cropped up is the possibility that
> the GUC settings that are legal in backend A aren't legal in backend B
> for some reason; in that case, restoring the GUC settings there will
> probably fail. While this is a legitimate concern, I think what it
> really boils down to is that there's a possibility that there are
> libraries loaded in the user backend that are not loaded in the
> postmaster and therefore won't get loaded into the background worker.

I'm not too concerned about that one.

> That's an issue to which we may need to engineer some solution, but
> not in this patch. Overall, the patch's architecture is modeled
> closely on the way we synchronize GUCs to new backends when using
> EXEC_BACKEND, and I don't think we're going to do any better than
> starting with that and working to file down the rough edges as we go. So I'd
> like to go ahead and commit this.

Did you check out whether PGC_BACKEND GUCs work? Other than that I agree
that we can just solve further issues as we notice them. This isn't
particularly intrusive.

> -- There's some code duplication with exec_simple_query() that doesn't
> feel great, but it's not clear that there's a superior alternative. I
> think we certainly don't want to complicate exec_simple_query() to
> make pg_background happy.

Yea, I think we can live with it if really necessary.

> -- We should add REVOKE to the SQL script that installs it so that
> non-superusers can't use it unless the superuser decides to grant them
> rights.

Right. That seems trivial enough.

> -- It doesn't handle types without send/receive functions. I thought
> that was tolerable since the types without such functions seem like
> mostly edge cases, but Andres doesn't like it. Apparently, BDR
> makes provisions to either ship the tuple as one big blob - if all
> types are built-in - or else use binary format where supported and text
> format otherwise. Since this issue is common to BDR, parallelism, and
> pg_background, we might want to introduce some common infrastructure
> for it, but it's not too clear to me right now what that should look
> like.

I think we definitely need to solve this - but I'm also not at all that
sure how.

For something like pg_background it's pretty trivial to fall back to
in/out when there's no send/recv. It's just a couple of lines, and the
portal code has the necessary provisions. So solving it for
pg_background itself is pretty trivial. But, as you say, pg_background
itself isn't a primary goal (although a nice demonstration, with some
real world applications). There's enough differences between the
parallelism and the replication cases that I'm not entirely sure how
much common ground there is. Namely replication has the additional
concerns of version differences, trust (don't use blobs if you don't
fully trust the other side), and differences in the allocated oids
(especially relevant for arrays which, very annoyingly, embed the oid in
the send/recv format).

> 1. Any other opinions for or against pg_background as a concept? I
> thought the ability to kick off vacuums (or other stuff with
> PreventTransactionChain) asynchronously was pretty cool, as well as the
> ability to use it as an authentication-free loopback dblink. But
> opinions obviously vary.

I think it's a cool concept, but I'm not sure if it's worth the work to
make it fully usable. Or rather, I think it's worthy enough, but I
personally would prioritize other stuff.

> 2. Is anyone sufficiently interested in pg_background as a concept
> that they'd be willing to take over the patch and work on the TODO
> list mentioned above?

I personally won't. If we can come up with a sketch of how to deal with
the data transport encoding issue above, I'd be willing to work on
that specific part. But not pg_background in itself.

Greetings,

Andres Freund


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-10 18:37:57
Message-ID: 20141110183757.GO28859@tamriel.snowman.net
Lists: pgsql-hackers

* Andres Freund (andres(at)2ndquadrant(dot)com) wrote:
> > 1. Any other opinions for or against pg_background as a concept? I
> > thought the ability to kick off vacuums (or other stuff with
> > PreventTransactionChain) asynchronously was pretty cool, as well as the
> > ability to use it as an authentication-free loopback dblink. But
> > opinions obviously vary.
>
> I think it's a cool concept, but I'm not sure if it's worth the work to
> make it fully usable. Or rather, I think it's worthy enough, but I
> personally would prioritize other stuff.

I've not read through the whole thread/discussion, but I'd put myself in
more-or-less the same boat at this point. I've got a number of other
things on my plate already that need to get done (more RLS cleanup /
consistency, back-patching the ereport() column-privs issue, reviewing
pgAudit, the less-than-superuser privileges work, actually helping out
with the in-progress commitfest..) and so I really doubt I'd be able to
seriously help with pg_background- at least for 9.5, which is coming up
awful fast at this point, if we're going to stick with the 'normal'
schedule and freeze in the spring.

That said, I love the concept and had really been hoping to see it in
9.5, and maybe some at- or cron-like ability happening later (yes, I
absolutely feel we need this, though I know others have different
opinions..).

> > 2. Is anyone sufficiently interested in pg_background as a concept
> > that they'd be willing to take over the patch and work on the TODO
> > list mentioned above?
>
> I personally won't. If we can come up with a sketch of how to deal with
> the data transport encoding issue above, I'd be willing to work on
> that specific part. But not pg_background in itself.

If other things get done or additional resources show up, I'd be
interested in helping, but I don't see either happening in time for 9.5.

Thanks!

Stephen


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-10 18:45:26
Message-ID: 20141110184526.GF1791@alvin.alvh.no-ip.org
Lists: pgsql-hackers

Stephen Frost wrote:

> at least for 9.5, which is coming up awful fast at this point, if
> we're going to stick with the 'normal' schedule and freeze in the
> spring.

https://wiki.postgresql.org/wiki/PgCon_2014_Developer_Meeting#9.5_Schedule

Doesn't look all that "normal" to me, with that commitfest in Feb.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-10 19:54:15
Message-ID: CA+Tgmobr9u15pE9d6gTD0DLwnjLwSDyW72PeaKc9-kROB0=Duw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 10, 2014 at 1:29 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> That's an issue to which we may need to engineer some solution, but
>> not in this patch. Overall, the patch's architecture is modeled
>> closely on the way we synchronize GUCs to new backends when using
>> EXEC_BACKEND, and I don't think we're going to do any better than
>> starting with that and working to file down the rough edges as we go. So I'd
>> like to go ahead and commit this.
>
> Did you check out whether PGC_BACKEND GUCs work? Other than that I agree
> that we can just solve further issues as we notice them. This isn't
> particularly intrusive.

Is the scenario you're concerned about this one:

1. Start postmaster.
2. Start a user session.
3. Change a PGC_BACKEND GUC in postgresql.conf.
4. Reload.
5. Launch a background worker that uses this code.

It should end up with the same values there as the active session, not
the current one from postgresql.conf. But we want to verify that's
the behavior we actually get. Right?

>> -- It doesn't handle types without send/receive functions. I thought
>> that was tolerable since the types without such functions seem like
>> mostly edge cases, but Andres doesn't like it. Apparently, BDR
>> makes provisions to either ship the tuple as one big blob - if all
>> types are built-in - or else use binary format where supported and text
>> format otherwise. Since this issue is common to BDR, parallelism, and
>> pg_background, we might want to introduce some common infrastructure
>> for it, but it's not too clear to me right now what that should look
>> like.
>
> I think we definitely need to solve this - but I'm also not at all that
> sure how.
>
> For something like pg_background it's pretty trivial to fall back to
> in/out when there's no send/recv. It's just a couple of lines, and the
> portal code has the necessary provisions. So solving it for
> pg_background itself is pretty trivial.

That's not really the interesting part of the problem, I think. Yeah,
pg_background can be made to speak text format if we really care. But
the real issue is that even with the binary format, converting a tuple
in on-disk format into a DataRow message so that the other side can
reconstruct the tuple and shove it into an executor slot (or wherever
it wants to shove it) seems like it might be expensive. You might
have data on that; if it's not expensive, stop here. If it is
expensive, what we really want to do is figure out some way to make it
safe to copy the tuple into the shared memory queue, or send it out
over the socket, and have the process on the other end use it without
having to revalidate the whole thing column-by-column.

> But, as you say, pg_background
> itself isn't a primary goal (although a nice demonstration, with some
> real world applications). There's enough differences between the
> parallelism and the replication cases that I'm not entirely sure how
> much common ground there is. Namely replication has the additional
> concerns of version differences, trust (don't use blobs if you don't
> fully trust the other side), and differences in the allocated oids
> (especially relevant for arrays which, very annoyingly, embed the oid in
> the send/recv format).

Yeah. It would be nice to use the same code for deciding what to do
in a particular case. It seems like it ought to be possible to have
one rule for whether a tuple with a given set of data types is safe to
ship in the on-disk format. For pg_background/parallelism, it's
enough if that function returns true; for BDR, there might be
additional criteria applied before deciding to do it that way. But
they could use the same decision tree for part of it.

What would be even better is to find some way to MAKE IT SAFE to send
the undecoded tuple. I'm not sure what that would look like. Locking
the types against concurrent changes? Marking dropped types as
dropped without actually dropping them, and garbage collecting them
later? Providing safer ways to deconstruct tuples? It might help to
start by enumerating what exactly can go wrong here. As far as I can
see, the main risks for pg_background and parallelism are (1) that the
type might get concurrently dropped, leaving us unable to interpret
the received tuple and (2) that the type might get concurrently
modified in some way that leaves us confused about how to interpret
the tuple, as by having the type size change. The chances of a
problem in practice seem remote, since you can't do either of those
things if a table column with that type exists, but I can't convince
myself that there's no way for the type to be modified under us in
such a way that two different backends have different ideas about
whether it exists or how wide it is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-10 20:53:31
Message-ID: 20141110205331.GQ28859@tamriel.snowman.net
Lists: pgsql-hackers

* Alvaro Herrera (alvherre(at)2ndquadrant(dot)com) wrote:
> Stephen Frost wrote:
> > at least for 9.5, which is coming up awful fast at this point, if
> > we're going to stick with the 'normal' schedule and freeze in the
> > spring.
>
> https://wiki.postgresql.org/wiki/PgCon_2014_Developer_Meeting#9.5_Schedule
>
> Doesn't look all that "normal" to me, with that commitfest in Feb.

I'd be more inclined to expect the late release of 9.4 to impact things
than that schedule to really change when we freeze... Just my 2c though.

Thanks,

Stephen


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-10 21:58:15
Message-ID: 20141110215815.GL28007@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-11-10 14:54:15 -0500, Robert Haas wrote:
> On Mon, Nov 10, 2014 at 1:29 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> That's an issue to which we may need to engineer some solution, but
> >> not in this patch. Overall, the patch's architecture is modeled
> >> closely on the way we synchronize GUCs to new backends when using
> >> EXEC_BACKEND, and I don't think we're going to do any better than
> >> starting with that and working to file down the rough edges as we go. So I'd
> >> like to go ahead and commit this.
> >
> > Did you check out whether PGC_BACKEND GUCs work? Other than that I agree
> > that we can just solve further issues as we notice them. This isn't
> > particularly intrusive.
>
> Is the scenario you're concerned about this one:
>
> 1. Start postmaster.
> 2. Start a user session.
> 3. Change a PGC_BACKEND GUC in postgresql.conf.
> 4. Reload.
> 5. Launch a background worker that uses this code.

No, that's not what I was thinking of. Consider:

export pg_libdir=$(pg_config --libdir)
mkdir $pg_libdir/plugins
ln -s $pg_libdir/auto_explain.so $pg_libdir/plugins/
PG_OPTIONS='-c local_preload_libraries=auto_explain' psql

and use pg_background() (or something else using this infrastructure) in
that context.
Without having reread your relevant code I'd expect a:
ERROR: 55P02: parameter "local_preload_libraries" cannot be set after connection start
because the code would try to import the "user session's"
local_preload_libraries value into the background process - after that
process has already completed its initialization.

> It should end up with the same values there as the active session, not
> the current one from postgresql.conf. But we want to verify that's
> the behavior we actually get. Right?

But that's also something worthwhile to check.

> >> -- It doesn't handle types without send/receive functions. I thought
> >> that was tolerable since the types lacking them seem like mostly edge
> >> cases, but Andres doesn't like it. Apparently, BDR makes provisions
> >> to either ship the tuple as one big blob - if all the types involved
> >> are built-in - or else use binary format where supported and text
> >> format otherwise. Since this issue is common to BDR, parallelism, and
> >> pg_background, we might want to introduce some common infrastructure
> >> for it, but it's not too clear to me right now what that should look
> >> like.
> >
> > I think we definitely need to solve this - but I'm also not at all
> > sure how.
> >
> > For something like pg_background it's pretty trivial to fall back to
> > in/out when there's no send/recv. It's just a couple of lines, and the
> > portal code has the necessary provisions. So solving it for
> > pg_background itself is pretty trivial.
>
> That's not really the interesting part of the problem, I think. Yeah,
> pg_background can be made to speak text format if we really care. But
> the real issue is that even with the binary format, converting a tuple
> in on-disk format into a DataRow message so that the other side can
> reconstruct the tuple and shove it into an executor slot (or wherever
> it wants to shove it) seems like it might be expensive. You might
> have data on that; if it's not expensive, stop here. If it is
> expensive, what we really want to do is figure out some way to make it
> safe to copy the tuple into the shared memory queue, or send it out
> over the socket, and have the process on the other end use it without
> having to revalidate the whole thing column-by-column.

Yes, sending the tuples as-is is a noticeable win. I've not fully analyzed
where the difference comes from - I suspect it's to a large degree because
it requires less copying/memory allocation. But some of the send/recv
functions also aren't exactly cheap in themselves.

For BDR we've sidestepped problems around changing type definitions and
such by saying only builtin types get the 'as-is' treatment. That can be
expanded later, but that's it for now. For builtin types there's no chance
that the type definition changes out from under us.

A general problem is that you really can't allow this to happen outside
a controlled setting - manipulating data on the 'Datum' level allows you
to do bad things.
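The "builtin types only" rule can be approximated with a simple OID check: in PostgreSQL, OIDs below FirstNormalObjectId (16384) are reserved for built-in catalog objects, whose definitions cannot change at runtime. A minimal self-contained sketch of that eligibility test (the constant mirrors the real one in transam.h, but the helper itself is illustrative, not the BDR code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;

/* In PostgreSQL, OIDs below this bound belong to built-in catalog objects. */
#define FirstNormalObjectId 16384

/*
 * A tuple is eligible for "as-is" (undecoded) shipping only if every
 * column's type is built-in; for user-defined types we must fall back
 * to send/recv (or in/out), since their definitions can change.
 */
static bool
tuple_shippable_as_is(const Oid *attr_type_oids, int natts)
{
    for (int i = 0; i < natts; i++)
        if (attr_type_oids[i] >= FirstNormalObjectId)
            return false;       /* user-defined type: not safe as-is */
    return true;
}
```

A caller would run this check once per tuple descriptor, not per tuple, since the column types are fixed for a given result set.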

> > But, as you say, pg_background
> > itself isn't a primary goal (although a nice demonstration, with some
> > real world applications). There's enough differences between the
> > parallelism and the replication cases that I'm not entirely sure how
> > much common ground there is. Namely replication has the additional
> > concerns of version differences, trust (don't use blobs if you don't
> > fully trust the other side), and differences in the allocated oids
> > (especially relevant for arrays which, very annoyingly, embed the oid in
> > the send/recv format).
>
> Yeah. It would be nice to use the same code for deciding what to do
> in a particular case. It seems like it ought to be possible to have
> one rule for whether a tuple with a given set of data types is safe to
> ship in the on-disk format. For pg_background/parallelism, it's
> enough if that function returns true; for BDR, there might be
> additional criteria applied before deciding to do it that way. But
> they could use the same decision tree for part of it.

Yes, that'd be a good idea.

> What would be even better is to find some way to MAKE IT SAFE to send
> the undecoded tuple. I'm not sure what that would look like.

How do you define 'SAFE' here? Safe against concurrent SQL level
activity? Safe against receiving arbitrary data?

I see little chance of being safe against the latter. But imo that's
fine. Neither is WAL shipping. And that's perfectly fine for most
use cases...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-12 16:15:26
Message-ID: CA+TgmobdfKqdsxZYcpUf6J0ZkxgXGkdHT70eZCveWRQ29bppcQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 10, 2014 at 4:58 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> No, that's not what I was thinking of. Consider:
>
> export pg_libdir=$(pg_config --libdir)
> mkdir $pg_libdir/plugins
> ln -s $pg_libdir/auto_explain.so $pg_libdir/plugins/
> PG_OPTIONS='-c local_preload_libraries=auto_explain' psql
>
> and use pg_background() (or something else using this infrastructure) in
> that context.
> Without having reread your relevant code I'd expect a:
> ERROR: 55P02: parameter "local_preload_libraries" cannot be set after connection start
> because the code would try to import the "user session's"
> local_preload_libraries values into the background process - after that
> actually already has done its initialization.

Thanks for these pointers. Your directions turn out to be wrong in
detail, because it's PGOPTIONS not PG_OPTIONS, and the plugins
directory needs to be under $(pg_config --libdir)/postgresql, not just
$(pg_config --libdir). But it was enough to get me on the right
track. Long story short, it seems to work:

[rhaas ~]$ PGOPTIONS='-c local_preload_libraries=auto_explain' psql
psql (9.5devel)
Type "help" for help.

rhaas=# show local_preload_libraries;
local_preload_libraries
-------------------------
auto_explain
(1 row)

rhaas=# select * from pg_background_result(pg_background_launch('show local_preload_libraries')) as (x text);
x
--------------
auto_explain
(1 row)

This is what I expected, and I think what we want. Background workers
enlisted for parallelism (or pg_background) need to be able to have
the same values for GUCs as the user backend that started them, and
our definition of setting things "at backend start" needs to be
expansive enough to include stuff we pull out of a dynamic shared
memory segment shortly thereafter.
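The mechanism Robert describes amounts to flattening the session's GUC settings into a dynamic shared memory chunk that the worker reads back at startup. A hedged, self-contained sketch of the flat layout idea - alternating NUL-terminated name/value strings, roughly what the committed SerializeGUCState/RestoreGUCState pair does, with the helper names here being illustrative only:

```c
#include <assert.h>
#include <string.h>

/* Pack name/value pairs as alternating NUL-terminated strings. */
static size_t
serialize_gucs(char *buf, const char *names[], const char *values[], int n)
{
    char *p = buf;
    for (int i = 0; i < n; i++)
    {
        size_t len = strlen(names[i]) + 1;
        memcpy(p, names[i], len);
        p += len;
        len = strlen(values[i]) + 1;
        memcpy(p, values[i], len);
        p += len;
    }
    return (size_t) (p - buf);
}

/*
 * In the worker: walk the buffer and find one variable by name, standing
 * in for the set_config_option() calls made while restoring state.
 */
static const char *
restore_guc(const char *buf, size_t size, const char *name)
{
    const char *p = buf, *end = buf + size;

    while (p < end)
    {
        const char *n = p;
        p += strlen(p) + 1;     /* skip name */
        const char *v = p;
        p += strlen(p) + 1;     /* skip value */
        if (strcmp(n, name) == 0)
            return v;
    }
    return NULL;
}
```

In the real code the buffer lives in the DSM segment and every non-default variable is applied in order; the lookup here is just the simplest way to show the round trip.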

>> It should end up with the same values there as the active session, not
>> the current one from postgresql.conf. But we want to verify that's
>> the behavior we actually get. Right?
>
> But that's also something worthwile to check.

Mmph. There's a problem here. I tried changing
local_preload_libraries='auto_explain' in postgresql.conf and then
sending SIGHUP. A session started before that change does indeed
create a worker with that value still empty, but a session started
after that change also creates a worker with that value still empty.
Oops. Not sure why that's happening yet, will investigate.

>> What would be even better is to find some way to MAKE IT SAFE to send
>> the undecoded tuple. I'm not sure what that would look like.
>
> How do you define 'SAFE' here? Safe against concurrent SQL level
> activity? Safe against receiving arbitrary data?

The first one. The code powering the background worker is just as
trustworthy as any other part of the backend, so it can be subverted
if you load malicious C code into the backend, but not otherwise.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-12 16:19:53
Message-ID: 20141112161953.GA29112@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-11-12 11:15:26 -0500, Robert Haas wrote:
> On Mon, Nov 10, 2014 at 4:58 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > No, that's not what I was thinking of. Consider:
> >
> > export pg_libdir=$(pg_config --libdir)
> > mkdir $pg_libdir/plugins
> > ln -s $pg_libdir/auto_explain.so $pg_libdir/plugins/
> > PG_OPTIONS='-c local_preload_libraries=auto_explain' psql
> >
> > and use pg_background() (or something else using this infrastructure) in
> > that context.
> > Without having reread your relevant code I'd expect a:
> > ERROR: 55P02: parameter "local_preload_libraries" cannot be set after connection start
> > because the code would try to import the "user session's"
> > local_preload_libraries values into the background process - after that
> > actually already has done its initialization.
>
> Thanks for these pointers. Your directions turn out to be wrong in
> detail, because it's PGOPTIONS not PG_OPTIONS, and the plugins
> directory needs to be under $(pg_config --libdir)/postgresql, not just
> $(pg_config --libdir). But it was enough to get me on the right
> track. Long story short, it seems to work:

Sorry, was extracting it from a larger local script...

> [rhaas ~]$ PGOPTIONS='-c local_preload_libraries=auto_explain' psql
> psql (9.5devel)
> Type "help" for help.
>
> rhaas=# show local_preload_libraries;
> local_preload_libraries
> -------------------------
> auto_explain
> (1 row)
>
> rhaas=# select * from pg_background_result(pg_background_launch('show
> local_preload_libraries')) as (x text);
> x
> --------------
> auto_explain
> (1 row)

The question is whether the library is actually loaded in that case?
Because that normally only happens early during startup - which is why
it's a PGC_BACKEND guc.

> > How do you define 'SAFE' here? Safe against concurrent SQL level
> > activity? Safe against receiving arbitrary data?
>
> The first one. The code powering the background worker is just as
> trustworthy as any other part of the backend, so it can be subverted
> if you load malicious C code into the backend, but not otherwise.

Ah, good. Because the latter sounds quite hard, if not impossible,
unless we recreate send/recv ;)

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-12 16:36:14
Message-ID: CA+TgmoZBcUaiSduCEM93j1q9y2AD2PPqUHQos1d3HeqiyYem5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 12, 2014 at 11:19 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> The question is whether the library is actually loaded in that case?
> Because that normally only happens early during startup - which is why
> it's a PGC_BACKEND guc.

It looks like that does not work.

[rhaas pgsql]$ PGOPTIONS='-c local_preload_libraries=auto_explain' psql
psql (9.5devel)
Type "help" for help.

rhaas=# select * from pg_background_result(pg_background_launch('show auto_explain.log_min_duration')) as (x text);
ERROR: unrecognized configuration parameter "auto_explain.log_min_duration"
CONTEXT: background worker, pid 31316

So, there's more to be done here. Rats.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-12 17:15:47
Message-ID: 20141112171547.GC13473@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2014-11-12 11:36:14 -0500, Robert Haas wrote:
> On Wed, Nov 12, 2014 at 11:19 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > The question is whether the library is actually loaded in that case?
> > Because that normally only happens early during startup - which is why
> > it's a PGC_BACKEND guc.
>
> It looks like that does not work.
>
> [rhaas pgsql]$ PGOPTIONS='-c local_preload_libraries=auto_explain' psql
> psql (9.5devel)
> Type "help" for help.
>
> rhaas=# select * from pg_background_result(pg_background_launch('show
> auto_explain.log_min_duration')) as (x text);
> ERROR: unrecognized configuration parameter "auto_explain.log_min_duration"
> CONTEXT: background worker, pid 31316
>
> So, there's more to be done here. Rats.

We could just say that PGC_BACKEND GUCs which differ from their
config-file values aren't supported for anything parallel. I find that a
reasonable restriction - otherwise pooling of bgworkers and all that will
likely be out. And there aren't that many important PGC_BACKEND GUCs.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-12 17:39:13
Message-ID: CA+TgmoZBeVaUAZ2FpwAg99YqmpU5TptmsVMihE-w3Xgy8UYmiw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 12, 2014 at 11:36 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Nov 12, 2014 at 11:19 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
>> The question is whether the library is actually loaded in that case?
>> Because that normally only happens early during startup - which is why
>> it's a PGC_BACKEND guc.
>
> It looks like that does not work.
>
> [rhaas pgsql]$ PGOPTIONS='-c local_preload_libraries=auto_explain' psql
> psql (9.5devel)
> Type "help" for help.
>
> rhaas=# select * from pg_background_result(pg_background_launch('show
> auto_explain.log_min_duration')) as (x text);
> ERROR: unrecognized configuration parameter "auto_explain.log_min_duration"
> CONTEXT: background worker, pid 31316
>
> So, there's more to be done here. Rats.

It turned out to be quite simple to fix both problems. This
particular case fails because the call that loads the libraries
specified by session_preload_libraries and local_preload_libraries is
in PostgresMain() and thus never gets called by pg_background. I
fixed that by adding that call to pg_background in the appropriate
place. While I was at it, I added the REVOKE statements we discussed
earlier to pg_background's .sql file.
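The ordering matters here: the worker must restore the GUC state shipped from the launching session first, so that local_preload_libraries already has the user session's value when the preload step (normally done by PostgresMain()) runs. A trivial self-contained sketch of that startup sequence - the function names are stand-ins for the real calls, e.g. RestoreGUCState() and process_session_preload_libraries():

```c
#include <assert.h>
#include <string.h>

static char trace[128];

/* Stand-ins for the real worker-startup steps; each records its order. */
static void restore_guc_state(void)        { strcat(trace, "restore_gucs;"); }
static void preload_session_libraries(void){ strcat(trace, "preload_libs;"); }
static void execute_command(void)          { strcat(trace, "execute;"); }

/*
 * Worker startup must restore GUCs before loading preload libraries,
 * so the library list reflects the launching session, not the defaults.
 */
static const char *
worker_main(void)
{
    trace[0] = '\0';
    restore_guc_state();
    preload_session_libraries();
    execute_command();
    return trace;
}
```

Running the preload step before the GUC restore would reproduce exactly the "unrecognized configuration parameter" failure shown above.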

The other problem was due to this code in set_config_option:

    /*
     * If a PGC_BACKEND or PGC_SU_BACKEND parameter is changed in
     * the config file, we want to accept the new value in the
     * postmaster (whence it will propagate to
     * subsequently-started backends), but ignore it in existing
     * backends.  This is a tad klugy, but necessary because we
     * don't re-read the config file during backend start.
     *
     * In EXEC_BACKEND builds, this works differently: we load all
     * nondefault settings from the CONFIG_EXEC_PARAMS file during
     * backend start.  In that case we must accept PGC_SIGHUP
     * settings, so as to have the same value as if we'd forked
     * from the postmaster.  We detect this situation by checking
     * IsInitProcessingMode, which is a bit ugly, but it doesn't
     * seem worth passing down an explicit flag saying we're doing
     * read_nondefault_variables().
     */
#ifdef EXEC_BACKEND
    if (IsUnderPostmaster && !IsInitProcessingMode())
        return -1;
#else
    if (IsUnderPostmaster)
        return -1;
#endif

When restoring variables via RestoreGUCState(), we need the same kind
of special handling that we do when running in EXEC_BACKEND mode and
restoring variables via read_nondefault_variables(). Extending the
IsInitProcessingMode() kludge doesn't look appealing, so I instead
added the flag contemplated by the comment.
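With the explicit flag in place, the decision logic becomes a straightforward restatement of the #ifdef block quoted above. A self-contained illustrative sketch (the flag name and helper are hypothetical; the actual change threads a parameter through set_config_option):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Should an existing backend ignore a changed PGC_BACKEND parameter?
 * Returns true when the new value must be ignored.
 *
 * exec_backend:     compiled with EXEC_BACKEND
 * under_postmaster: running as a child of the postmaster
 * init_processing:  still in InitProcessing mode (EXEC_BACKEND startup)
 * is_reload:        restoring saved state, e.g. read_nondefault_variables()
 *                   or RestoreGUCState() in a background worker
 */
static bool
ignore_pgc_backend_change(bool exec_backend, bool under_postmaster,
                          bool init_processing, bool is_reload)
{
    if (is_reload)
        return false;   /* the explicit flag replaces the old kludge */
    if (exec_backend)
        return under_postmaster && !init_processing;
    return under_postmaster;
}
```

The flag makes the "restoring saved state" case explicit instead of inferring it from IsInitProcessingMode, which also covers the non-EXEC_BACKEND worker case that the old test could not.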

Updated patches attached.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0004-Add-infrastructure-to-save-and-restore-GUC-values-v2.patch text/x-patch 21.2 KB
0006-pg_background-Run-commands-in-a-background-worker-an-v5.patch text/x-patch 35.3 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-20 12:30:40
Message-ID: CAA4eK1J2kw2Pb4uK=iwPVVVAF_cquDMoN-gLkVGnMqcoYwKv-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 12, 2014 at 11:09 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Wed, Nov 12, 2014 at 11:36 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > On Wed, Nov 12, 2014 at 11:19 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> The question is whether the library is actually loaded in that case?
> >> Because that normally only happens early during startup - which is why
> >> it's a PGC_BACKEND guc.
> >
> > It looks like that does not work.
> >
> > [rhaas pgsql]$ PGOPTIONS='-c local_preload_libraries=auto_explain' psql
> > psql (9.5devel)
> > Type "help" for help.
> >
> > rhaas=# select * from pg_background_result(pg_background_launch('show
> > auto_explain.log_min_duration')) as (x text);
> > ERROR: unrecognized configuration parameter "auto_explain.log_min_duration"
> > CONTEXT: background worker, pid 31316
> >
> > So, there's more to be done here. Rats.
>
> It turned out to be quite simple to fix both problems.
>
> Updated patches attached.
>

Few compilation errors in the patch:
1>contrib\postgres_fdw\postgres_fdw.c(2107): error C2198:
'set_config_option' : too few arguments for call
1>contrib\postgres_fdw\postgres_fdw.c(2111): error C2198:
'set_config_option' : too few arguments for call
1>contrib\postgres_fdw\postgres_fdw.c(2115): error C2198:
'set_config_option' : too few arguments for call
2>contrib\dblink\dblink.c(2983): error C2198: 'set_config_option' : too few
arguments for call

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-20 16:00:01
Message-ID: CA+TgmobpnEtawmqC-U3F17kaHHm2SDnp5niv=7Bw9KBAGeTNmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 20, 2014 at 7:30 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Few compilation errors in the patch:
> 1>contrib\postgres_fdw\postgres_fdw.c(2107): error C2198:
> 'set_config_option' : too few arguments for call
> 1>contrib\postgres_fdw\postgres_fdw.c(2111): error C2198:
> 'set_config_option' : too few arguments for call
> 1>contrib\postgres_fdw\postgres_fdw.c(2115): error C2198:
> 'set_config_option' : too few arguments for call
> 2>contrib\dblink\dblink.c(2983): error C2198: 'set_config_option' : too few
> arguments for call

Oops. Good catch. Fixed in the attached version.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0004-Add-infrastructure-to-save-and-restore-GUC-values-v3.patch text/x-patch 22.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-11-24 22:19:53
Message-ID: CA+TgmoYqScwimccJNJd7r91vJZH_vAd=Ziu3iHTfzWRRks+9MQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 20, 2014 at 11:00 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Nov 20, 2014 at 7:30 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Few compilation errors in the patch:
>> 1>contrib\postgres_fdw\postgres_fdw.c(2107): error C2198:
>> 'set_config_option' : too few arguments for call
>> 1>contrib\postgres_fdw\postgres_fdw.c(2111): error C2198:
>> 'set_config_option' : too few arguments for call
>> 1>contrib\postgres_fdw\postgres_fdw.c(2115): error C2198:
>> 'set_config_option' : too few arguments for call
>> 2>contrib\dblink\dblink.c(2983): error C2198: 'set_config_option' : too few
>> arguments for call
>
> Oops. Good catch. Fixed in the attached version.

Committed.

Unfortunately, I forgot to credit you and Andres as reviewers; sorry about that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company