Re: Proof of concept: standalone backend with full FE/BE protocol

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Dimitri Fontaine <dimitri(at)2ndquadrant(dot)fr>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, simon(at)2ndQuadrant(dot)com, Merlin Moncure <mmoncure(at)gmail(dot)com>, Gurjeet Singh <singh(dot)gurjeet(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proof of concept: standalone backend with full FE/BE protocol
Date: 2013-11-14 06:41:00
Message-ID: CAA4eK1JDDtjfi_RfRxRs98_j=OQGcmwfo35Yug78b+RMj9ekmA@mail.gmail.com
Lists: pgsql-hackers

I have gone through the mail chain of this thread and tried to find
the different concerns or open ends for this patch.

Summary of the discussion and the concerns raised for this patch:

1. Security concern in interface
2. Security concern in Windows implementation
3. Handling of Ctrl-C/SIGTERM
4. Secondary connections for maintenance activities, replication
5. Windows Implementation - what should be behaviour for admin users?
6. Restricting operations in single backend mode
7. Proposal related to maintenance activities

Description of each concern
----------------------------------------------

1. Security concern in interface -

Interface
---------------
$ psql "standalone_datadir = $PGDATA dbname = regression"
There is another option "standalone_backend", which can be set to
specify which postgres executable to launch.
If the latter isn't specified, libpq defaults to trying the
installation PGBINDIR that was selected by configure.

Security Concern
-----------------------------
If a user can specify libpq connection options, he can now execute
any file he wants by passing it as standalone_backend.

Method to resolve Security concern
--------------------------------------------------------
If an application wants to allow these connection parameters to be
used, it would need to do PQenableStartServer() first. If it doesn't,
those connection parameters will be rejected.
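
To make the opt-in idea concrete, here is a minimal, self-contained
sketch of such a gate. It is not the patch's code: the names
sketch_enable_start_server() and sketch_option_allowed() and the
process-wide flag are assumptions made only for illustration, with
sketch_enable_start_server() standing in for the proposed
PQenableStartServer().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical process-wide flag; the patch's real mechanism may differ. */
static bool start_server_enabled = false;

/* Stand-in for the proposed PQenableStartServer(): the application opts in. */
void
sketch_enable_start_server(void)
{
    start_server_enabled = true;
}

/* True for the connection keywords that would launch a child postgres. */
static bool
is_standalone_option(const char *keyword)
{
    return strcmp(keyword, "standalone_datadir") == 0 ||
           strcmp(keyword, "standalone_backend") == 0;
}

/*
 * Decide whether a connection keyword may be used.  Ordinary keywords are
 * always accepted; the standalone ones only after the application opted in.
 */
bool
sketch_option_allowed(const char *keyword)
{
    assert(keyword != NULL);

    if (is_standalone_option(keyword))
        return start_server_enabled;
    return true;
}
```

With a gate of this shape, an application that merely passes a
user-supplied connection string through to libpq never launches an
arbitrary executable unless it has explicitly opted in.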

2. Security concern in Windows implementation -

Interface
---------------
PQcancel -
On Unix, kill(conn->postgres_pid, SIGINT) needs to be used.
On Windows, the pgkill(int pid, int sig) API can be used.

Security concern
---------------------------
pgkill sends the cancel signal through a named pipe. Because the pipe
has a predictable name (\\.\pipe\pgsignal_<pid>), someone else can
create a pipe with our name before we do by guessing what pid we will
have. If that happens, we go into a loop and try to recreate it while
logging a warning message to eventlog/stderr (this happens for every
backend). We can't throw an error on this and kill the backend, because
the pipe is created in the background thread, not the main one.

Some suggestions
------------------------------
Once it is detected that a Named Pipe with the same name already
exists, the following options are available:
a. Try to create it with some other name, but in that case how do we
communicate the new name to the client end of the pipe? Some solution
can be worked out if this approach seems reasonable, though currently
I don't have any in mind.
b. Give an error, as the pipe is generally created at the beginning of
process (backend) creation.
c. Any other better solution?
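
To illustrate why the name can be guessed, here is a small sketch that
builds the pipe name from nothing but the pid. The function name
sketch_signal_pipe_name() is made up for this example; only the name
format comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build the signal pipe name for a given pid into buf.
 * Returns the number of characters written (excluding the terminating
 * NUL), or -1 if the buffer is too small.
 */
int
sketch_signal_pipe_name(char *buf, size_t buflen, unsigned int pid)
{
    int         n;

    assert(buf != NULL && buflen > 0);

    /* The pid is the only variable part, hence the name is predictable. */
    n = snprintf(buf, buflen, "\\\\.\\pipe\\pgsignal_%u", pid);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    return n;
}
```

Anyone who can predict the pid can compute the same string, which is
what options (a) and (b) above try to deal with.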

3. Handling of Ctrl-C/SIGTERM

Behaviour
---------------
If you kill the client, the child postgres will see connection
closure and will shut down.

Concern
--------------
This will make scripting harder, because you cannot start another
single-backend pg_dump before the old backend has noticed the closure,
checkpointed and shut down. If you forcibly kill pg_dump (or some
other client) and then immediately try to start a new one, it's not
clear how long you'll have to wait.

Suggestions for alternatives for this case
-------------------------------------------------------------
a. There is no expectation that a standalone PG implementation
would provide performance for a series of standalone sessions that is
equivalent to what you'd get from a persistent
server. If that scenario is what's important to you, you'd use
a persistent server.
b. An extra libpq call to handle this case can be helpful.

4. Secondary connections for data access

Proposal
---------------
A single-user connection database with "no administrative hassles"

Concerns
-----------------
This proposal will not allow any data it stores to be accessed by
another connection, so all forms of replication are excluded and all
maintenance actions force the database to be unavailable for a period
of time. Those two things are barriers of the most major kind to
anybody working in an enterprise with connected data and devices.

Suggestions for its use or for making it usable
----------------------------------------------------------------
a. A usable & scriptable --single mode is justification enough.
Having to wait for hours just to enter one more command because
--single doesn't support any scripts sucks, especially in recovery
situations.
b. It's worth having this particular thing because it makes
pg_upgrade more robust.
c. Some competing products already provide a similar solution
(http://www.firebirdsql.org/manual/fbmetasecur-embedded.html).
d. We need to make sure that this isn't foreclosing the option of
having a multi-process environment with a single user connection. I
don't see that it is, but it might be wise to sketch exactly how that
case would work before accepting this.

5. Windows Implementation - what should be behaviour for admin users

Behavior clarification -
Does this follow the existing behavior that admin users are not
allowed to invoke a postgres child process?

6. Restricting operations in single backend mode

Serializable transactions could skip all the SSI predicate locking
and conflict checking when in single-connection mode. With only one
connection the transactions could never overlap, so
there would be no chance of serialization anomalies when running
snapshot isolation.

It could be of use if someone had code they wanted to run under
both normal and single-connection modes. For single-connection only,
they could just choose REPEATABLE READ to
get exactly the same semantics.
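
The decision described above can be sketched as a single predicate.
This is only an illustration of the idea, not code from the patch or
from the backend; the enum and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical isolation levels, for this sketch only. */
typedef enum
{
    SKETCH_READ_COMMITTED,
    SKETCH_REPEATABLE_READ,
    SKETCH_SERIALIZABLE
} SketchIsoLevel;

/*
 * Would SSI predicate locking and conflict checking be needed?
 * Only SERIALIZABLE uses it at all, and with a single connection
 * transactions can never overlap, so it could be skipped there.
 */
bool
sketch_needs_predicate_locking(SketchIsoLevel level, bool single_connection)
{
    assert(level >= SKETCH_READ_COMMITTED && level <= SKETCH_SERIALIZABLE);

    if (level != SKETCH_SERIALIZABLE)
        return false;
    return !single_connection;
}
```

Code that requests SERIALIZABLE would then run unchanged in both modes,
paying the SSI overhead only when other connections can exist.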

7. Proposal related to maintenance activities

For maintenance activities, in the longer run, we can have a
postmaster process that isn't listening on any ports, but is managing
background processes in addition to a single child backend.

As per my understanding, to complete this patch we need to
a. complete the work for #1, #2 and #5
b. treat #6 and #7 as enhancements that can be done after the initial
feature is committed
c. decide what we should do for #3 and #4.

Rebased the patch (changes in patch)
-----------------------------------------------------------
a. fillPGconn() loops through each connection option, so there is no
need to handle standalone_datadir and standalone_backend separately.
b. In ChildPostgresMain()->PostgresMain(), pass the third parameter
dbname as NULL.
c. Changed the second parameter of read_standalone_child_variables()
from "int *" to "pgsocket *" to remove a warning.
d. Removed trailing white space.
e. Updated the PQconninfoOptions array to include the offset.

Any objections to adding this idea/patch to the CF?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
standalone_backend.3.patch application/octet-stream 21.0 KB
