Concurrent connections in psql patch

Lists: pgsql-hackers pgsql-patches
From: stark <stark(at)enterprisedb(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-17 15:17:01
Message-ID: 87oduj75oy.fsf@enterprisedb.com


> Alvaro Herrera <alvherre ( at ) commandprompt ( dot ) com> writes:
>> Maybe we could write a suitable test case using Martijn's concurrent
>> testing framework.
>
> The trick is to get process A to commit between the times that process B
> looks at the new and old versions of the pg_class row (and it has to
> happen to do so in that order ... although that's not a bad bet given
> the way btree handles equal keys).
>
> I think the reason we've not tracked this down before is that that's a
> pretty small window. You could force the problem by stopping process B
> with a debugger breakpoint and then letting A do its thing, but short of
> something like that you'll never reproduce it with high probability.

Actually I was already looking into a related issue and have some work here
that may help with this.

I wanted to test the online index build and to do that I figured you needed to
have regression tests like the ones we have now except with multiple database
sessions. So I hacked psql to issue queries asynchronously and allow multiple
database connections. That way you can switch connections while a blocked or
slow transaction is still running and issue queries in other transactions.

I thought it was a proof-of-concept kludge but actually it's worked out quite
well. There were a few conceptual gotchas but I think I have a reasonable
solution for each.

The main issue was that any time you issue an asynchronous query that you
expect to block, you have a race condition in the test. You can't switch
connections and proceed right away, or you may actually proceed with the other
connection before the first connection's command is received and acted on by
the backend.

The "right" solution to this would involve altering the backend and the
protocol to provide some form of feedback when an asynchronous query had
reached various states including when it was blocked. You would have to
annotate it with enough information that the client can determine it's
actually blocked on the right thing and not just on some uninteresting
transient lock too.

Instead I just added a command to cause psql to wait for a time. This is
nearly as good since all the regression tests run fairly quickly so if you
wait even a fraction of a second you can be pretty certain the command has
been received and if it were not going to block it would have finished and
printed output already. And it was *much* simpler.
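The timed-wait heuristic can be illustrated outside psql. The following is only a minimal Python sketch, not code from the patch: the function name run_with_wait is hypothetical, and a threading.Lock stands in for the row lock held by the other session. The command is started in the background, the caller waits a fraction of a second, and anything still running after the wait is presumed to be blocked.

```python
import threading


def run_with_wait(command, wait_seconds):
    """Start command in the background and wait up to wait_seconds for it.

    Returns (finished, worker, result). finished is False when the command
    is still running after the wait, which a test would treat as "blocked".
    """
    result = {}

    def target():
        result["value"] = command()

    worker = threading.Thread(target=target, daemon=True)
    worker.start()                     # roughly: \cnowait followed by the query
    worker.join(timeout=wait_seconds)  # roughly: \cwait n
    return (not worker.is_alive()), worker, result


def demo():
    row_lock = threading.Lock()        # hypothetical stand-in for a row lock

    def update_row():
        with row_lock:
            return "UPDATE 1"

    row_lock.acquire()                 # session 1 holds the lock (open transaction)
    finished, worker, result = run_with_wait(update_row, 0.1)
    blocked_as_expected = not finished
    row_lock.release()                 # session 1 commits
    worker.join()                      # session 2's update can now complete
    return blocked_as_expected, result["value"]


if __name__ == "__main__":
    print(demo())
```

As in the description above, the wait proves nothing by itself; it only makes it very likely that a command which has not returned is really blocked rather than merely slow.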

Also, I think for interactive use we would want a somewhat more sophisticated
scheduling of output. It would be nice to print out results as they come in
even if we're on another connection. For the regression tests you certainly do
not want that since that would introduce unavoidable non-deterministic race
conditions in your output files all over the place. The way I've coded it now
takes care to print out output only from the "active" database connection and
the test cases need to be written to switch connections at each point they
want to test for possibly incorrect output.
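To show why printing only from the active connection keeps the output file deterministic, here is a small Python model. It is a sketch under assumptions, not the patch's implementation: the class ConnectionSet and its methods are invented for illustration. Results are queued per connection and written out only when that connection is, or becomes, the active one.

```python
class ConnectionSet:
    """Toy model: results are queued per connection, printed only when active."""

    def __init__(self):
        self.pending = {}    # connection number -> unprinted result strings
        self.active = None
        self.printed = []    # what would reach the output file, in order

    def connect(self):
        number = len(self.pending) + 1
        self.pending[number] = []
        self.switch(number)
        return number

    def result_arrived(self, number, text):
        # A result may arrive at any time; it is shown only if its
        # connection is the active one, otherwise it stays queued.
        self.pending[number].append(text)
        if number == self.active:
            self._flush(number)

    def switch(self, number):
        # Switching is the point where queued output becomes visible.
        self.active = number
        self._flush(number)

    def _flush(self, number):
        self.printed.extend(f"[{number}] {text}" for text in self.pending[number])
        self.pending[number].clear()


def demo(background_first):
    conns = ConnectionSet()
    conns.connect()                      # connection 1
    conns.connect()                      # connection 2
    conns.switch(1)
    if background_first:
        conns.result_arrived(2, "UPDATE 1")
        conns.result_arrived(1, "COMMIT")
    else:
        conns.result_arrived(1, "COMMIT")
        conns.result_arrived(2, "UPDATE 1")
    conns.switch(2)
    return conns.printed
```

Whichever result arrives first, demo returns the same list, because the background result is held back until the script switches to connection 2.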

Another issue was that I couldn't come up with a nice set of names for the
commands that didn't conflict with the myriad of one-letter commands already
in psql. So I just prefixed them all with "c" (connection). I figured when I
submitted it I would just let the community hash out the names and take the 2s
it would take to change them.

The test cases are actually super easy to write and read, at least considering
we're talking about concurrent sql sessions here. I think it's far clearer
than trying to handle separate scripts and nearly as clear as Martijn's
proposal from a while back to prepend a connection number on every line.

The commands I've added or altered are:

  \c[onnect][&] [DBNAME|- USER|- HOST|- PORT|-]
                 connect to new database (currently "postgres")
                 if optional & is present, open a new connection
                 without closing the existing one
  \cswitch n
                 switch to database connection n
  \clist
                 list database connections
  \cdisconnect
                 close current database connection
                 use \cswitch or \connect to select another connection
  \cnowait
                 issue next query without waiting for results
  \cwait [n]
                 if any queries are pending wait n seconds for results
Also I added %& to the psql prompt format to indicate the current connection.

So the tests look like, for example:

postgres=# \c&
[2] You are now connected to database "postgres".
postgres[2]=# begin;
BEGIN
postgres[2]=# create table foo (a integer);
CREATE TABLE
postgres[2]=# \cswitch 1
[1] You are now connected to database "postgres"
postgres[1]=# select * from foo;
ERROR: relation "foo" does not exist
postgres[1]=# \cswitch 2
[2] You are now connected to database "postgres"
postgres[2]=# commit;
COMMIT
postgres[2]=# \cswitch 1
[1] You are now connected to database "postgres"
postgres[1]=# select * from foo;
a
---
(0 rows)

postgres[1]=# insert into foo values (1);
INSERT 0 1
postgres[1]=# begin;
BEGIN
postgres[1]=# update foo set a = 2;
UPDATE 1
postgres[1]=# \cswitch 2
[2] You are now connected to database "postgres"
postgres[2]=# select * from foo;
a
---
1
(1 row)

postgres[2]=# \cnowait
postgres[2]=# update foo set a = 3;
postgres[2]=# \cwait .1
postgres[2]=# \cswitch 1
[1] You are now connected to database "postgres"
postgres[1]=# commit;
COMMIT
postgres[1]=# \cswitch 2
[2] You are now connected to database "postgres"
UPDATE 1
postgres[2]=# \clist
[1] Connected to database "postgres"
[2] Connected to database "postgres"
postgres[2]=# \cdisconnect
Disconnecting from database (use \connect to reconnect or \cswitch to select another connection)
!> \cswitch 1
[1] You are now connected to database "postgres"

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-17 19:09:30
Message-ID: 20060817190930.GA318@alvh.no-ip.org

stark wrote:

> Actually I was already looking into a related issue and have some work here
> that may help with this.
>
> I wanted to test the online index build and to do that I figured you needed to
> have regression tests like the ones we have now except with multiple database
> sessions. So I hacked psql to issue queries asynchronously and allow multiple
> database connections. That way you can switch connections while a blocked or
> slow transaction is still running and issue queries in other transactions.
>
> I thought it was a proof-of-concept kludge but actually it's worked out quite
> well. There were a few conceptual gotchas but I think I have a reasonable
> solution for each.

I have had an idea for some time that is actually much simpler -- just
launch several backends at once to do different things, and randomly
send SIGSTOP and SIGCONT to each. If they keep doing whatever they are
doing in infinite loops, and you leave it enough time, it's very likely
that you'll get problems if the concurrent locking (or whatever) is not
right.

The nice thing about this is that it's completely random, i.e. you don't
have to introduce individual stop points in the backend (which may
themselves hide some bugs). It acts (or at least, I expect it to act)
just like the kernel gave execution to another process.

The main difference with your approach is that I haven't tried it.
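A rough Python sketch of such a harness follows. It is an illustration only, assuming a POSIX system: jitter_processes and demo are hypothetical names, and the stand-in workers are small Python loops rather than real backends. Seeding the random generator makes the sequence of stop/continue decisions repeatable, although not the exact interleaving inside the stopped processes.

```python
import os
import random
import signal
import subprocess
import sys
import time


def jitter_processes(pids, rounds, seed, max_pause=0.05):
    """Randomly stop and resume processes; return the schedule that was used.

    Each schedule entry is (index into pids, pause in seconds).
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(rounds):
        index = rng.randrange(len(pids))
        pause = rng.uniform(0, max_pause)
        os.kill(pids[index], signal.SIGSTOP)   # freeze the chosen process
        time.sleep(pause)
        os.kill(pids[index], signal.SIGCONT)   # let it run again
        schedule.append((index, pause))
    return schedule


def demo():
    # Stand-in workers; a real harness would target the backend PIDs instead.
    loop = "import time\nwhile True: time.sleep(0.01)"
    workers = [subprocess.Popen([sys.executable, "-c", loop]) for _ in range(3)]
    try:
        return jitter_processes([w.pid for w in workers], rounds=10, seed=42)
    finally:
        for worker in workers:
            worker.terminate()
            worker.wait()


if __name__ == "__main__":
    for index, pause in demo():
        print(f"worker {index} stopped for {pause:.3f}s")
```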

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-17 20:15:12
Message-ID: 20060817201512.GE21363@pervasive.com

On Thu, Aug 17, 2006 at 04:17:01PM +0100, stark wrote:
> I wanted to test the online index build and to do that I figured you needed to
> have regression tests like the ones we have now except with multiple database
> sessions. So I hacked psql to issue queries asynchronously and allow multiple
> database connections. That way you can switch connections while a blocked or
> slow transaction is still running and issue queries in other transactions.

Wow, that's damn cool! FWIW, one thing I can think of that would be
useful is the ability to 'background' a long-running query. I see
\cnowait, but having something like & from unix shells would be even
easier. It'd also be great to have the equivalent of ^Z so that if you
got tired of waiting on a query, you could get back to the psql prompt
without killing it.

> Also, I think for interactive use we would want a somewhat more sophisticated
> scheduling of output. It would be nice to print out results as they come in
> even if we're on another connection. For the regression tests you certainly do
> not want that since that would introduce unavoidable non-deterministic race
> conditions in your output files all over the place. The way I've coded it now
> takes care to print out output only from the "active" database connection and
> the test cases need to be written to switch connections at each point they
> want to test for possibly incorrect output.

Thinking in terms of tcsh & co, there are a number of ways to handle this:

1) Output happens real-time
2) Only output from current connection (what you've done)
3) Only output after user input (ie: code that handles output is only
   run after the user has entered a command). I think most shells
   operate this way by default.
4) Provide an indication that output has come in from a background
   connection, but don't provide the actual output. This could be
   combined with #3.

#3 is nice because you won't get interrupted in the middle of entering
some long query. #4 could be useful for automated testing, especially if
the indicator was routed to another output channel, such as STDERR.
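Options #3 and #4 combined could look roughly like the Python sketch below. This is a hypothetical model, not psql code: the class DeferredNotifier and its method names are made up for illustration. Background output is buffered, and only after the user has entered a command is a short notice written to a separate stream (stderr by default).

```python
import io
import sys


class DeferredNotifier:
    """Buffer background output; announce it only after user input."""

    def __init__(self, notice_stream=None):
        self.pending = {}   # connection number -> buffered output lines
        self.notice_stream = sys.stderr if notice_stream is None else notice_stream

    def background_output(self, conn, text):
        # Nothing is printed here, so the user is never interrupted mid-query.
        self.pending.setdefault(conn, []).append(text)

    def after_user_command(self):
        # Option 3: runs only once the user has entered a command.
        # Option 4: prints an indicator, not the output itself.
        for conn in sorted(self.pending):
            if self.pending[conn]:
                print(f"[{conn}] output pending", file=self.notice_stream)

    def fetch(self, conn):
        # Hand over (and forget) the buffered output for one connection.
        return self.pending.pop(conn, [])


def demo():
    notices = io.StringIO()            # stands in for STDERR
    notifier = DeferredNotifier(notices)
    notifier.background_output(2, "UPDATE 1")
    notifier.after_user_command()
    return notices.getvalue(), notifier.fetch(2)
```

Routing the notice to its own stream keeps the main output file free of timing-dependent lines, which is the property the regression tests need.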

> Another issue was that I couldn't come up with a nice set of names for the
> commands that didn't conflict with the myriad of one-letter commands already
> in psql. So I just prefixed the all with "c" (connection). I figured when I
> submitted it I would just let the community hash out the names and take the 2s
> it would take to change them.
>
> The test cases are actually super easy to write and read, at least considering
> we're talking about concurrent sql sessions here. I think it's far clearer
> than trying to handle separate scripts and nearly as clear as Martin's
> proposal from a while back to prepend a connection number on every line.
>
> The commands I've added or altered are:
>
> \c[onnect][&] [DBNAME|- USER|- HOST|- PORT|-]
> connect to new database (currently "postgres")
> if optional & is present open do not close existing connection
> \cswitch n
> switch to database connection n

I can see \1 - \9 as being a handy shortcut.

> \clist
> list database connections
> \cdisconnect
> close current database connection
> use \cswitch or \connect to select another connection

Would ^d have the same effect?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: stark <stark(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-17 20:17:10
Message-ID: 20060817201710.GF21363@pervasive.com

On Thu, Aug 17, 2006 at 03:09:30PM -0400, Alvaro Herrera wrote:
> stark wrote:
>
> > Actually I was already looking into a related issue and have some work here
> > that may help with this.
> >
> > I wanted to test the online index build and to do that I figured you needed to
> > have regression tests like the ones we have now except with multiple database
> > sessions. So I hacked psql to issue queries asynchronously and allow multiple
> > database connections. That way you can switch connections while a blocked or
> > slow transaction is still running and issue queries in other transactions.
> >
> > I thought it was a proof-of-concept kludge but actually it's worked out quite
> > well. There were a few conceptual gotchas but I think I have a reasonable
> > solution for each.
>
> I have had an idea for some time that is actually much simpler -- just
> launch several backends at once to do different things, and randomly
> send SIGSTOP and SIGCONT to each. If they keep doing whatever they are
> doing in infinite loops, and you leave it enough time, it's very likely
> that you'll get problems if the concurrent locking (or whatever) is not
> right.

This is probably worth doing as well, since it would simulate what an
IO-bound system would look like.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: stark <stark(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-18 00:33:58
Message-ID: 995.1155861238@sss.pgh.pa.us

"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> On Thu, Aug 17, 2006 at 03:09:30PM -0400, Alvaro Herrera wrote:
>> I have had an idea for some time that is actually much simpler -- just
>> launch several backends at once to do different things, and randomly
>> send SIGSTOP and SIGCONT to each. If they keep doing whatever they are
>> doing in infinite loops, and you leave it enough time, it's very likely
>> that you'll get problems if the concurrent locking (or whatever) is not
>> right.

> This is probably worth doing as well, since it would simulate what an
> IO-bound system would look like.

While that might be useful for testing, it'd absolutely suck for
debugging, because of the difficulty of reproducing a problem :-(

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: stark <stark(at)enterprisedb(dot)com>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-18 12:46:39
Message-ID: 200608181446.40273.peter_e@gmx.net

On Thursday, 17 August 2006 17:17, stark wrote:
> Instead I just added a command to cause psql to wait for a time.

Do we need the full multiple-connection handling command set, or would
asynchronous query support and a wait command be enough?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, stark <stark(at)enterprisedb(dot)com>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-18 13:50:33
Message-ID: 20060818135033.GC20754@svana.org

On Fri, Aug 18, 2006 at 02:46:39PM +0200, Peter Eisentraut wrote:
> On Thursday, 17 August 2006 17:17, stark wrote:
> > Instead I just added a command to cause psql to wait for a time.
>
> Do we need the full multiple-connection handling command set, or would
> asynchronous query support and a wait command be enough?

I am interested in this too. For example the tool I posted a while ago
supported only this. It controlled multiple connections and only
supported sending async & wait.

It is enough to support fairly deterministic scenarios, for example,
testing if the locks block on each other as documented. However, it
works less well for non-deterministic testing. Yet, a test-suite has to
be deterministic, right?

From a client side, is there any testing method better than async and
wait? I've wondered about a tool that attached to the backend with gdb
and for testing killed the backend when it hit a particular function.
By selecting different functions each time, once you'd covered a lot of
functions and tested recovery, you could have a good idea if the
recovery code works properly.

Has anyone seen a tool like that?

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: stark <stark(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-18 14:01:20
Message-ID: 44E5C830.6090605@dunslane.net

stark wrote:
>> Alvaro Herrera <alvherre ( at ) commandprompt ( dot ) com> writes:
>>
>>> Maybe we could write a suitable test case using Martijn's concurrent
>>> testing framework.
>>>
>> The trick is to get process A to commit between the times that process B
>> looks at the new and old versions of the pg_class row (and it has to
>> happen to do so in that order ... although that's not a bad bet given
>> the way btree handles equal keys).
>>
>> I think the reason we've not tracked this down before is that that's a
>> pretty small window. You could force the problem by stopping process B
>> with a debugger breakpoint and then letting A do its thing, but short of
>> something like that you'll never reproduce it with high probability.
>>
>
> Actually I was already looking into a related issue and have some work here
> that may help with this.
>
> I wanted to test the online index build and to do that I figured you needed to
> have regression tests like the ones we have now except with multiple database
> sessions. So I hacked psql to issue queries asynchronously and allow multiple
> database connections. That way you can switch connections while a blocked or
> slow transaction is still running and issue queries in other transactions.
>
> I thought it was a proof-of-concept kludge but actually it's worked out quite
> well. There were a few conceptual gotchas but I think I have a reasonable
> solution for each.
>
>

[snip]

Can you please put the patch up somewhere so people can see what's involved?

thanks

cheers

andrew


From: Gregory Stark <gsstark(at)mit(dot)edu>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Going for "all green" buildfarm results
Date: 2006-08-19 10:57:53
Message-ID: 87oduhq9fy.fsf@stark.xeocode.com


Andrew Dunstan <andrew(at)dunslane(dot)net> writes:

> stark wrote:
>
> > So I hacked psql to issue queries asynchronously and allow multiple
> > database connections. That way you can switch connections while a blocked
> > or slow transaction is still running and issue queries in other
> > transactions.
>
> [snip]
>
> Can you please put the patch up somewhere so people can see what's involved?

I'll send it to pgsql-patches.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Gregory Stark <gsstark(at)mit(dot)edu>
To: PostgreSQL-development <pgsql-patches(at)postgresql(dot)org>
Subject: Concurrent connections in psql patch
Date: 2006-08-19 11:00:38
Message-ID: 87lkplq9bd.fsf_-_@stark.xeocode.com


Andrew Dunstan <andrew(at)dunslane(dot)net> writes:

> stark wrote:
>
> > So I hacked psql to issue queries asynchronously and allow multiple
> > database connections. That way you can switch connections while a blocked
> > or slow transaction is still running and issue queries in other
> > transactions.
>
> [snip]
>
> Can you please put the patch up somewhere so people can see what's involved?

As promised:

Attachment                Content-Type               Size
concurrent-psql-patch.2   application/octet-stream   70.6 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Gregory Stark <gsstark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Concurrent connections in psql patch
Date: 2006-09-02 21:16:23
Message-ID: 200609022116.k82LGNj05415@momjian.us


Is this something people are interested in? I am thinking no based on
the lack of requests and the size of the patch.

---------------------------------------------------------------------------

Gregory Stark wrote:
>
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
> > stark wrote:
> >
> > > So I hacked psql to issue queries asynchronously and allow multiple
> > > database connections. That way you can switch connections while a blocked
> > > or slow transaction is still running and issue queries in other
> > > transactions.
> >
> > [snip]
> >
> > Can you please put the patch up somewhere so people can see what's involved?
>
> As promised:
>

[ Attachment, skipping... ]

>
>
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Gregory Stark <gsstark(at)mit(dot)edu>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Gregory Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Concurrent connections in psql patch
Date: 2006-09-03 21:09:44
Message-ID: 87fyf83bdz.fsf@stark.xeocode.com

Bruce Momjian <bruce(at)momjian(dot)us> writes:

> Is this something people are interested in? I am thinking no based on
> the lack of requests and the size of the patch.

Lack of requests? I was actually surprised by how enthusiastically people
reacted to it.

However, I don't think the patch, as is, is ready to be committed. Aside from
missing documentation and regression tests, it was only intended to be a
proof-of-concept and to be useful for specific tests I was doing.

I did try to do a decent job: I handled \timing and server-tracked variables
like encoding. But I need to go back through the code and make sure there are
no other details like that.

It would be nice to get feedback from other developers from looking at the
patch to confirm that there aren't more fundamental problems with the approach
and how it uses libpq before I go through the effort of cleaning up the
details.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: David Fetter <david(at)fetter(dot)org>
To: Gregory Stark <gsstark(at)mit(dot)edu>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Concurrent connections in psql patch
Date: 2006-09-05 22:43:52
Message-ID: 20060905224352.GA10617@fetter.org

On Sun, Sep 03, 2006 at 05:09:44PM -0400, Gregory Stark wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>
> > Is this something people are interested in? I am thinking no
> > based on the lack of requests and the size of the patch.
>
> Lack of requests? I was actually surprised by how enthusiastically
> people reacted to it.

I think it could form the basis of some concurrency testing, something
we'll need more and more as time goes on. :)

Gregory,

Would you be up for getting this updated in the 8.3 cycle?

Cheers,
D
>
> However I don't think the patch as is is ready to be committed. Aside from
> missing documentation and regression tests it was only intended to be a
> proof-of-concept and to be useful for specific tests I was doing.
>
> I did try to do a decent job, I got \timing and server-tracked variables like
> encoding. But I need to go back through the code and make sure there are no
> other details like that.
>
> It would be nice to get feedback from other developers from looking at the
> patch to confirm that there aren't more fundamental problems with the approach
> and how it uses libpq before I go through the effort of cleaning up the
> details.
>
> --
> Gregory Stark
> EnterpriseDB http://www.enterprisedb.com
>
>

--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!